Category Theory Offers Path to Interpretable Artificial Intelligence, Quantinuum Scientists Report

Insider Brief

  • Artificial Intelligence (AI) use has entered the mainstream, yet the lack of interpretability in these systems remains a critical issue.
  • How AI systems arrive at answers and solutions is often opaque, which poses significant accountability challenges.
  • Quantinuum scientists have proposed a novel paradigm shift in AI interpretability, leaning heavily into the principles of category theory.
  • Image: An AI-based treatment of this research that without question lacks interpretability.

Artificial Intelligence (AI) has seen a massive surge in applications, yet the lack of interpretability in these systems remains a critical and growing concern. The intricate workings of neural networks, such as those powering large language models (LLMs) like ChatGPT or Claude, are often opaque, posing significant challenges in fields that require accountability, such as finance, healthcare and the legal sector.

Concerns about opaque AI recommendations may sound technical and largely academic, but they have real-world implications. In healthcare, a lack of interpretability could jeopardize patient safety, while in finance, non-transparent AI credit scoring could introduce bias and create regulatory compliance issues.

With quantum AI on the horizon, interpretability may become an even bigger issue, as the complexity of quantum models could further obscure decision-making processes unless robust frameworks are in place.

Reporting in a company blog post and a scholarly paper on the pre-print server arXiv, researchers at Quantinuum, a company whose team includes a bench of highly esteemed AI scientists, have proposed a novel paradigm shift in AI interpretability, leveraging principles from quantum mechanics and category theory.

Ilyas Khan, Quantinuum Founder and Chief Product Officer, who also served as an author on the paper, offers context for the work in a LinkedIn post: “When we make a car, or a plane, or even a pair of scissors, we understand how they work, component by component. When things don’t work, we know why they don’t work, or we can find out, systematically. We also build systems in critical areas such as healthcare where we know how inputs and outputs are related. This simple construct is required if AI and LLM’s, remarkable and impressive as their progress has been, are to truly be deployed across society for our benefit. This problem – this lack of ‘safety’ is not a new concern. In fact “XAI” as a movement (‘explainable AI’) has been born out of this concern. In our new paper we take a rigorous and very detailed look at how (and why) AI systems need to be ‘explainable’ and ‘interpretable’ by design, and not by some obscure costly and partially effective post-hoc method.”

The paper is quite detailed, but the following breakdown will hopefully summarize the team's major insights.

The Interpretability Problem

The core issue with many current AI models is their “black-box” nature, according to the researchers. These models, particularly deep learning neural networks, excel at tasks but provide little insight into their decision-making processes. This opacity is a significant drawback in high-stakes domains where understanding how conclusions are drawn is paramount for ensuring safety and ethical use.

Explainable AI (XAI) has emerged as a response to this problem. XAI employs “post-hoc” techniques, which attempt to elucidate the behavior of pre-trained models. Methods such as saliency maps, Shapley values and counterfactual explanations are used to make these models more transparent. However, these techniques often provide approximate and sometimes unreliable explanations.
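
To make the post-hoc idea concrete, here is a minimal, illustrative sketch (not drawn from the paper) of how a Shapley-style attribution works: each feature's contribution to a prediction is estimated by averaging its marginal effect over different orders in which the features are "revealed" to a black-box model. The toy model and baseline below are assumptions for illustration only; production XAI toolkits use far more efficient estimators.

    # Minimal brute-force Shapley attribution for a tiny, hypothetical black-box model.
    # (Illustrative only; real XAI libraries rely on optimized estimators and sampling.)
    from itertools import permutations

    def model(x):
        # Hypothetical scoring function over three input features.
        return 3.0 * x[0] + 2.0 * x[1] * x[2] + 1.0

    def shapley_values(model, x, baseline):
        # Average each feature's marginal contribution over all feature orderings,
        # filling in "absent" features with baseline values.
        n = len(x)
        phi = [0.0] * n
        orderings = list(permutations(range(n)))
        for order in orderings:
            current = list(baseline)        # start from the baseline input
            prev_out = model(current)
            for i in order:                 # reveal the true features one at a time
                current[i] = x[i]
                out = model(current)
                phi[i] += out - prev_out    # marginal contribution of feature i
                prev_out = out
        return [p / len(orderings) for p in phi]

    x = [1.0, 2.0, 0.5]
    print(shapley_values(model, x, baseline=[0.0, 0.0, 0.0]))
    # The attributions sum to model(x) - model(baseline), here 6.0 - 1.0 = 5.0.

Note that the attributions are computed after the fact, from outside the model; this is precisely the post-hoc character the Quantinuum researchers contrast with interpretability by design.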

A New Approach with Category Theory

Quantinuum’s researchers have introduced a fresh perspective on AI interpretability by applying category theory, a mathematical framework that describes processes and how they compose. Category theory is used today in fields such as computer science, where it helps design and understand software through languages like Haskell, and in mathematics, where it guides the connection of different ideas by showing how they relate to each other. It also helps physicists model and understand complex systems in areas such as quantum mechanics.

This category theory approach is detailed in depth in the recent arXiv paper, where the team presents a comprehensive theoretical framework for defining AI models and analyzing their interpretability.

The researchers write in their blog post: “At Quantinuum, we’re continuing work to develop new paradigms in AI while also working to sharpen theoretical and foundational tools that allow us all to assess the interpretability of a given model. With this framework, we show how advantageous it is for an AI model to have explicit and meaningful compositional structure.”

At the heart of Quantinuum’s approach is the concept of compositional models. These models are designed with explicit and meaningful structures from the outset, making them inherently interpretable. Unlike traditional neural networks, compositional models allow for a clear understanding of how different components interact and contribute to the overall decision-making process.
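
As a loose illustration of the idea (an assumed sketch, not code from the paper), a compositional model can be pictured as an explicit pipeline of named components, where every intermediate state is meaningful and can be inspected:

    # Assumed sketch (not from the paper): a model built as an explicit composition of
    # named components, so every intermediate step is meaningful and inspectable.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Component:
        name: str
        fn: Callable        # the process this component implements

        def __call__(self, x):
            return self.fn(x)

    def compose(*components: Component) -> Callable:
        # Sequential composition: the output of each component feeds the next.
        def pipeline(x, trace=None):
            for comp in components:
                x = comp(x)
                if trace is not None:
                    trace.append((comp.name, x))   # record each intermediate state
            return x
        return pipeline

    # Hypothetical components of a toy sentiment model.
    tokenize = Component("tokenize", lambda text: text.lower().split())
    score = Component("score", lambda toks: sum(t in {"good", "great"} for t in toks)
                                            - sum(t in {"bad", "awful"} for t in toks))
    decide = Component("decide", lambda s: "positive" if s > 0 else "negative")

    model = compose(tokenize, score, decide)
    trace = []
    print(model("a great film overall", trace))   # "positive"
    print(trace)                                  # every step of the decision is visible

Because the structure is explicit, an explanation of the model's behavior can be read off its components and how they are wired together, rather than reverse-engineered afterward.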

Sean Tull, a co-author of the paper, offered his take on the significance of this development in the post: “In the best case, such intrinsically interpretable models would no longer even require XAI methods, serving instead as their own explanation, and one of a deeper kind.”

Using category theory, the researchers developed a graphical calculus that captures the compositional structure of AI models. This method not only paves the way for the interpretation of classical models but also extends to quantum models. The team writes that the approach provides a precise, mathematically defined framework for assessing the interpretability of AI systems.
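
The graphical calculus is built on string diagrams, in which boxes represent processes with typed input and output wires and composition is depicted by plugging wires together. The sketch below (an assumption for illustration, not the paper's formalism or tooling) shows the kind of bookkeeping such diagrams make explicit: sequential composition is only defined when the wire types match, and parallel composition places processes side by side.

    # Assumed illustration: processes as typed "boxes" that only compose when their
    # wire types match, mimicking the bookkeeping a string-diagram calculus makes visual.
    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class Box:
        name: str
        dom: Tuple[str, ...]    # input wire types
        cod: Tuple[str, ...]    # output wire types

        def then(self, other: "Box") -> "Box":
            # Sequential composition: allowed only when output wires match input wires.
            if self.cod != other.dom:
                raise TypeError(f"cannot plug {self.name} {self.cod} into {other.name} {other.dom}")
            return Box(f"{self.name} ; {other.name}", self.dom, other.cod)

        def tensor(self, other: "Box") -> "Box":
            # Parallel composition: place two processes side by side.
            return Box(f"{self.name} ⊗ {other.name}", self.dom + other.dom, self.cod + other.cod)

    # Hypothetical stages of a question-answering pipeline.
    parse = Box("parse", ("sentence",), ("grammar",))
    encode = Box("encode", ("grammar",), ("meaning",))
    answer = Box("answer", ("meaning",), ("label",))

    diagram = parse.then(encode).then(answer)
    print(diagram)   # the composite's structure, not just its output, is explicit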

Practical Implications

The implications of this research, if it holds, are far-reaching and profound. For example, while transformers, which are integral to models like ChatGPT, are found to be non-interpretable, simpler models like linear models and decision trees are inherently interpretable. In other words, by defining and analyzing the compositional structure of AI models, Quantinuum’s framework enables the development of systems that are interpretable by design.
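
A simple way to see the contrast (an illustrative example, not one taken from the paper): in a linear model, the learned coefficients themselves state how each input moves the prediction, so the model carries its own explanation, something a transformer's billions of entangled weights do not offer. The weights below describe a hypothetical credit-scoring model.

    # Illustrative contrast (not from the paper): a linear model explains itself,
    # since each coefficient states how a unit change in a feature moves the output.
    weights = {"income": 0.4, "debt": -0.7, "years_employed": 0.2}   # hypothetical credit model
    bias = 0.1

    def predict(features):
        return bias + sum(weights[name] * value for name, value in features.items())

    def explain(features):
        # Each feature's contribution is read directly off the model's own parameters.
        return {name: weights[name] * value for name, value in features.items()}

    applicant = {"income": 3.0, "debt": 1.5, "years_employed": 4.0}
    print(predict(applicant))   # roughly 1.05
    print(explain(applicant))   # per-feature contributions, with no post-hoc method required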

To see how this might affect real-world use of AI, consider one of the persistent problems facing people who use LLMs: the models' errant "hallucinations," in which the AI confidently produces incorrect, sometimes wildly incorrect, information. By applying category theory to develop inherently interpretable models, researchers could better understand and control these systems' decision-making processes, making it easier to identify and mitigate cases where an LLM generates incorrect or nonsensical output and thereby reducing hallucinations.

The use of category theory and string diagrams offers several forms of diagrammatic explanations for model behavior. These explanations, which include influence constraints and graphical equations, provide a deeper understanding of AI systems, enhancing their transparency and reliability.

The researchers wrote in their blog post, “A fundamental problem in the field of XAI has been that many terms have not been rigorously defined; making it difficult to study – let alone discuss – interpretability in AI. Our paper marks the first time a framework for assessing the compositional interpretability of AI models has been developed.”

Future Directions

Quantinuum’s approach paves the way for further exploration into compositional models and their applications. The team envisions a future where AI models are not only powerful but also transparent and accountable. Their ongoing research aims to refine these theoretical tools and apply them to both classical and quantum AI systems, ultimately leading to safer and more trustworthy AI applications.

The researchers emphasized in their blog post, “This work is part of our broader AI strategy, which includes using AI to improve quantum computing, using quantum computers to improve AI, and – in this case – using the tools of category theory and compositionality to help us better understand AI.”

In addition to Tull and Khan, the Quantinuum team includes Robin Lorenz, Stephen Clark and Bob Coecke.

For a more in-depth and technical dive into the research, you can access the full paper on arXiv here.

Matt Swayne

With a background in journalism and communications spanning several decades, Matt Swayne has worked as a science communicator for an R1 university for more than 12 years, specializing in translating high tech and deep tech for a general audience. He has served as a writer, editor and analyst at The Quantum Insider since its inception. In addition to his work as a science communicator, Matt develops and teaches courses to improve the media and communications skills of scientists. [email protected]
