
New Whitepaper on XAI

Explainable AI

Medical use cases in particular require explainable AI.

Who should be told what, how and for what purpose?

Highly complex AI systems are often referred to as "black boxes" because it is frequently difficult to understand how they arrive at their results. In many areas of application, such as medical diagnostics, credit approval in the financial sector, or quality control in manufacturing, transparency is nevertheless crucial for interpreting and questioning outcomes. Explanations serve different audiences: developers gain insights that help them improve their AI systems, while users can learn, for example, which factors influenced a bank’s credit decision. These cases show that explanations of AI decisions must be understandable to different target groups.

Against this backdrop, experts from the “Technological Enablers and Data Science” working group of the Plattform Lernende Systeme address the topic of “Explainable AI” in this white paper, guided by the question: Who should be told what, how and for what purpose? Plattform Lernende Systeme (PLS) is a network of experts in the field of Artificial Intelligence (AI). It brings together specialized knowledge and, as an independent facilitator, promotes interdisciplinary exchange and public dialogue.

The first author of the white paper is BIFOLD Fellow Prof. Dr. Wojciech Samek: "The development of explainability in AI models (XAI) can roughly be divided into three waves, each with different focuses and objectives. In the initial phase, the emphasis was on making individual model decisions understandable. The goal was to visualize how strongly various input dimensions—such as individual pixels in an image—contributed to a model’s prediction. A key method from this phase is Layer-wise Relevance Propagation (LRP). This technique is based on the idea of distributing the prediction backward through the network. Neurons that contributed more to the decision receive a proportionally higher share of the overall relevance. The relevance values assigned to each pixel of the input image indicate which areas of the image were crucial for the AI's decision.
The second wave of explainability research aimed to better understand the AI model itself. Using methods like Activation Maximization, it is possible to show which features individual neurons encode. The Concept Relevance Propagation (CRP) method expands on this type of explanation and allows for the analysis of the role and function of individual neurons in model decisions. These methods from the second wave of XAI form the foundation of the emerging field of mechanistic interpretability, which analyzes functional sub-networks ("circuits") within the model.
The third wave, driven by the latest methods in XAI research, seeks a systematic understanding of the model, its behavior, and its internal representations. Methods such as Semantic Lens aim to understand the function and quality of each individual component (neuron) in the model. This holistic understanding enables systematic and automatable model evaluations—for example, checking whether a skin cancer model truly follows the German medical ABCDE rule."
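To make the first wave more concrete: the relevance redistribution behind LRP can be sketched in a few lines. The snippet below is an illustrative NumPy sketch of the basic LRP-epsilon rule for the dense layers of a toy ReLU network; the network, its weights, and the input are invented for the example and this is not the whitepaper's or any library's reference implementation.

```python
import numpy as np

def lrp_dense(a, W, b, R_out, eps=1e-6):
    """One LRP-epsilon backward step through a dense layer.
    a: layer input activations (d_in,); W, b: weights (d_in, d_out) and biases (d_out,);
    R_out: relevance of the layer outputs (d_out,). Returns input relevance (d_in,)."""
    z = a @ W + b                              # pre-activations of the layer
    z = z + eps * np.where(z >= 0, 1.0, -1.0)  # epsilon stabiliser against division by zero
    s = R_out / z                              # relevance per unit of pre-activation
    return a * (W @ s)                         # redistribute in proportion to a_j * w_jk

# Toy two-layer ReLU network: relevance starts at the predicted class score
# and is propagated back layer by layer until it reaches the input "pixels".
rng = np.random.default_rng(0)
x = rng.random(8)                              # tiny stand-in for an input image
W1, b1 = rng.normal(size=(8, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 3)), np.zeros(3)

h = np.maximum(0.0, x @ W1 + b1)               # hidden activations
y = h @ W2 + b2                                # class scores

R = np.zeros(3)
R[y.argmax()] = y[y.argmax()]                  # relevance = score of the predicted class
R = lrp_dense(h, W2, b2, R)                    # back through the output layer
R = lrp_dense(x, W1, b1, R)                    # back through the first layer
print(R)                                       # per-input relevance scores
```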
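Activation Maximization, named for the second wave, can likewise be illustrated with a short sketch: the input is treated as a free parameter and optimized by gradient ascent until a chosen unit responds as strongly as possible. The PyTorch code below uses a toy stand-in model; the architecture, the unit index, and the regularisation weight are assumptions made for the example.

```python
import torch

# Toy stand-in network; in practice this would be the trained model under inspection.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
    torch.nn.Linear(8, 10),
)
model.eval()

unit = 3                                            # which output unit to visualise (assumed)
x = torch.zeros(1, 3, 64, 64, requires_grad=True)   # start from a blank image
optimizer = torch.optim.Adam([x], lr=0.05)

for _ in range(200):
    optimizer.zero_grad()
    activation = model(x)[0, unit]                  # response of the chosen unit
    loss = -activation + 1e-3 * x.norm()            # maximise activation, mild regularisation
    loss.backward()
    optimizer.step()

# x.detach() now approximates the input pattern that drives this unit most strongly,
# i.e. a visualisation of the feature the unit encodes.
```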
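The systematic, automatable audits mentioned for the third wave can be made concrete with a deliberately simplified sketch. Assume that a concept-labelling method (such as Semantic Lens) has already assigned human-readable concept descriptions to the units a skin-lesion classifier relies on; one can then check automatically whether all five ABCDE criteria (asymmetry, border, colour, diameter, evolution) are actually represented. The labels and keyword matching below are invented for illustration and are not taken from the whitepaper or from Semantic Lens itself.

```python
# Hypothetical concept labels per unit, e.g. the output of a concept-labelling
# step; all labels here are invented for illustration.
unit_concepts = {
    17: "lesion asymmetry",
    42: "irregular border",
    63: "color variegation",
    88: "hair / ruler artifact",   # a spurious feature an audit should surface
}

# The dermatological ABCDE criteria the model is expected to rely on,
# each with simple keywords used for matching.
abcde = {
    "A (asymmetry)": ["asymmetry"],
    "B (border)":    ["border"],
    "C (color)":     ["color", "colour"],
    "D (diameter)":  ["diameter", "size"],
    "E (evolution)": ["evolution", "change"],
}

coverage = {
    criterion: [u for u, label in unit_concepts.items()
                if any(k in label.lower() for k in keywords)]
    for criterion, keywords in abcde.items()
}

for criterion, units in coverage.items():
    print(f"{criterion}: {'covered by units ' + str(units) if units else 'NOT covered'}")
# Criteria without supporting units (here D and E) hint that the model may not
# actually follow the full ABCDE rule.
```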

Using seven different personas, from an AI specialist to a lay end user, the white paper illustrates how widely individual characteristics (such as prior knowledge and goals), as well as suitable explanation formats and methods, can vary between target groups.

To advance the development of this AI technology and fully realize its potential, the paper proposes both general and target group-specific design options. For example, research should enhance established methods and further develop explainable AI (XAI) techniques for emerging types of AI. Possible innovations include tools to inspect and control large AI models or standard toolkits for correcting models without retraining them. In academia, XAI should be more firmly integrated into AI and data science degree programs as an engineering tool within the broader field of AI engineering. Businesses could increasingly adopt XAI, for example, to reduce internal communication barriers and to differentiate themselves from competitors.

The publication in detail (in German only): Samek, W., Schmid, U. et al. (2025): Nachvollziehbare KI: Erklären, für wen, was und wofür.