Explanation-Aware Attacks and the Limits of using XAI

June 16, 2026 14:00 - 15:00

BIFOLD, 7th Floor, Room 701, Franklinstr. 28/29, 10587 Berlin

Christian Wressnegger

Modern (deep) learning methods used to lack understandable reasoning in their decision process, making crucial decisions less trustworthy. The research field of "Explainable AI" (XAI) has turned the tables, though, enabling precise relevance attribution of input features for otherwise opaque models. Among many prospects in computer science, this progress has raised expectations that these techniques can also benefit the defense against attacks on computer systems and even on machine learning models themselves. However, so-called explanation-aware attacks allow an adversary to manipulate an ML model's decision and the output of XAI techniques simultaneously, calling into question the applicability of AI/ML in safety- and security-critical applications. This talk explores the prospects and limits of XAI, demonstrating where it can and cannot (yet) be used reliably.

Bio

Christian Wressnegger is a Professor of Computer Science at Karlsruhe Institute of Technology (KIT), heading the "Artificial Intelligence and Security" research group. Additionally, he serves as the spokesperson for the "KIT Graduate School Cyber Security," is the co-director of the "KASTEL Security Research Labs," one of three competence centers for cybersecurity in Germany, and is a Junior Fellow of the Berlin Institute for the Foundations of Learning and Data (BIFOLD). He holds a Ph.D. from TU Braunschweig and graduated from Graz University of Technology, where he majored in computer science. His research revolves around combining AI/ML and computer security.