
Explaining the Decisions of Convolutional and Recurrent Neural Networks

In: Mathematical Aspects of Deep Learning

Wojciech Samek
Leila Arras
Ahmed Osman
Grégoire Montavon
Klaus-Robert Müller

November 01, 2022

The ability to explain and understand the prediction behaviour of complex machine learning (ML) models such as deep neural networks is of great interest to developers, users and researchers. It allows them to verify the system’s decision making and gain new insights into the data and the model, including the detection of malfunctioning. Moreover, it can help to improve the overall training process, e.g., by removing detected biases. However, due to the high complexity and deeply nested structure of deep neural networks, such interpretations are non-trivial to obtain for most of today’s models. This chapter describes Layer-wise Relevance Propagation (LRP), a propagation-based explanation technique that can explain the decisions of a variety of ML models, including state-of-the-art convolutional and recurrent neural networks. As the name suggests, LRP implements a propagation mechanism that redistributes the prediction outcome from the output to the input, layer by layer through the network. Mathematically, the LRP algorithm can be embedded into the framework of Deep Taylor Decomposition, and the propagation process can be interpreted as a succession of first-order Taylor expansions performed locally at each neuron. The result of the LRP computation is a heatmap visualizing how much each input variable (e.g., pixel) has contributed to the prediction. This chapter will discuss the algorithmic and theoretical underpinnings of LRP, apply the method to a complex model trained for the task of Visual Question Answering (VQA), and demonstrate that it produces meaningful explanations, revealing interesting details about the model’s reasoning. We conclude the chapter by commenting on the general limitations of current explanation techniques and on interesting future directions.
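To make the layer-by-layer redistribution concrete, the following is a minimal NumPy sketch of one common propagation rule, LRP-ε, applied to fully connected layers. It is an illustrative toy, not the chapter's implementation: the function name `lrp_epsilon`, the two-layer ReLU network, and the choice of stabilizer `eps` are assumptions made for the example.

```python
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """Redistribute relevance R_out from a dense layer's outputs to its
    inputs with the LRP-epsilon rule:
        R_j = sum_k (a_j * W[j, k]) / (z_k + eps * sign(z_k)) * R_out[k],
    where z_k = sum_j a_j * W[j, k] + b[k] is the layer pre-activation.

    a     : (d_in,)        input activations of the layer
    W     : (d_in, d_out)  weight matrix
    b     : (d_out,)       bias vector
    R_out : (d_out,)       relevance arriving at the layer's outputs
    """
    z = a @ W + b                               # forward pre-activations
    z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabilizer avoids division by zero
    s = R_out / z                               # relevance per unit of pre-activation
    return a * (W @ s)                          # contribution of each input

# Toy usage: explain a two-layer ReLU network top-down (hypothetical data).
rng = np.random.default_rng(0)
x = rng.random(4)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(3)
W2, b2 = rng.standard_normal((3, 2)), np.zeros(2)
h = np.maximum(0, x @ W1 + b1)
y = h @ W2 + b2

R = np.zeros_like(y)
R[np.argmax(y)] = y.max()          # start from the predicted class score
R = lrp_epsilon(h, W2, b2, R)      # output layer -> hidden layer
R = lrp_epsilon(x, W1, b1, R)      # hidden layer -> input "heatmap"
print(R)                           # per-input relevance scores
```

The key property illustrated here is conservation: up to the small relevance absorbed by the stabilizer and the bias terms, the relevance entering a layer equals the relevance leaving it, so the output score is redistributed rather than recomputed.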