Scalable Computation of Shapley Additive Explanations

Louis Le Page

Christina Dionysio

Matthias Boehm

2025

Abstract: The growing field of explainable AI (XAI) develops methods that help better understand ML model predictions. SHapley Additive exPlanations (SHAP) is a widely used, model-agnostic method for explaining predictions, but it carries a substantial computational burden, particularly for complex models and large datasets with many features. The key challenge, so far unaddressed, lies in efficiently scaling these computations without compromising accuracy. In this paper, we present a scalable, model-agnostic SHAP sampling framework on top of Apache SystemDS. We leverage Antithetic Permutation Sampling for its efficiency and optimization potential, and we devise a carefully vectorized and parallelized implementation for local and distributed operations. Compared with the state-of-the-art Python SHAP package, our solutions yield similar accuracy but achieve substantial speedups of up to 14× for multi-threaded single-node operations and up to 35× for distributed Spark operations (on a small 8-node cluster).
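
To illustrate the sampling idea named in the abstract, the following is a minimal, sequential NumPy sketch of antithetic permutation sampling for a single instance. It is not the paper's SystemDS implementation and is neither vectorized nor parallelized; the function name `antithetic_permutation_shap`, its parameters, and the use of a background dataset to stand in for absent features are assumptions made for illustration.

```python
import numpy as np


def antithetic_permutation_shap(predict, x, background, n_pairs=50, seed=0):
    """Estimate SHAP values for one instance x (hypothetical helper).

    predict:    callable mapping an (n, d) array to n predictions
    x:          1-D array of length d, the instance to explain
    background: (m, d) array used to fill in features that are "absent"
    """
    rng = np.random.default_rng(seed)
    n_features = x.shape[0]
    phi = np.zeros(n_features)
    for _ in range(n_pairs):
        perm = rng.permutation(n_features)
        # Antithetic pairing: each sampled permutation is also walked in reverse.
        for order in (perm, perm[::-1]):
            masked = background.copy()       # no feature of x revealed yet
            prev = predict(masked).mean()
            for j in order:
                masked[:, j] = x[j]          # reveal feature j of x
                curr = predict(masked).mean()
                phi[j] += curr - prev        # marginal contribution of feature j
                prev = curr
    # Average over all 2 * n_pairs permutation walks.
    return phi / (2 * n_pairs)
```

Because each walk moves from the all-background prediction to the prediction for `x`, the estimated values in this sketch sum to the difference between the model output for `x` and the mean output over the background data. Each walk costs one model call per feature, which is the cost that a vectorized and parallelized implementation aims to reduce.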

Keywords: Explainable AI, SHAP Values, Interpretability, Parallelization, Vectorization, Antithetic Permutation Sampling

https://mboehm7.github.io/resources/btw2025c.pdf

BIFOLD AUTHORS

Prof. Dr. Matthias Böhm