Improving Malware Detection with Explainable Machine Learning

Michele Scalas

Konrad Rieck

Giorgio Giacinto

March 17, 2023

Machine learning is used to address several detection and classification tasks in cybersecurity. Typically, detectors are modeled through complex learning algorithms that employ a wide variety of features, ranging from low-level machine code to statistical measures. Although these models achieve considerable performance, gaining insight into what they have learned turns out to be a hard task. Such insight would help to capture the essential malicious components of a modern attack, which are usually hidden and obfuscated under potentially legitimate sequences of instructions. These challenges can be addressed by employing explainable machine learning. In particular, explanations can help human experts to develop novel approaches for the static and dynamic analysis of applications by focusing on the distinctive features that characterize malware. From this perspective, we focus on these challenges and on the potential uses of explainability techniques in the context of Android ransomware, which represents a serious threat to mobile platforms. We present an approach that enables the identification of the most influential features and the analysis of ransomware. We point out how explanations can be used to answer different questions depending on specific aspects, such as the explanation baselines considered. Our results suggest that our proposal can help cyber threat intelligence teams in the early detection of new ransomware families and could be extended to other types of malware.

https://doi.org/10.1016/B978-0-32-396098-4.00017-X

BIFOLD AUTHORS

Prof. Dr. Konrad Rieck