
Explainable AI-based Framework for Malware Detection


Aniket Arjun Kangane

05/09/2024

Supervised by Tingting Li; Moderated by Alexia Zoumpoulaki

With the increasing complexity and frequency of cyberattacks, AI-based solutions for malware detection have become more prevalent. However, these solutions often rely on black-box models that are difficult to understand and therefore difficult to trust, which makes accountability and transparency hard to provide.

The project will use machine learning algorithms to develop an AI-based framework for malware detection. The framework will be trained on a dataset of known malware samples, using features extracted from those samples. The project will explore methods, such as LIME, for explaining which features are most salient when a malware sample is classified into a specific class. It will also explore methods for explaining misclassified malware samples, including examining how features differ between correctly classified and misclassified samples.
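To make the idea of "salient features" concrete, here is a minimal sketch of a perturbation-based explanation for a single sample. It is not LIME itself: LIME fits a local surrogate model over many perturbed samples, whereas this simplified stand-in removes one feature at a time and records how much the predicted malware probability drops. The toy logistic model, its weights, and the feature names (`perm_SEND_SMS` and so on) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy model: a logistic score over binary features.
# Feature names and weights are invented for illustration only.
WEIGHTS = {
    "perm_SEND_SMS": 2.0,
    "api_getDeviceId": 1.0,
    "perm_INTERNET": 0.5,
    "uses_camera": -0.5,
}
BIAS = -1.5


def predict_malware_probability(sample):
    """Return P(malware) for a dict of binary features (toy logistic model)."""
    z = BIAS + sum(WEIGHTS[name] * sample.get(name, 0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_saliency(sample, predict):
    """Rank the features present in `sample` by the drop in the prediction
    when each one is switched off on its own."""
    baseline = predict(sample)
    scores = {}
    for name, value in sample.items():
        if value == 0:
            continue  # absent features cannot be removed
        perturbed = dict(sample)
        perturbed[name] = 0
        scores[name] = baseline - predict(perturbed)
    # Largest absolute change first
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


sample = {
    "perm_SEND_SMS": 1,
    "api_getDeviceId": 1,
    "perm_INTERNET": 1,
    "uses_camera": 0,
}
ranking = occlusion_saliency(sample, predict_malware_probability)
for name, delta in ranking:
    print(f"{name}: {delta:+.3f}")
```

In a real framework the `predict` argument would be the trained classifier's probability function, and a library such as LIME or SHAP would replace the one-feature-at-a-time loop.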

Expected Objectives:
- (Essential) Conduct a literature review to identify existing research on explainable malware detection and highlight gaps in the current literature.
- (Essential) Develop methods to identify the most salient features used in classifying malware samples into specific classes, for example by applying LIME or SHAP to the trained ML models.
- (Desirable) Explore methods for explaining misclassified malware samples, including examining the differences in features between correctly classified and misclassified samples.
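The comparison of correctly classified and misclassified samples could be approached in many ways. The sketch below shows one simple possibility under stated assumptions: features are binary, and the quantity compared is how often each feature occurs in each group. The function names, feature names, and toy labels are all hypothetical and do not come from the project.

```python
def feature_rate(samples, name):
    """Fraction of samples in which the binary feature `name` is present."""
    if not samples:
        return 0.0
    return sum(s.get(name, 0) for s in samples) / len(samples)


def compare_groups(samples, y_true, y_pred):
    """Return (feature, gap) pairs, where gap is the feature's rate among
    misclassified samples minus its rate among correctly classified ones."""
    correct = [s for s, t, p in zip(samples, y_true, y_pred) if t == p]
    wrong = [s for s, t, p in zip(samples, y_true, y_pred) if t != p]
    names = sorted({n for s in samples for n in s})
    gaps = {n: feature_rate(wrong, n) - feature_rate(correct, n) for n in names}
    # Features with the largest absolute gap come first
    return sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data: four malware samples, two of which the model missed.
samples = [
    {"perm_SEND_SMS": 1, "obfuscated_strings": 0},
    {"perm_SEND_SMS": 1, "obfuscated_strings": 0},
    {"perm_SEND_SMS": 0, "obfuscated_strings": 1},
    {"perm_SEND_SMS": 0, "obfuscated_strings": 1},
]
y_true = [1, 1, 1, 1]
y_pred = [1, 1, 0, 0]

gaps = compare_groups(samples, y_true, y_pred)
for name, gap in gaps:
    print(f"{name}: {gap:+.2f}")
```

A positive gap marks a feature that is over-represented among the misclassified samples, which is a starting point for investigation and not evidence of cause on its own.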

Conclusion: This project will contribute to the development of explainable and accountable AI-based solutions for malware detection. The framework developed in this project will help to improve the trust and understanding of AI-based solutions in the cybersecurity industry. The findings of this project will also provide guidance for future research in explainable AI-based cybersecurity solutions.

[1] Daniel Arp, Michael Spreitzenbarth, Malte Huebner, Hugo Gascon, and Konrad Rieck, "Drebin: Efficient and Explainable Detection of Android Malware in Your Pocket", 21st Annual Network and Distributed System Security Symposium (NDSS), February 2014.
[2] Lime: Explaining the predictions of any machine learning classifier, https://github.com/marcotcr/lime

