Malware threats are increasing rapidly. Because internet use is now nearly universal, more users are exposed to cyberattacks. Traditional detection methods cannot identify malware accurately, so this research builds a machine learning solution for detecting malicious software and uses Explainable AI to interpret why the classifier made a given decision. The dataset is preprocessed in several steps: Min-Max scaling for feature scaling, Minimum Redundancy Maximum Relevance (MRMR) for feature selection, and Principal Component Analysis (PCA) for dimensionality reduction. Multiple classifiers are then trained on the preprocessed data: Naive Bayes, AdaBoost, Logistic Regression, Decision Tree, Random Forest, Long Short-Term Memory (LSTM), and XGBoost. Among these, XGBoost achieved the highest accuracy, 99.825%. For model interpretability, we used the Explainable AI methods LIME and SHAP to identify which features were responsible for an instance being classified as malware or benign.
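The first two preprocessing steps can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the feature names and toy values are hypothetical, and the selection step uses absolute Pearson correlation as a stand-in for the relevance and redundancy measures (MRMR is often formulated with mutual information instead). PCA and the classifiers are omitted.

```python
from statistics import mean

def min_max_scale(column):
    """Rescale a list of numbers to [0, 1]; constant columns map to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def abs_pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5

def mrmr_select(columns, labels, k):
    """Greedy MRMR-style selection: at each step pick the feature with the
    highest (relevance to labels) minus (mean redundancy with chosen features)."""
    relevance = {name: abs_pearson(col, labels) for name, col in columns.items()}
    selected, remaining = [], list(columns)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = mean(
                abs_pearson(columns[name], columns[s]) for s in selected
            )
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data (assumption, not from the paper): 1 = malware, 0 = benign.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
raw = {
    "api_calls": [10, 12, 11, 80, 15, 85, 90, 95],
    "api_calls_x2": [20, 24, 22, 160, 30, 170, 180, 190],  # duplicate signal
    "section_entropy": [3.0, 3.2, 7.0, 3.1, 7.1, 3.3, 7.2, 6.9],
    "file_size": [5, 5, 5, 5, 5, 5, 5, 5],  # constant, carries no information
}

scaled = {name: min_max_scale(col) for name, col in raw.items()}
chosen = mrmr_select(scaled, labels, k=2)
print(chosen)  # ['api_calls', 'section_entropy']
```

In this toy case the redundancy penalty is what keeps the duplicated column out: `api_calls_x2` is as relevant as `api_calls`, but it is perfectly correlated with the feature already selected, so the less redundant `section_entropy` is chosen instead.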
Masud et al. studied this question.