This study evaluates feature extraction techniques and classifiers for detecting SMS spam. We compared six classifiers (Naive Bayes, K-Nearest Neighbors, Support Vector Machines, Linear Discriminant Analysis, Decision Trees, and Deep Neural Networks), each trained on both bag-of-words and TF-IDF features. TF-IDF consistently outperformed bag-of-words. Naive Bayes achieved the highest accuracy (96.2%) and strong precision on non-spam messages (0.976). Support Vector Machines (94.5% accuracy) and Deep Neural Networks (91.0% accuracy) also performed well, whereas K-Nearest Neighbors, Linear Discriminant Analysis, and Decision Trees were less effective. These findings indicate that TF-IDF paired with Naive Bayes, Support Vector Machines, or Deep Neural Networks was the most effective of the combinations tested for SMS spam detection.
Ahmadi et al. studied this question.
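
To make the difference between the two feature representations concrete, the following is a minimal sketch in plain Python of how bag-of-words counts and TF-IDF weights could be computed for a handful of messages. The toy messages, the whitespace tokenizer, and the function names are all hypothetical, and the smoothed IDF formula is one common variant; the study may have used different preprocessing and weighting.

```python
import math
from collections import Counter

# Hypothetical toy messages for illustration; not taken from the study's data.
messages = [
    "win a free prize now",
    "free entry win cash",
    "are you free for lunch",
    "see you at lunch",
]


def tokenize(text):
    # Assumption: lowercase and split on whitespace.
    return text.lower().split()


def bag_of_words(docs):
    """Return one Counter of raw term counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return one dict of TF-IDF weights per document.

    TF is the term count divided by the document length. IDF uses the
    smoothed form ln((1 + N) / (1 + df)) + 1, one common variant.
    """
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for c in counts for term in c)
    weighted = []
    for c in counts:
        length = sum(c.values())
        weighted.append({
            term: (n / length)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, n in c.items()
        })
    return weighted


bow = bag_of_words(messages)
weights = tf_idf(messages)

# In the first message, "free" and "prize" have the same raw count,
# but TF-IDF down-weights "free" because it occurs in more messages.
print(bow[0]["free"], bow[0]["prize"])
print(weights[0]["free"] < weights[0]["prize"])
```

Bag-of-words treats every occurrence equally, while TF-IDF reduces the weight of terms that appear across many messages. That reweighting is a plausible reason for the gap reported above, though the abstract itself does not state the cause.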