Classifying short message service (SMS) spam is critical for identifying unauthorized and potentially harmful messages, especially given the increasing number of crimes associated with such communications. This study compares the effectiveness of Large Language Models (LLMs) with traditional machine-learning techniques in SMS spam classification. The results demonstrate that LLMs outperform commonly used traditional methods, including Support Vector Machine (SVM), Decision Tree (DT), and Naïve Bayes (NB), setting this research apart from prior work. To ensure robust evaluation, this study uses a comprehensive dataset of diverse SMS spam samples and applies preprocessing techniques such as tokenization, case transformation, and English stopword filtering. Three LLMs (Phi-3.5 Classifier, H2O-Danube, and DistilBERT) were fine-tuned to optimize performance. Experimental results showed that the Phi-3.5 Classifier and H2O-Danube achieved identical scores of 99% for accuracy, precision, recall, and F1-score. DistilBERT also performed exceptionally well, reaching 99% across the same metrics. These results surpass those obtained from the traditional machine-learning models, highlighting the superior accuracy of LLMs in spam classification. The findings have implications for integrating LLMs to enhance sentiment analysis, improve spam detection systems, and establish performance benchmarks for LLM-based sentiment analysis in SMS spam detection, which can in turn strengthen SMS communication security and increase the overall efficiency of spam mitigation strategies.
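The preprocessing steps and evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the study's implementation: the stopword list is a short hypothetical stand-in, the regular-expression tokenizer is an assumed choice, and the function names `preprocess` and `binary_metrics` are introduced here only for illustration.

```python
import re

# Hypothetical, abbreviated English stopword list for illustration only;
# the list actually used in the study is not given in the text.
STOPWORDS = {"a", "an", "the", "to", "is", "are", "you", "your",
             "for", "of", "and", "in", "on"}


def preprocess(message):
    """Apply case transformation, tokenization, and stopword filtering."""
    lowered = message.lower()                    # case transformation
    tokens = re.findall(r"[a-z0-9]+", lowered)   # tokenization (assumed regex)
    return [t for t in tokens if t not in STOPWORDS]  # stopword filtering


def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(preprocess("You are a WINNER! Claim your FREE prize"))
    # Toy labels, not data from the study.
    print(binary_metrics(["spam", "ham", "spam", "ham"],
                         ["spam", "ham", "ham", "ham"]))
```

Treating spam as the positive class is itself an assumption; the text does not say whether the reported precision, recall, and F1-score are per-class or averaged.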
Li et al. (Thu,) studied this question.