Breast cancer detection remains an important global health issue, particularly in patients with dense breast tissue. Despite the wide use of machine learning for breast cancer diagnosis, many existing studies rely on single-model evaluations and lack rigorous statistical validation, and transformer-based models remain insufficiently explored for structured tabular biomedical data. Using the Wisconsin Breast Cancer Diagnostic (WBCD) dataset, this study presents a comparative analysis of machine learning, deep learning, and transformer-based models for binary classification of breast masses as benign or malignant. A robust preprocessing and feature engineering pipeline was applied, and Support Vector Machine (SVM), Random Forest (RF), Multi-Layer Perceptron (MLP), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and BERT-based models were evaluated using a stratified 66–34% train–test split and stratified 5-fold cross-validation. Performance was assessed using standard classification metrics, the Jaccard index, AUC-ROC, the false negative rate, and statistical significance testing (paired t-tests and Bayesian analysis). Three-layer RNN and LSTM models achieved the highest accuracy (98.25%), a statistically significant improvement over SVM (p = 0.042), while the optimized SVM demonstrated strong performance with superior computational efficiency. BERT-based models yielded lower accuracy (92.98%), reflecting a domain mismatch between language-pretrained transformers and numerical tabular data. These findings indicate that deeper recurrent architectures provide superior diagnostic performance, while SVM remains a practical and efficient alternative for clinical deployment.
Javed et al. (Sun,) studied this question.
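The evaluation measures named in the abstract (accuracy, false negative rate, Jaccard index, and a paired t-test over cross-validation folds) can be illustrated with a minimal, standard-library-only sketch. The function names `binary_metrics` and `paired_t_statistic` are hypothetical and are not taken from the study. Malignant is assumed to be the positive class, and any counts or fold scores passed in are for illustration only; they are not results from the paper.

```python
import math
from statistics import mean, stdev


def binary_metrics(tp, fp, tn, fn):
    """Compute metrics from confusion-matrix counts.

    Assumption: the positive class is "malignant", so a false negative
    is a malignant mass predicted as benign.
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of malignant cases that were missed.
        "false_negative_rate": fn / (fn + tp),
        # Jaccard index for the positive class: TP / (TP + FP + FN).
        "jaccard": tp / (tp + fp + fn),
    }


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Each list holds one score per cross-validation fold, with fold i of
    model A paired against fold i of model B.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models need one score per fold")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Standard error of the mean difference, using the sample std dev.
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1
```

With 5-fold cross-validation the test has 4 degrees of freedom. Turning the t statistic into a p-value requires the Student t distribution, which the standard library does not provide; if SciPy is available, `scipy.stats.ttest_rel` performs the same paired test and reports the p-value directly.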