What question did this study set out to answer?

March 14, 2026

Radiomics-Based Machine Learning Models for Classifying Breast Cancer on Dynamic-Contrast Enhanced MRI through Multi-Observer Analysis

Key Points

This research aims to evaluate the effectiveness of radiomic features and machine learning in classifying breast cancer.
Manual segmentation of DCE-MRI images by four radiologists
Extraction of 107 radiomic features from each lesion
Feature selection using LASSO regression with cross-validation
Evaluation of nine machine learning models for classification
CatBoost achieved the highest AUC of 0.937 with high sensitivity and specificity
Random Forest and Naïve Bayes also performed well but were less effective than CatBoost
ICC values indicated excellent reliability among radiomic feature categories
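
The inter-observer reliability check mentioned in the last key point can be illustrated for a single radiomic feature. This is a minimal sketch, not the study's code: it assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), because the summary does not state which ICC form was used. The function name `icc2_1` and the example ratings are hypothetical.

```python
from statistics import mean


def icc2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per rater.

    Assumption: two-way random effects, absolute agreement, single rater.
    The source does not say which ICC form the authors computed.
    """
    n = len(ratings)        # number of lesions (subjects)
    k = len(ratings[0])     # number of raters (four radiologists in the study)

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between lesions
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical values of one feature for three lesions, each segmented by four raters
example = [
    [10.1, 10.3, 9.9, 10.2],
    [15.0, 14.8, 15.3, 15.1],
    [7.2, 7.0, 7.4, 7.1],
]
feature_icc = icc2_1(example)
```

In practice the function would be applied once per feature, and features with low agreement would be dropped before modelling.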

Abstract

Introduction: Breast cancer remains a significant global health issue, necessitating rapid alternative diagnostic methods to improve survival rates. This study aimed to evaluate observer performance in the manual segmentation of Dynamic Contrast-Enhanced MRI (DCE-MRI) images and to assess the effectiveness of radiomic features and machine learning (ML) in classifying breast lesions as benign or malignant.

Methods: Breast lesions from 155 patients (65 benign, 90 malignant) were manually segmented on DCE-MRI images by four experienced radiologists using 3D Slicer (version 5.6.1). From each lesion, 107 radiomic features, including shape, first-order, and texture features, were extracted, yielding a high-dimensional dataset. All features were normalized using Z-score scaling. Feature selection was performed using LASSO regression with five-fold cross-validation. The dataset was divided into training and testing sets in a 70:30 ratio, and model performance was evaluated using five-fold cross-validation. The top 20 radiomic features were selected based on intraclass correlation coefficient (ICC) analysis to ensure feature stability. Nine machine learning models (CatBoost, Random Forest, XGBoost, AdaBoost, Naïve Bayes, Logistic Regression, k-NN, SVC, and MLP) were employed for classification. Hyperparameter tuning was applied to optimize model performance, and SHapley Additive exPlanations (SHAP) were used to identify key predictive features.

Results: ICC values ranged from 0.941 to 0.992 (95% CI), demonstrating excellent reliability across all radiomic feature categories. CatBoost outperformed the other models, with an AUC of 0.937 (95% CI: 0.852-0.993), a sensitivity of 0.889, and a specificity of 0.909 on the internal test set. Other models, such as Random Forest (AUC: 0.881, 95% CI: 0.758-0.972) and Naïve Bayes (AUC: 0.843, 95% CI: 0.707-0.949), performed well but were less effective than CatBoost. SHAP analysis showed that several radiomic features were significant in distinguishing malignant lesions.

Discussion: Ensemble-based models generally outperformed traditional classifiers, such as Logistic Regression and k-NN, possibly because they can capture non-linear relationships in the dataset. SHAP analysis supported model interpretability by identifying the features that contributed most to the classification task.

Conclusion: This study demonstrates the potential of integrating radiomic features with ML for breast cancer classification. CatBoost exhibited the highest predictive performance, highlighting its effectiveness in distinguishing malignant from benign lesions.
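
The reported test-set metrics (AUC with a 95% CI, sensitivity, specificity) can be illustrated with a short sketch. This is not the authors' code. The abstract does not say how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption, as are the 0.5 decision threshold, the function names, and the toy labels and scores.

```python
import random


def auc(y_true, scores):
    """AUC as the chance that a malignant case (1) outscores a benign case (0).

    Ties between a malignant and a benign score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity at an assumed probability threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_auc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not from the source)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical test-set labels (1 = malignant) and predicted probabilities
y_test = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
p_test = [0.10, 0.40, 0.35, 0.80, 0.90, 0.20, 0.70, 0.55, 0.65, 0.95]

test_auc = auc(y_test, p_test)
sens, spec = sensitivity_specificity(y_test, p_test)
ci_low, ci_high = bootstrap_auc_ci(y_test, p_test)
```

With a test set of roughly 47 lesions (30% of 155), bootstrap intervals are necessarily wide, which is consistent with the width of the intervals reported above.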
