What question did this study set out to answer?

The aim is to develop an optimized Random Forest algorithm for accurately diagnosing breast cancer.

April 21, 2026 | Open Access

Development and Optimization of Random Forest Algorithm for Breast Cancer Diagnosis

Key Points

Used grid search to fine-tune the Random Forest hyperparameters.
Tuned the number of trees, the maximum tree depth, the minimum number of samples required to split an internal node, and the minimum number of samples required at a leaf.
Trained the model on breast cancer data using the optimized parameters.
Achieved 99.12% accuracy in classifying breast cancer as benign or malignant.
Reported higher accuracy than previously published results and established algorithms.
Demonstrated robust performance in identifying patterns within breast cancer datasets.

Abstract

Breast cancer is a pervasive global health challenge. It often manifests as small masses or tumors within breast tissue, commonly in the milk ducts and glands, and its global incidence, especially among women, has been rising steadily. Timely detection is pivotal to successful treatment and to preventing deaths. Advances in machine learning have produced a variety of models for timely diagnosis, which makes the precision and accuracy of those models paramount. This study fine-tunes random forest hyperparameters with grid search to build a high-accuracy model that classifies breast tumors as benign or malignant. The optimal settings identified by the grid search are: number of trees in the forest (150), maximum tree depth (None, meaning no depth limit), minimum number of samples required to split an internal node (2), and minimum number of samples required at a leaf node (1), with the seed for random number generation set to 123. A random forest trained with these parameters achieved 99.12% accuracy in breast cancer classification. The hyperparameter-optimized Random Forest classifier surpasses previously reported results and outperforms established algorithms. This result underscores the robustness and efficacy of the proposed model in discerning intricate patterns within breast cancer datasets and contributes to the ongoing pursuit of precise and effective breast cancer classification.
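
The tuning procedure described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract names neither the library nor the dataset, so the sketch assumes scikit-learn, its bundled Wisconsin Diagnostic dataset (load_breast_cancer), an 80/20 stratified hold-out split, and 5-fold cross-validation. The candidate values in param_grid are hypothetical; only the reported optimum (150 trees, no depth limit, split threshold 2, leaf threshold 1, seed 123) comes from the abstract.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: scikit-learn's bundled Wisconsin Diagnostic dataset stands in
# for the "breast cancer data" that the abstract does not name.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: an 80/20 stratified hold-out split; the abstract does not
# describe the evaluation protocol.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=123
)

# Hypothetical search grid. It includes the values reported in the abstract,
# but the alternative candidates are illustrative only.
param_grid = {
    "n_estimators": [50, 150],
    "max_depth": [None, 10],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
}

# Exhaustively evaluate every combination with 5-fold cross-validation
# (the fold count is an assumption), then refit the best one on all
# training data. The seed is fixed for reproducibility, not searched over.
search = GridSearchCV(
    RandomForestClassifier(random_state=123),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
# predict() uses the refitted best estimator.
test_accuracy = accuracy_score(y_test, search.predict(X_test))

print("best parameters:", best_params)
print(f"held-out accuracy: {test_accuracy:.4f}")
```

Because the dataset, split, and grid here are assumed, the selected parameters and the accuracy this sketch prints need not match the reported 99.12%.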
