Breast cancer is one of the most significant health issues affecting women worldwide. Accurate diagnosis and prognosis are vital to patient outcomes, improving survival rates and reducing mortality. Because numerous factors are involved in diagnosing this disease, the conventional diagnostic process is expensive, exhausting, and time-consuming. This research analyzed the Wisconsin Diagnostic Breast Cancer dataset (WDBC) and the Wisconsin Prognostic Breast Cancer dataset (WPBC) using machine learning algorithms (MLA). The datasets contain cell nucleus attributes collected through Fine Needle Aspiration (FNA) of suspicious breast tissue specimens. The features of the FNA-collected specimens were extracted through Digital Image Processing (DIP). The datasets were transformed into a lower dimension using Principal Component Analysis (PCA) and then used to train MLA such as Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Linear Discriminant Analysis (LDA), Decision Tree (DT), Logistic Regression (LR), and Gaussian Naive Bayes (GNB). The models were evaluated on performance parameters including accuracy, precision, recall, F1 score, positive predictive value, negative predictive value, and R2 score. Finally, based on the results, the best-performing model was deployed as a Web-Application Programming Interface (Web-API) using Microsoft Azure ML Studio.
Sankardoss et al. studied this question.
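The workflow summarized above (dimensionality reduction with PCA, training of six classifiers, and comparison on several performance parameters) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. It assumes the copy of WDBC bundled with scikit-learn, and the feature standardization, the five principal components, the 80/20 stratified split, and all hyperparameters are illustrative choices that the abstract does not specify. Malignant is treated as the positive class, which is also an assumption.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# scikit-learn ships WDBC with 0 = malignant, 1 = benign.
# Flip the labels so that 1 = malignant is the positive class (assumption).
X, y = load_breast_cancer(return_X_y=True)
y = 1 - y

# Illustrative 80/20 stratified split; the abstract does not state the protocol.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The six classifier families named in the abstract, with assumed settings.
classifiers = {
    "SVM": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "LDA": LinearDiscriminantAnalysis(),
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "GNB": GaussianNB(),
}


def evaluate(clf, n_components=5):
    """Standardize, project onto principal components, fit, and score."""
    model = make_pipeline(StandardScaler(), PCA(n_components=n_components), clf)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        # Positive and negative predictive values from the confusion matrix.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
    }


results = {name: evaluate(clf) for name, clf in classifiers.items()}

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Wrapping the scaler and PCA in a pipeline fits both on the training split only, so no information from the test split leaks into the projection. Positive predictive value coincides with precision for the chosen positive class, which is why the two columns match in the printed output.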