Timely diagnosis of colorectal cancer (CRC) is crucial to reducing global cancer deaths, and an automated prediction system that detects CRC at an early stage would significantly benefit physicians. However, existing AI-based diagnostic models rely primarily on imaging data, and their effectiveness is inconsistent when applied to clinical data. This study proposes a customized one-dimensional residual network (ResNet51-Conv1D) adapted to structured clinical data, which learns hierarchical feature associations from the large-scale structured PLCO dataset. The data comprise 154,892 participants aged 55 to 74 (76,679 men and 78,213 women). The proposed model applies block segmentation to preserve the dependency between local features and employs two oversampling methods, SMOTE and ADASYN, to address class imbalance and improve the representation of minority classes. The effect of oversampling was assessed by comparing the model’s performance before and after oversampling. Before oversampling, CRC detection was poor (recall = 1.83%; F1-score = 2.50%). When block segmentation was combined with standard SMOTE and ADASYN oversampling, the sensitivity of CRC detection improved significantly. SMOTE with 40 and 50 segments performed best, with MCC values of 89.59% and 88.04%, balanced accuracy scores of 94.43% and 93.23%, and G-mean scores of 94.43% and 93.23%, respectively. ADASYN also improved the robustness of cancer detection, with an MCC of 83.33% and a G-mean of 90.41%. These results demonstrate that combining structured feature segmentation with imbalance-handling strategies improves model stability and minority-class detection, and indicate that the ResNet51-Conv1D framework is a reliable and efficient approach for early CRC detection from imbalanced structured clinical data.
Prasath et al. (Sat,) studied this question.
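For readers unfamiliar with the imbalance-aware metrics reported above (MCC, balanced accuracy, and G-mean), the following minimal sketch shows how each is commonly computed from a binary confusion matrix. The function name `imbalance_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Compute common imbalance-aware metrics from binary confusion-matrix counts.

    tp/fn: positive (e.g. CRC) cases predicted correctly / missed.
    fp/tn: negative cases predicted incorrectly / correctly.
    """
    sensitivity = tp / (tp + fn)  # recall on the minority (positive) class
    specificity = tn / (tn + fp)  # recall on the majority (negative) class

    # Balanced accuracy: arithmetic mean of the two per-class recalls.
    balanced_accuracy = (sensitivity + specificity) / 2

    # G-mean: geometric mean of the two per-class recalls.
    g_mean = math.sqrt(sensitivity * specificity)

    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "g_mean": g_mean,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the study).
metrics = imbalance_metrics(tp=90, fn=10, fp=20, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the geometric mean never exceeds the arithmetic mean, G-mean is at most the balanced accuracy, and the two coincide only when sensitivity equals specificity.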