Colorectal adenocarcinoma is caused in part by widespread epigenetic deregulation, yet genome-wide DNA methylation analysis of this cancer is complicated by spatial correlation among CpG sites, multiscale patterns of differential methylation, and confounding cellular heterogeneity in bulk tissue. This study develops a simple yet effective framework that combines rigorous statistical modelling with modern machine-learning-based prediction, applied to Illumina 450K data from The Cancer Genome Atlas (TCGA) colorectal adenocarcinoma cohort. In our framework, differentially methylated regions (DMRs) were first detected using functional smoothing and permutation-based bump hunting, revealing both focal CpG island hypermethylation and broad hypomethylated domains spanning hundreds of kilobases. Next, reference-based cell-type deconvolution and surrogate variable analysis (SVA) controlled for immune/stromal admixture and hidden confounding effects, yielding well-calibrated single-site and region-level inference. Then, for tumour-status prediction, we empirically compared a classical logistic regression model against random forest, gradient boosting (XGBoost), and a feed-forward neural network; logistic regression achieved the lowest root-mean-square error and Brier score, reflecting its superior probability calibration. Overall, our integrated framework provides biologically interpretable and highly predictive methylation signatures of colorectal cancer and offers a transferable baseline for future large-scale cancer epigenomics studies. Our code is publicly available at https://github.com/matekum/tcga-coad-methylation-ekum-2025a.
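As an illustration of the two calibration metrics named above, the following is a minimal sketch of how the Brier score and root-mean-square error can be computed for a binary tumour-status classifier. It is not taken from the authors' repository: the function names and the toy labels and probabilities are hypothetical, and RMSE is assumed here to be computed on predicted probabilities against 0/1 labels, in which case it is the square root of the Brier score.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 labels."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def rmse(y_true, y_prob):
    """Root-mean-square error of predicted probabilities (sqrt of the Brier score)."""
    return math.sqrt(brier_score(y_true, y_prob))


# Hypothetical toy data: 1 = tumour, 0 = normal (not from the TCGA cohort).
labels = [1, 1, 0, 0]
calibrated = [0.9, 0.8, 0.2, 0.1]      # confident and correct
overconfident = [1.0, 1.0, 0.0, 1.0]   # one confident mistake

for name, probs in [("calibrated", calibrated), ("overconfident", overconfident)]:
    print(name, brier_score(labels, probs), rmse(labels, probs))
```

Lower values are better for both metrics. Under this assumption the two always rank models in the same order, since one is a monotone transform of the other.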
Matthew Iwada Ekum
Christian Heumann
Eno Emmanuella Akarawak
Ekum et al. (Fri,) studied this question.
www.synapsesocial.com/papers/68f43f09854d1061a58ac9ab — DOI: https://doi.org/10.1101/2025.10.16.25338162