Abstract

Background: Breast cancer is the most commonly diagnosed cancer and a leading cause of cancer-related mortality among women in the United States. Early detection substantially improves survival, yet current imaging-based screening modalities are limited by reduced sensitivity in certain populations and high false-positive rates. Circulating non-coding RNAs (ncRNAs) represent stable and non-invasive biomarkers with significant potential to complement mammography and enhance diagnostic accuracy. This study aimed to develop and validate a robust circulating ncRNA-based signature for early breast cancer detection using an integrated machine learning framework.

Methods: Circulating ncRNA profiles were generated for 413 individuals (216 breast cancer; 197 non-cancer). A three-stage design (discovery, internal testing, and external validation) was implemented. In the discovery cohort (n=248), we evaluated a broad ensemble of 12 machine learning frameworks and 111 model combinations to establish a consensus-based diagnostic signature (ncRNASig) comprising 16 circulating ncRNAs. Model development emphasized stability, reproducibility, and cross-cohort generalizability. The resulting ncRNASig was evaluated in an independent internal testing cohort (n=175) and further validated using multiple external datasets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO).

Results: The 16-ncRNA diagnostic signature demonstrated high discriminatory power in the discovery cohort, achieving an AUC of 97.4% for distinguishing breast cancer from non-cancer samples. The ncRNASig also differentiated cancer from benign conditions (AUC = 96.1%) and cancer from normal controls (AUC = 100%). Subtype analyses showed consistently strong performance for Luminal A, Luminal B, HER2-enriched, and triple-negative breast cancers (AUCs 96-98%).
These results were reproducible in the internal testing cohort and across multiple independent external datasets, supporting the robustness and generalizability of the ncRNASig.

Conclusion: This study identifies and validates a circulating ncRNA-based signature with strong diagnostic performance across breast cancer subtypes and independent cohorts. The findings support the potential of integrating ncRNA-driven liquid biopsy assays with current screening approaches to enhance early breast cancer detection. Further development and clinical translation are warranted.

Citation Format: Yuanyuan Fu, Mayumi Jijiwa, Zhanwei Wang, Hua Yang, Youping Deng. Machine learning-derived circulating ncRNA signature for early detection of breast cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 2538.
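
Note on the AUC metric: the abstract reports all diagnostic performance as AUC values. As a minimal illustration of what that number measures, the sketch below computes an AUC as the fraction of (cancer, non-cancer) sample pairs in which the cancer sample receives the higher classifier score, with ties counted as one half. The function name `roc_auc` and the toy scores are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not say how the authors computed their AUCs.

```python
def roc_auc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: classifier outputs, higher meaning "more likely cancer".
    labels: 1 for cancer, 0 for non-cancer (hypothetical encoding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data (made up): three cancer and three non-cancer samples.
toy_scores = [0.90, 0.80, 0.35, 0.60, 0.30, 0.10]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 8 of 9 pairs ranked correctly
```

An AUC of 100%, as reported for cancer versus normal controls, corresponds to every cancer sample scoring above every control in this pairwise sense.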