Abstract

Background: Breast cancer is marked by significant and persistent racial and ethnic disparities in incidence and outcomes. For example, Black women are more likely to be diagnosed with triple negative breast cancer, the most aggressive subtype with the worst outcomes. Disparities in cancer are influenced by various factors, including social, environmental, genetic, and biological elements, although the relative contribution of each remains unclear. Limited progress has been made in understanding how these factors interact to drive the disproportionate disease burden, partly due to the lack of racial and ethnic diversity in research cohorts and the absence of heterogeneous data sources.

Methods: In this study, we utilized extensive data from the latest release of the All of Us Research Program (version 8) to examine the influence of potential risk factors on breast cancer prevalence. We employed a rigorous breast cancer phenotyping algorithm to define case (n=12265) and control (n=12265) groups among the program participants. Comprehensive data analysis abstracted risk-associated factors including demographic and clinical variables, such as body mass index (BMI), along with lifestyle indicators like smoking and drinking habits. Furthermore, various social determinants of health (SDoH) were assessed through geolocation-based indices such as the area deprivation index or validated survey-based measures. We developed predictive linear regression and extreme gradient boosting (XGBoost) models to evaluate how these factors independently and collectively influence breast cancer prevalence.

Results: Our feature importance ranking of XGBoost models quantifies how medical, lifestyle, and socioeconomic factors significantly influence breast cancer risk profiles and help differentiate between cases and controls. Key factors include breast cancer risk gene carrier status, genetic ancestry, BMI, smoking status, alcohol consumption habits, and certain SDoH indicators, each contributing variably to the breast cancer risk profile.

Discussion: We observed a multifaceted interaction of clinical and socioeconomic factors affecting breast cancer risk through the models we developed. This highlights the importance of integrated models that consider various factors to inform precise and effective breast cancer prevention strategies.

Citation Format: Yuewen Qi, Kyriaki Founta, Devin Gee, Kassidy Lundy-Perez, Nyasha Chambwe. Identifying breast cancer risk factors using comprehensive data from the All of Us research program [abstract]. In: Proceedings of the 18th AACR Conference on the Science of Cancer Health Disparities; 2025 Sep 18-21; Baltimore, MD. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2025;34(9 Suppl):Abstract nr B002.
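The abstract does not give implementation details for the feature importance ranking, so the following is only a minimal sketch of the general idea: fit a boosted tree ensemble that separates cases from controls, then rank features by how much each one's splits reduce the error. It uses a simplified pure-Python gradient boosting of one-split trees (stumps) as a stand-in for XGBoost, not XGBoost itself. The feature names (`bmi`, `smoker`, `noise`), the effect sizes, and the helper functions `make_synthetic_cohort`, `best_stump`, and `fit_boosted_stumps` are all hypothetical, and the data are simulated rather than drawn from All of Us.

```python
import math
import random

FEATURES = ["bmi", "smoker", "noise"]  # hypothetical feature names


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_cohort(n, seed=0):
    """Simulate case/control labels. Effect sizes are invented for illustration."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        bmi = rng.gauss(27.0, 5.0)
        smoker = 1.0 if rng.random() < 0.3 else 0.0
        noise = rng.gauss(0.0, 1.0)  # unrelated to the outcome by construction
        logit = 2.0 * (bmi - 27.0) / 5.0 + 0.8 * smoker - 0.4
        X.append([bmi, smoker, noise])
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


def best_stump(X, residuals):
    """Return (gain, feature, threshold, left_value, right_value) for the
    single split that most reduces the squared error of the residuals."""
    n = len(X)
    total = sum(residuals)
    base = total * total / n
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(n - 1):
            i, nxt = order[k], order[k + 1]
            left_sum += residuals[i]
            if X[i][j] == X[nxt][j]:
                continue  # cannot split between identical values
            n_left, n_right = k + 1, n - k - 1
            right_sum = total - left_sum
            gain = left_sum ** 2 / n_left + right_sum ** 2 / n_right - base
            if best is None or gain > best[0]:
                threshold = (X[i][j] + X[nxt][j]) / 2.0
                best = (gain, j, threshold,
                        left_sum / n_left, right_sum / n_right)
    return best


def fit_boosted_stumps(X, y, n_rounds=30, learning_rate=0.3):
    """Gradient boosting on log loss with stumps; returns the fitted stumps
    and a gain-based importance per feature, normalized to sum to 1."""
    scores = [0.0] * len(X)
    importance = [0.0] * len(X[0])
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log loss with respect to the current score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = best_stump(X, residuals)
        if stump is None:
            break
        gain, j, threshold, left_value, right_value = stump
        importance[j] += gain
        stumps.append((j, threshold, left_value, right_value))
        for i, row in enumerate(X):
            step = left_value if row[j] <= threshold else right_value
            scores[i] += learning_rate * step
    total_gain = sum(importance) or 1.0
    return stumps, [g / total_gain for g in importance]


X, y = make_synthetic_cohort(600, seed=0)
_, importance = fit_boosted_stumps(X, y)
for name, value in sorted(zip(FEATURES, importance), key=lambda p: -p[1]):
    print(f"{name}: {value:.3f}")
```

Gain-based rankings like this one describe what the fitted model relies on to separate cases from controls. They do not by themselves establish that a factor is causal, and correlated features can share or trade importance.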
Yuewen Qi
Kyriaki Founta
Devin Gee
Cancer Epidemiology Biomarkers & Prevention
Northwell Health
Feinstein Institute for Medical Research
www.synapsesocial.com/papers/68d464f131b076d99fa64322 — DOI: https://doi.org/10.1158/1538-7755.disp25-b002