Objective Extreme and persistent fatigue affects >50% of individuals with inflammatory bowel disease (IBD), with similar prevalence across many common immune-mediated inflammatory diseases (IMIDs). Despite its ubiquity, human scientific studies have yet to fully explain the mechanistic basis of this complex symptom. One fundamental reason is our inability to account for the clinical heterogeneity and multifactorial nature of fatigue. Methods and analysis We present a conceptual machine-learning (ML) framework to dissect fatigue using one of the largest prospectively captured, real-world patient-reported outcome (PRO) datasets on well-being, drawn from three contemporaneous cohorts (2020–present) and totalling 2970 responses from 2290 participants across the UK and internationally, including non-IBD controls with 100 lines of clinical metadata. In parallel, our patient and public involvement group performed a thematic analysis of this PRO dataset, which identified fatigue as a key research priority (www.musicstudy.uk). Results We systematically (1) defined the threshold of fatigue as our primary outcome (≥10/14 fatigue days) in 1604 patients (1151 responses in active disease and 1061 responses in remission; some patients measured longitudinally; median fatigue days 14 vs 7, respectively; p<0.001) to build our ML approach; (2) used routinely available clinical data suitable for population-level analysis; (3) employed seven different ML methods with external validation in three different cohorts in the UK, Spain and Australia (n=252); (4) employed Shapley Additive Explanations (SHAP) analysis to break down clinical heterogeneity and allow the examination of clinical predictive factors at an individual level; and finally, (5) investigated whether there are distinct clusters of fatigue patients.
We found that the ML models performed comparably (area under the curve/C-index ~0.7) on external validation, with SHAP analysis showing interpretable, individualised fatigue drivers. Cluster analysis identified five distinct fatigue cluster groups, including a subgroup with lower fatigue burden. Conclusions Our data provide an ML ‘roadmap’ to predict and deconstruct fatigue in IBD and potentially more widely in IMIDs, enabling patient-level dissection beyond symptom-based classification, with the ability to integrate deep molecular data. This is a step towards future clinical-scientific artificial intelligence models with immediate clinical application: stratifying patients for human experimental studies that can better identify patient-level patterns associated with fatigue. Trial registration number NCT04760964.
Chuah et al. (Sun,) studied this question.
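The primary outcome and the headline metric in the Results can be illustrated with a minimal Python sketch. It assumes only what the abstract states: fatigue is labelled present at ≥10 of 14 fatigue days, and discrimination is summarised by the area under the curve. The function names, the toy fatigue-day counts and the toy model scores are hypothetical and are not taken from the study data or the authors' code.

```python
from typing import List, Sequence

# Threshold taken from the abstract: fatigue on >=10 of the last 14 days.
FATIGUE_THRESHOLD_DAYS = 10


def label_fatigue(fatigue_days: Sequence[int],
                  threshold: int = FATIGUE_THRESHOLD_DAYS) -> List[int]:
    """Binarise fatigue-day counts into the primary outcome (1 = fatigued)."""
    return [1 if days >= threshold else 0 for days in fatigue_days]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen fatigued response is
    scored higher than a randomly chosen non-fatigued one (ties count 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one response in each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: fatigue days out of 14 and made-up model scores.
days = [14, 7, 12, 3, 10, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.8, 0.5]

labels = label_fatigue(days)
auc = roc_auc(labels, scores)
print(labels, round(auc, 3))
```

The pairwise form is quadratic in the number of responses, which is acceptable for a small illustration; at the scale of a few thousand responses a rank-based implementation from a standard ML library would be the more practical choice.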