What question did this study set out to answer?

To explore the clinical heterogeneity of fatigue in individuals with inflammatory bowel disease (IBD) using a machine-learning framework.

February 28, 2026 | Open Access

Machine-learning approach to dissect the clinical heterogeneity of IBD-associated fatigue

Key Points

Explored the clinical heterogeneity of fatigue in individuals with inflammatory bowel disease (IBD) using a machine-learning framework.
Used patient-reported outcome data from three contemporaneous cohorts (2970 responses from 2290 participants, 2020–present).
Defined a fatigue threshold (≥10 of 14 fatigue days) as the primary outcome in 1604 patients.
Employed seven different machine-learning methods for analysis and validation.
Applied SHAP analysis to identify clinical predictive factors at an individual level.
Investigated distinct clusters of fatigue patients.
Machine-learning models performed comparably (area under the curve/C-index ~0.7) on external validation.
Identified five distinct fatigue cluster groups, including a subgroup with lower fatigue burden.
Confirmed significant differences in fatigue days between active disease (median 14) and remission (median 7), p<0.001.
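
The threshold and the area-under-the-curve figure in the points above can be made concrete with a small sketch. It is an illustration only, not the authors' pipeline: the fatigue-day values and model scores are made up, and the helper names `is_fatigued` and `auc` are hypothetical. The only detail taken from the study is the ≥10 of 14 days cut-off.

```python
# Illustrative sketch only: made-up responses, hypothetical helper names.
FATIGUE_DAY_THRESHOLD = 10  # from the study: >=10 of the past 14 days
WINDOW_DAYS = 14


def is_fatigued(fatigue_days: int) -> bool:
    """Binary outcome: True when fatigue days meet the study threshold."""
    if not 0 <= fatigue_days <= WINDOW_DAYS:
        raise ValueError("fatigue_days must be between 0 and 14")
    return fatigue_days >= FATIGUE_DAY_THRESHOLD


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up fatigue-day counts and model scores for eight responses.
fatigue_days = [14, 12, 10, 9, 7, 3, 0, 11]
labels = [is_fatigued(d) for d in fatigue_days]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.8]

print(labels)
print(auc(labels, scores))  # 0.875 for this toy data
```

Read this way, an AUC of about 0.7 means that in roughly 7 of 10 fatigued/non-fatigued pairs the model ranks the fatigued response higher; 0.5 would be chance.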

Abstract

Objective: Extreme and persistent fatigue affects >50% of individuals with inflammatory bowel disease (IBD), with similar prevalence across many common immune-mediated inflammatory diseases (IMIDs). Despite its ubiquity, human scientific studies have yet to fully explain the mechanistic basis of this complex symptom. One fundamental reason is our inability to account for the clinical heterogeneity and multifactorial nature of fatigue.

Methods and analysis: We present a conceptual machine-learning (ML) framework to dissect fatigue using one of the largest prospectively captured, real-world patient-reported outcome (PRO) datasets on well-being, drawn from three contemporaneous cohorts (2020–present) and totalling 2970 responses from 2290 participants across the UK and internationally, including non-IBD controls, with 100 lines of clinical metadata. In parallel, our patient public involvement group performed thematic analysis of this PRO dataset, which identified fatigue as a key research priority (www.musicstudy.uk).

Results: We systematically (1) defined the threshold of fatigue as our primary outcome (≥10/14 fatigue days in 1604 patients; 1151 responses in active disease and 1061 responses in remission; some patients measured longitudinally; median fatigue days 14 vs 7, respectively; p<0.001) to build our ML approach; (2) used routinely available clinical data suitable for population-level analysis; (3) employed seven different ML methods with external validation in three different cohorts in the UK, Spain and Australia (n=252); (4) employed Shapley Additive Explanations (SHAP) analysis to break down clinical heterogeneity and allow the examination of clinical predictive factors at an individual level; and finally, (5) investigated whether there are distinct clusters of fatigue patients. We found that ML models performed comparably (area under the curve/C-index ~0.7) on external validation, with SHAP analysis showing interpretable, individualised fatigue drivers and five distinct fatigue cluster groups, including a subgroup with lower fatigue burden.

Conclusions: Our data provide the ML ‘roadmap’ to predict and deconstruct fatigue in IBD and potentially more widely in IMIDs, enabling patient-level dissection beyond symptom-based classification with the ability to integrate deep molecular data. This is a step towards future clinical-scientific artificial intelligence models with immediate clinical application to stratify patients for human experimental studies to better identify patient-level patterns associated with fatigue.

Trial registration number: NCT04760964.
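
To show what an individual-level attribution looks like, the sketch below computes exact Shapley values for one hypothetical patient under a toy scoring function. Everything in it is an assumption made for illustration: the feature names, the coefficients, the all-zero baseline and the function names are not from the study, and SHAP analyses on real models typically estimate these values against a background dataset rather than enumerating every feature subset.

```python
# Illustrative sketch only: toy model, hypothetical features and coefficients.
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value.
    Enumerates all subsets, so this is only practical for a few features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_model(f):
    """Hypothetical fatigue score with one interaction term (made-up numbers)."""
    return (
        0.2
        + 0.3 * f["active_disease"]
        + 0.2 * f["poor_sleep"]
        + 0.1 * f["active_disease"] * f["poor_sleep"]
    )


patient = {"active_disease": 1, "poor_sleep": 1, "feature_c": 0}
baseline = {"active_disease": 0, "poor_sleep": 0, "feature_c": 0}

phi = shapley_values(toy_model, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that lets a prediction be broken down into per-feature drivers for a single patient.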

Cite This Study

Chuah et al. studied this question.

synapsesocial.com/papers/69a2877b0a974eb0d3c03327 https://doi.org/10.1136/bmjdh-2026-000037
