What type of study is this?

This is a Cohort Study (also classified as: Validation Study).

March 30, 2026 · Open Access

An explainable machine learning model to predict biofilm formation in a wound-care population with a high burden of chronic wounds

Key Points

Aimed to develop and validate an explainable machine learning model for predicting bacterial biofilm formation in clinical wounds
Conducted a multicenter retrospective cohort study at two tertiary hospitals in China
Split 550 eligible patients into a training cohort (n = 385) and a testing cohort (n = 165), with an independent cohort (n = 300) for external validation
Utilized Boruta and LASSO for feature selection, identifying six key predictors
Employed stratified fivefold cross-validation to tune eight algorithms
Assessed model performance using discrimination, calibration, and decision curve analysis
Random forest achieved the best performance with AUC 0.929 in training, 0.861 in testing, and 0.837 in external validation
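SHAP analysis ranked Debridement performed as the most influential predictor of biofilm formation
Chronic wound and Diabetes mellitus showed positive SHAP values, while Debridement performed, Thermal therapy, Negative pressure wound therapy, and Silver dressing used showed predominantly negative SHAP values
The model retained good discrimination in external validation (AUC 0.837), with good calibration and consistent net benefit on decision curves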
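
As a rough illustration of the stratified fivefold cross-validation mentioned above, the following minimal sketch assigns samples to folds so that each fold keeps roughly the same outcome prevalence. It is not the authors' code: the function name `stratified_folds`, the random seed, and the label counts (100 biofilm-positive cases among 385 patients) are assumptions made purely for illustration, since the summary does not report prevalence.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, keeping class balance similar across folds."""
    rng = random.Random(seed)

    # Group sample indices by outcome class.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share of this class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical labels: 385 training patients, 100 of them biofilm-positive.
# These counts are illustrative only and do not come from the study.
labels = [1] * 100 + [0] * 285
folds = stratified_folds(labels, k=5)

for i, fold in enumerate(folds):
    positives = sum(labels[j] for j in fold)
    print(f"fold {i}: n = {len(fold)}, positives = {positives}")
```

In a tuning loop, each fold would serve once as the held-out set while a candidate model is fitted on the other four, and the hyperparameters with the best average held-out score would be kept.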

Abstract

This study aimed to develop and validate an explainable machine learning model for predicting bacterial biofilm formation in clinical wounds. We conducted a multicenter retrospective cohort study at two tertiary hospitals in China. From 550 eligible patients, a training cohort (n = 385) and a testing cohort (n = 165) were created, with an independent cohort (n = 300) for external validation. Predictors available at the index encounter were prespecified, and missing values were imputed. Feature selection combined Boruta and LASSO, yielding six variables: Debridement performed, Chronic wound, Thermal therapy, Negative pressure wound therapy, Diabetes mellitus, and Silver dressing used. Eight algorithms were tuned with stratified fivefold cross-validation. Discrimination, calibration, and decision curve analysis were assessed. SHAP (SHapley Additive exPlanations) was used for model interpretation. Random forest achieved the best performance with AUC 0.929 in training, 0.861 in testing, and 0.837 in external validation, with good calibration and consistent net benefit on decision curves. SHAP ranked Debridement performed as most influential. Debridement performed, Thermal therapy, Negative pressure wound therapy, and Silver dressing used showed predominantly negative SHAP values, indicating inverse associations with biofilm formation. Chronic wound and Diabetes mellitus showed positive SHAP values. An interpretable random forest model with six routinely collected predictors accurately estimated biofilm formation in clinical wounds. The model’s strong external performance and biological plausibility suggest potential utility for early risk stratification and tailored wound management, warranting prospective validation in diverse clinical environments.
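
To make the "net benefit on decision curves" concrete, here is a minimal sketch of the standard net benefit calculation used in decision curve analysis: true positives per patient minus false positives per patient, weighted by the odds of the threshold probability. The function names and the toy outcomes and probabilities are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is at or above the threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): 10 patients, 3 with biofilm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.6, 0.2, 0.2, 0.1, 0.1, 0.1, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = treat_all_net_benefit(y_true, threshold=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A decision curve repeats this calculation over a range of thresholds. A model is considered useful at a given threshold when its net benefit exceeds both the treat-all and treat-none strategies.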

Cite This Study

Jiang et al. studied this question.

synapsesocial.com/papers/69c9c553f8fdd13afe0bd2bc https://doi.org/10.1186/s40001-026-04300-4
