What question did this study set out to answer?

The aim is to develop a monitoring system for daily recovery classification in community stroke survivors using machine learning.

April 13, 2026 | Open Access

Daily Machine Learning-Based Recovery Classification for Community Stroke Survivors: Development of a Three-Classifier Monitoring System

Key Points

The aim is to develop a monitoring system for daily recovery classification in community stroke survivors using machine learning.
Generated a synthetic dataset with 2,000 records and 16 clinical features.
Trained three supervised classifiers: Logistic Regression, Random Forest, and LightGBM.
Selected the best model using macro-average ROC-AUC metrics.
Developed a rule-based recommendation engine on top of the best-performing classifier.
Deployed the system as a publicly accessible Streamlit web application.
LightGBM achieved the highest accuracy of 92.1% and ROC-AUC of 0.991.
Random Forest and Logistic Regression had accuracies of 91.3% and 90.2%, respectively.
Days post-stroke, mobility score, and exercise completion were identified as key influential variables.
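
The model-selection step in the key points above (choosing the best of three classifiers by macro-average ROC-AUC) can be illustrated with a minimal sketch. It assumes a one-vs-rest formulation and takes already-computed class probabilities as input; the function names `binary_auc`, `macro_ovr_auc`, and `select_best` and the toy inputs are hypothetical and are not taken from the study's code.

```python
from typing import Dict, List, Sequence, Tuple


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a
    random negative, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    # Pairwise comparison is O(P*N); adequate for a small illustrative set.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[int],
                  proba: Sequence[Sequence[float]],
                  n_classes: int) -> float:
    """Unweighted mean of the one-vs-rest AUCs over all classes."""
    aucs: List[float] = []
    for k in range(n_classes):
        labels = [1 if y == k else 0 for y in y_true]
        scores = [row[k] for row in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes


def select_best(y_true: Sequence[int],
                candidates: Dict[str, Sequence[Sequence[float]]],
                n_classes: int) -> Tuple[str, Dict[str, float]]:
    """Return the candidate name with the highest macro-average AUC,
    together with every candidate's score."""
    scored = {name: macro_ovr_auc(y_true, proba, n_classes)
              for name, proba in candidates.items()}
    best = max(scored, key=lambda name: scored[name])
    return best, scored


# Toy held-out labels and predicted probabilities (illustrative only).
y_test = [0, 1, 2, 0, 1, 2]
candidates = {
    "model_a": [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8],
                [0.7, 0.2, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]],
    "model_b": [[1 / 3, 1 / 3, 1 / 3]] * 6,  # uninformative predictions
}
best_name, scores = select_best(y_test, candidates, n_classes=3)
```

In practice the probabilities would come from each fitted classifier's predictions on the held-out 20% partition. Macro averaging weights the three outcome classes equally regardless of how many records each contains.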

Abstract

Background: The majority of stroke rehabilitation occurs in the community without objective daily monitoring. Patients and families lack accessible tools to identify recovery plateaus or deterioration between clinical appointments.

Methods: A synthetic dataset of 2,000 records (16 clinical features, 3 outcome classes) was generated using published stroke rehabilitation variable distributions. Three supervised classifiers (Logistic Regression, Random Forest, and LightGBM) were trained on a stratified 80/20 partition. Model selection was automated by macro-average ROC-AUC. A rule-based recommendation engine was layered on top of the best classifier. The system was deployed as a Streamlit web application.

Results: LightGBM achieved the highest performance (accuracy 92.1%, ROC-AUC 0.991), followed by Random Forest (91.3%, 0.982) and Logistic Regression (90.2%, 0.970). Feature importance analysis identified days post-stroke, mobility score, and exercise completion as the three most influential variables. The deployed application is publicly accessible at https://stroketracker.streamlit.app/.

Conclusion: A three-classifier approach with automated model selection achieves robust classification performance on synthetic data for daily stroke recovery monitoring using self-reported variables. The system addresses a documented gap in community rehabilitation monitoring. Clinical validation on real patient data under ethical approval is required before clinical translation.
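
The abstract does not specify the rules used by the recommendation engine, so the following is only a sketch of how a rule layer can sit on top of a classifier's output. Everything in it is an assumption made for illustration: the class labels (`improving`, `plateau`, `deteriorating`), the `DailyEntry` fields, the 0.5 threshold, and the message texts are hypothetical and do not reproduce the deployed application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DailyEntry:
    """Hypothetical subset of the self-reported daily variables."""
    days_post_stroke: int
    mobility_score: float        # assumed scale, higher is better
    exercise_completion: float   # assumed fraction in [0, 1]


def recommend(predicted_class: str, entry: DailyEntry) -> List[str]:
    """Map a predicted recovery class plus the day's entry to messages.

    The class names, threshold, and wording are illustrative assumptions.
    """
    advice: List[str] = []

    # Rule 1: one primary message per predicted class.
    if predicted_class == "deteriorating":
        advice.append("Contact the rehabilitation team before the next "
                      "scheduled appointment.")
    elif predicted_class == "plateau":
        advice.append("Raise the exercise plan for review at the next "
                      "appointment.")
    elif predicted_class == "improving":
        advice.append("Continue the current programme.")
    else:
        raise ValueError(f"unknown class: {predicted_class!r}")

    # Rule 2: a feature-level rule that applies regardless of class.
    if entry.exercise_completion < 0.5:  # hypothetical threshold
        advice.append("Fewer than half of today's exercises were "
                      "completed.")

    return advice


entry = DailyEntry(days_post_stroke=45, mobility_score=4.0,
                   exercise_completion=0.3)
messages = recommend("plateau", entry)
```

Keeping the rules in a separate function from the classifier means they can be reviewed and changed by clinicians without retraining the model.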
