Abstract: Existing reinforcement learning (RL) approaches struggle to balance real-time decision-making with adaptive learning in dynamic healthcare environments. We propose a brain-inspired hybrid RL framework that integrates model-based (MB) planning and model-free (MF) reflexes via a dynamic meta-controller, neuro-symbolic clinical knowledge, counterfactual reasoning, and ethical safeguards. The framework is validated on a multimodal cerebral palsy (CP) dataset (86 patients) using NetLogo multi-agent simulations and Weka classifiers. A combined reward mechanism achieves 99% total reward accumulation, with 98% optimal reward in 95% of training episodes. Component analysis shows a 60% MB / 40% MF contribution, yielding a 15% improvement over standalone methods. Optimal weighting (0.7 MB, 0.3 MF) further enhances performance. External zero-shot validation on three public datasets (NTNU-HARChildren, EEG-EMG exoskeleton, D4RL) confirms generalizability (macro F1 84.3%, accuracy 81.7%, D4RL scores 68.5 and 62.3). Regression methods achieve correlation coefficients up to 0.94, and classification models (multinomial Naïve Bayes, logistic regression) attain 100% precision, recall, and F-measure. The framework provides a reliable, explainable, and simulation-validated solution for patient-centric autonomous decision-making.
Abdullah et al. (Mon,) studied this question.
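To make the MB/MF weighting in the abstract concrete, the sketch below blends two sets of action values with a fixed convex weight and picks the highest-scoring action. It is an illustration under stated assumptions, not the authors' implementation: the abstract describes a dynamic meta-controller whose arbitration rule is not given here, so a fixed weight is a simplification. The function names `blend_action_values` and `select_action`, and the example action values, are hypothetical. Only the 0.7 / 0.3 weighting comes from the abstract.

```python
from typing import List, Sequence


def blend_action_values(
    q_mb: Sequence[float],
    q_mf: Sequence[float],
    w_mb: float = 0.7,
) -> List[float]:
    """Convex combination of model-based and model-free action values.

    Assumption: both estimators score the same ordered set of actions,
    and the MF weight is simply 1 - w_mb (0.3 for the default of 0.7).
    """
    if not 0.0 <= w_mb <= 1.0:
        raise ValueError("w_mb must lie in [0, 1]")
    if len(q_mb) != len(q_mf):
        raise ValueError("q_mb and q_mf must score the same actions")
    w_mf = 1.0 - w_mb
    return [w_mb * mb + w_mf * mf for mb, mf in zip(q_mb, q_mf)]


def select_action(
    q_mb: Sequence[float],
    q_mf: Sequence[float],
    w_mb: float = 0.7,
) -> int:
    """Return the index of the action with the highest blended value."""
    blended = blend_action_values(q_mb, q_mf, w_mb)
    return max(range(len(blended)), key=blended.__getitem__)


# Hypothetical values: the planner prefers action 0, the reflex prefers action 1.
q_model_based = [1.0, 0.0]
q_model_free = [0.0, 2.0]

mb_heavy_choice = select_action(q_model_based, q_model_free, w_mb=0.7)
mf_heavy_choice = select_action(q_model_based, q_model_free, w_mb=0.3)
```

With these illustrative inputs, the MB-heavy weighting follows the planner's preferred action, while reversing the weights lets the model-free estimate win. A dynamic meta-controller would presumably adjust `w_mb` per state or per episode rather than hold it constant.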