Lower-limb exoskeleton systems require robust gait phase recognition to deliver adaptive, timely assistance to individuals with mobility impairments. Although previous approaches based on cross-modal metric learning (ACMML) have shown promise, they struggle to model complex temporal dependencies across heterogeneous sensor modalities and to meet real-time computational constraints. This paper proposes a Hierarchical Transformer-based Adaptive Cross-Modal Gait Recognition (HT-ACMGR) framework that advances the state of the art through three innovations: (1) a hierarchical transformer encoder that captures multi-scale temporal patterns across surface electromyography (sEMG), inertial measurement unit (IMU), and force sensor signals; (2) an adaptive cross-attention mechanism that dynamically learns feature alignments between modalities in response to user-specific gait variations; and (3) a physics-informed regularization layer that incorporates biomechanical constraints to improve generalization. Experiments on a diverse subject cohort (N = 45) across multiple locomotion tasks show that HT-ACMGR achieves 97.2% classification accuracy with latency below 50 ms, a 2.1 percentage-point absolute improvement over prior ACMML approaches, and that it outperforms state-of-the-art CNN-LSTM and GRU-based methods. The framework is also more robust to sensor noise and adapts to gait pattern changes in real time. These advances support the deployment of intelligent exoskeletons in controlled and semi-unstructured clinical and community environments with reliable human–robot interaction.
Farzana et al. (Mon,) studied this question.
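To make the cross-attention idea in innovation (2) concrete, the following is a minimal sketch of scaled dot-product cross-attention between two sensor streams. It is not the HT-ACMGR implementation: the function names, the toy feature values, and the choice of IMU frames as queries and sEMG frames as keys and values are all assumptions made for illustration, and the learned projection matrices, multiple heads, and adaptive components that a real model would use are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (no learned projections).

    queries: feature vectors from one modality (assumed here: IMU frames).
    keys, values: feature vectors from another modality (assumed: sEMG frames).
    Returns one attended vector per query and the attention weights.
    """
    scale = math.sqrt(len(keys[0]))
    n_dims = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query frame to every key frame.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        attended = [sum(wj * v[d] for wj, v in zip(w, values)) for d in range(n_dims)]
        outputs.append(attended)
        weights.append(w)
    return outputs, weights


# Toy, made-up features: 2 IMU frames attend over 3 sEMG frames.
imu = [[1.0, 0.0], [0.0, 1.0]]
emg = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = cross_attention(imu, emg, emg)
```

Each row of `attn` sums to one and indicates how strongly a frame of the querying modality draws on each frame of the other modality. In a transformer, the queries, keys, and values would first pass through learned linear projections, which is where user-specific alignment could be learned.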