WiVi-UF: Unified feature learning in cross-modal transformers with WiFi and vision data fusion for enhanced human activity recognition | Synapse