Communication barriers remain a daily challenge for hearing- and speech-impaired individuals, and most solutions are limited to one-way translation, fixed vocabularies, or cloud-dependent operation. Existing mobile apps and desktop tools often provide only unidirectional support (e.g., speech→text), rely on cloud connectivity that raises latency and privacy concerns, or use non-portable hardware that limits adoption. The AI Smart Badge is a portable, on-device AI system that enables real-time, bidirectional communication between signers and non-signers. It supports both directions (sign→text with synthesized voice, and speech→text with sign rendering), with multilingual capability validated in Arabic and English. A defining contribution is a customizable sign-language layer: users and caregivers can create, label, and update personalized gestures to match dialects or individual abilities, yielding a living vocabulary rather than a fixed set. The vision pipeline uses camera-based hand-landmark extraction and a lightweight neural classifier for gesture recognition; the audio pipeline combines speech-to-text with multilingual text-to-speech to produce natural voice output. All inference runs locally on a Raspberry Pi 5 for low latency and offline operation. In prototype evaluations, the gesture recognizer achieved an average accuracy of 92% across five gesture classes (n=250 tests), speech recognition exhibited word-error rates from 5% to 25% across 40–70 dB ambient noise, and end-to-end interaction latency remained under one second. We detail engineering trade-offs (model size versus latency, audio front-end robustness, and variability) and mitigations (input normalization, temporal smoothing, and on-device caching). The results demonstrate that multilingual, customizable sign-to-text/voice and speech-to-sign translation is feasible on commodity embedded hardware, with paths to scaling vocabulary, languages, energy efficiency, and usability in diverse real-world settings.
Tubaishat et al. (Sun,) studied this question.
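As an illustration of two mitigations named in the abstract, input normalization and temporal smoothing, the following minimal sketch shows one way they could be applied to hand landmarks and per-frame gesture labels. It is not the authors' implementation: the function and class names, the choice of the first landmark as the reference point, the 2D coordinates, and the majority-vote window of five frames are all assumptions made for illustration.

```python
import math
from collections import Counter, deque


def normalize_landmarks(points):
    """Make 2D landmarks translation- and scale-invariant.

    Assumption: points[0] is a stable reference landmark (e.g., the wrist).
    """
    x0, y0 = points[0]
    # Shift so the reference landmark sits at the origin.
    shifted = [(x - x0, y - y0) for x, y in points]
    # Scale by the largest distance from the origin; fall back to 1.0
    # if all points coincide, to avoid dividing by zero.
    scale = max(math.hypot(x, y) for x, y in shifted) or 1.0
    return [(x / scale, y / scale) for x, y in shifted]


class MajoritySmoother:
    """Smooth per-frame labels with a sliding-window majority vote."""

    def __init__(self, window=5):  # window size is a hypothetical default
        self.history = deque(maxlen=window)

    def update(self, label):
        # Record the newest prediction and return the most frequent
        # label among the last `window` predictions.
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]
```

In use, each frame's landmarks would pass through `normalize_landmarks` before classification, and each predicted label through `MajoritySmoother.update`, so that a single misclassified frame does not change the displayed gesture. A larger window suppresses more flicker at the cost of added response delay.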