What type of study is this?

This is an experimental study.

October 3, 2025 · Open Access

MultiStream-LLM: Bridging Modalities for Robust Sign Language Translation

Key Points

MultiStream-LLM improves sign language translation accuracy by using separate, specialized predictors for continuous signing, fingerspelling, and lipreading.
The framework achieves a BLEU-4 score of 23.5 on the How2Sign benchmark and 73.2% letter accuracy on the ChicagoFSWildPlus fingerspelling dataset.
This modular approach employs multiple expert networks to decode various modalities before merging them for translation.
By separating recognition tasks, it establishes a more effective method for translating complex sign language elements.

Abstract

Despite progress in gloss-free Sign Language Translation (SLT), monolithic end-to-end models consistently fail on two critical components of natural signing: the precise recognition of high-speed fingerspelling and the integration of asynchronous non-manual cues from the face. Recent work on automated sign language translation with Large Language Models has sidestepped this challenge, forcing a single network to learn these components simultaneously, which results in poor performance when translating crucial information such as names, places, and technical terms. We introduce MultiStream-LLM, a modular framework designed to overcome these limitations. Our approach employs separate, specialized predictors for continuous signing, fingerspelling, and lipreading. Each expert network first decodes its specific modality into a sequence of tokens. These parallel streams are then fused by a lightweight transformer that resolves temporal misalignments before passing the combined representation to a Large Language Model (LLM) for final sentence generation. Our method establishes a new state of the art on the How2Sign benchmark with a BLEU-4 score of 23.5 and achieves 73.2% letter accuracy on the challenging ChicagoFSWildPlus fingerspelling dataset. These results validate our core hypothesis: by isolating and solving distinct recognition tasks before fusion, our multi-expert approach provides a more powerful and effective pathway to robust, high-fidelity sign language translation.
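The data flow described in the abstract (per-modality token streams, a fusion step, then an LLM) can be illustrated with a toy sketch. This is not the authors' implementation. The paper fuses streams with a lightweight transformer; the sketch below substitutes a plain sort by start time so it stays self-contained. The names StreamToken, merge_streams, and build_prompt, the stream tags, the timestamps, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StreamToken:
    """One decoded token from a hypothetical expert predictor."""
    stream: str   # assumed tag: "sign", "fingerspell", or "lip"
    start: float  # assumed start time in seconds
    text: str


def merge_streams(*streams: List[StreamToken]) -> List[StreamToken]:
    """Interleave tokens from all streams by start time.

    Stand-in for the learned fusion module: sorted() is stable, so
    tokens with equal start times keep their input order.
    """
    flat = [tok for stream in streams for tok in stream]
    return sorted(flat, key=lambda tok: tok.start)


def build_prompt(tokens: List[StreamToken]) -> str:
    """Serialize the fused sequence into a text prompt (assumed format)."""
    tagged = " ".join(f"<{tok.stream}>{tok.text}" for tok in tokens)
    return f"Translate to English: {tagged}"


# Made-up example streams, one per expert.
signing = [StreamToken("sign", 0.0, "MY"), StreamToken("sign", 0.4, "NAME")]
fingerspelling = [StreamToken("fingerspell", 0.9, "A-N-N-A")]
lips = [StreamToken("lip", 1.0, "anna")]

merged = merge_streams(signing, fingerspelling, lips)
prompt = build_prompt(merged)
print(prompt)
```

The sort only orders tokens that already carry timestamps. Resolving real temporal misalignment between modalities is the job the abstract assigns to the fusion transformer, which this sketch does not attempt.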

Cite this study

Thomas et al. studied this question.

synapsesocial.com/papers/68e02f40f0e39f13e7fa28e0 — DOI: https://doi.org/10.48550/arxiv.2509.00030

Authors

Marshall Thomas

Boston Children's Hospital

Edward Fish

University of Surrey

Richard Bowden

University of East Anglia
