March 14, 2026 · Open Access

Sign Language Segmentation Models and Their Impact on Automatic Translation Systems

Abstract

This work investigates automatic segmentation of Brazilian Sign Language videos for translation systems, addressing challenges of the visual-spatial modality of signed languages. We introduce the JW-Bible-Libras dataset, the largest resource for this task, and evaluate two segmentation approaches: Optical Flow-based models and Spatio-Temporal Graph Convolutional Networks (ST-GCN). Segmentation performance is analyzed both intrinsically and in relation to downstream translation using the gloss-free Sign2GPT architecture. Results show that the nine-layer ST-GCN with bidirectional LSTM achieves the best segmentation results (F1: 0.7358, IoU: 0.5820), while the unidirectional variant yields the strongest translation scores (BLEU1: 9.31, ROUGE: 9.49). Notably, a simple heuristic based on average sentence duration performs competitively, highlighting the gap between segmentation accuracy and translation quality. Our findings demonstrate the importance of segmentation strategies while revealing opportunities for integrating linguistic cues and boundary-aware learning to advance sign language translation.
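The duration-based heuristic and the overlap metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption made for illustration: segments are half-open frame intervals, the heuristic cuts the video into equal chunks of a given average length, and the score is the mean best-match temporal IoU per reference segment. The function names are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# A segment is a half-open frame interval: (start_frame, end_frame).
Segment = Tuple[int, int]


def fixed_duration_segments(total_frames: int, avg_len: int) -> List[Segment]:
    """Cut [0, total_frames) into consecutive chunks of avg_len frames.

    Assumed stand-in for a heuristic based on average sentence duration;
    the last chunk is shorter when total_frames is not a multiple of avg_len.
    """
    if total_frames < 0 or avg_len <= 0:
        raise ValueError("total_frames must be >= 0 and avg_len must be > 0")
    return [
        (start, min(start + avg_len, total_frames))
        for start in range(0, total_frames, avg_len)
    ]


def interval_iou(a: Segment, b: Segment) -> float:
    """Temporal intersection-over-union of two half-open intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def mean_best_iou(pred: List[Segment], gold: List[Segment]) -> float:
    """Average, over reference segments, of the best IoU with any prediction.

    This is one common way to score segmentations, not necessarily the
    definition behind the IoU figure reported in the abstract.
    """
    if not gold:
        return 0.0
    best = [max((interval_iou(g, p) for p in pred), default=0.0) for g in gold]
    return sum(best) / len(best)


if __name__ == "__main__":
    predicted = fixed_duration_segments(total_frames=10, avg_len=4)
    reference = [(0, 5), (5, 10)]
    print(predicted)
    print(mean_best_iou(predicted, reference))
```

A heuristic of this kind needs no training, which is why it serves as a useful baseline when comparing learned segmenters by downstream translation quality.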

Cite This Study

Ramos et al. studied this question.

synapsesocial.com/papers/69b4b9db18185d8a39801e95 https://doi.org/10.22456/2175-2745.150939