What question did this study set out to answer?

June 14, 2026

Multimodal Sign Language Generation Using Deep Learning and Hybrid Optimization-Based Framework

Key Points

This work aims to enhance multimodal sign language generation using deep learning and hybrid optimization techniques.
Developed a framework integrating audio, text, and video data for sign language generation.
Utilized a combination of Recurrent Neural Networks, ResNet50, BERT, and hybrid optimization for feature extraction and selection.
Implemented an attention-based deep learning model using LSTM, CNN, and transformer architectures for sign generation.
Achieved an accuracy of 0.998817, indicating high model performance.
Reported a specificity of 0.998799 and sensitivity of 0.990125, highlighting effective sign language detection.
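
The accuracy, specificity and sensitivity figures above are standard confusion-matrix ratios. As a minimal sketch of how such values are computed, the function below takes the four confusion-matrix counts; the function name and the example counts are illustrative assumptions, not data from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. The counts used below are made up for illustration.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total,
        # True positive rate: positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # True negative rate: negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters because accuracy alone can look high on imbalanced data even when one class is poorly recognized.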

Abstract

Sign language is a diverse language that conveys messages through hand signs, facial expressions and body movements. Online content is abundant, yet little of it is accessible to people with hearing impairment. This work proposes a new approach to multimodal Sign Language Generation (SLG) that combines hybrid optimization methods with Deep Learning (DL) algorithms. The proposed framework improves the efficiency of SLG by processing and fusing data from audio, text and video. The methodology consists of six phases: preprocessing, feature extraction, feature selection with hybrid optimization, multimodal fusion, DL modeling and SLG. Temporal features are extracted from audio using a Recurrent Neural Network (RNN), spatial features from video using ResNet50 and textual features from text using BERT. Feature selection is then performed with the Hybrid Grey Wolf and Fruit Fly Optimization (HGWFFO) technique to choose the best features. The framework integrates an attention-based DL model that uses Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN) and transformer models for feature fusion and sign generation. SLG runs in a loop, with feedback collected in real time during the calibration process. The framework is intended to advance the state of the art in SLG by offering an effective approach with better performance and readability. The developed model achieved an accuracy of 0.998817, a specificity of 0.998799 and a sensitivity of 0.990125.
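
The abstract does not specify how the attention-based fusion of audio, text and video features is computed. As one possible illustration, the sketch below applies scaled dot-product attention over three modality vectors that are assumed to be already projected to a common dimension. The function names, the `query` vector (which would be learned in a real model) and the toy feature values are all hypothetical, and this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse per-modality vectors with scaled dot-product attention.

    features: dict mapping modality name -> vector (all the same length).
    query:    vector of that length; a stand-in for a learned parameter.
    Returns the fused vector and the attention weight per modality.
    """
    names = list(features)
    dim = len(query)
    # Score each modality by its similarity to the query.
    scores = [
        sum(q * x for q, x in zip(query, features[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality vectors, one component at a time.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy embeddings standing in for RNN (audio), BERT (text) and ResNet50 (video) outputs.
toy_features = {
    "audio": [0.2, 0.1, 0.4],
    "text": [0.9, 0.3, 0.1],
    "video": [0.5, 0.8, 0.6],
}
fused, weights = attention_fuse(toy_features, query=[1.0, 0.0, 0.0])
print(fused, weights)
```

With an all-zero query every modality receives the same weight, so the fused vector reduces to the plain average of the three inputs; a non-zero query shifts weight toward the modalities that align with it.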
