March 14, 2024

Real-Time Speech to Sign Language Translation Using Machine and Deep Learning

Abstract

This work critically analyzes current technology for speech-to-sign-language translation. First, audio processing techniques such as Mel-frequency cepstral coefficients (MFCCs), the fast Fourier transform (FFT), and the discrete cosine transform (DCT) are used to extract key features from the input speech signal. Combining convolutional neural networks (CNNs) with recurrent neural networks (RNNs) yields a feature extraction method that captures both spatial and temporal patterns. The extracted features are then fed into a Text-to-Sign (TTS) module, which uses a Convolutional Long Short-Term Memory (ConvLSTM) network to generate the corresponding sign language sequence. The generated sequence is animated using motion graphics (MG), which draw on previously recorded motion capture data to create realistic and expressive sign language motions. By balancing usability against the size of the required motion capture database, motion graphics are a suitable choice for real-time translation applications built on Long Short-Term Memory (LSTM) networks.
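
The feature-extraction stage named in the abstract (FFT, then a Mel filterbank, then a DCT to obtain MFCCs) can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' implementation: the function names, the 8 kHz sample rate, the 256-sample frame, the 128-sample hop, the 20 Mel filters, and the 13 coefficients are all assumptions chosen for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(x, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n_out)]

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs for one frame: Hamming window -> FFT -> Mel energies -> log -> DCT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = fft(windowed)[: n // 2 + 1]
    power = [abs(c) ** 2 / n for c in spectrum]
    bank = mel_filterbank(n_filters, n, sample_rate)
    energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
    log_energies = [math.log(e + 1e-10) for e in energies]  # offset avoids log(0)
    return dct2(log_energies, n_coeffs)

def mfcc(signal, sample_rate, frame_len=256, hop=128):
    """Slide a window over the signal and return one MFCC vector per frame."""
    return [mfcc_frame(signal[s:s + frame_len], sample_rate)
            for s in range(0, len(signal) - frame_len + 1, hop)]

# Hypothetical input: 1024 samples of a 440 Hz tone at an assumed 8 kHz rate.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(1024)]
features = mfcc(tone, sr)
print(len(features), len(features[0]))  # 7 frames, 13 coefficients each
```

In a pipeline like the one described, per-frame vectors of this kind would form the time-ordered input to the CNN and RNN stages. A production system would more likely rely on an optimized audio library than on a hand-written transform.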

Cite This Study

Rai et al. studied this question.

synapsesocial.com/papers/68e74210b6db6435876bbc97 https://doi.org/10.1109/icrito61523.2024.10522437