Abstract Dysarthria is a complex motor speech disorder whose diagnosis and severity classification are extremely challenging, which in turn hinders the choice of suitable therapy and intervention strategies. This paper presents a deep learning-based method, built on the TORGO dataset, to overcome these challenges. The problem addressed is the difficulty of accurately detecting dysarthria and assessing its severity with traditional methods, which usually lack precision and efficiency. The proposed method combines advanced acoustic feature extraction techniques, such as Mel-frequency cepstral coefficients (MFCC) and spectrogram analysis, with state-of-the-art neural networks and hybrid architectures: convolutional neural networks (CNNs), long short-term memory (LSTM) combined with CNN, and gated recurrent units (GRU) combined with CNN. It offers an extensive framework for assessing the degree of dysarthria and uses short-time Fourier transform (STFT) images obtained from the dataset for severity classification. The proposed CNN model obtained an accuracy of 98.2% using Mel-spectrograms for detecting dysarthria, and the hybrid CNN-GRU model reached an accuracy of 97% using STFT images for classifying dysarthria by severity. The work also highlights the ability of the proposed deep learning models to automate the dysarthria diagnosis process and to support therapy approaches tailored to the degree of severity.
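To make the STFT feature step concrete, the following is a minimal NumPy-only sketch of how a log-magnitude STFT image might be computed from a waveform. It is not the authors' pipeline: the function name `stft_log_magnitude`, the 16 kHz sample rate, the frame length, hop size, and FFT size are all assumptions chosen for illustration, and a synthetic tone stands in for a speech recording.

```python
import numpy as np


def stft_log_magnitude(signal, frame_len=400, hop=160, n_fft=512):
    """Return a log-magnitude STFT of shape (n_frames, n_fft // 2 + 1).

    frame_len, hop, and n_fft are illustrative defaults, not values
    taken from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Pad very short inputs so at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    # Index matrix: each row selects one overlapping frame.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    # Apply a Hann window to each frame to reduce spectral leakage.
    frames = signal[idx] * np.hanning(frame_len)
    # Real FFT per frame, zero-padded to n_fft points.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    # log1p compresses the dynamic range before use as an image.
    return np.log1p(np.abs(spectrum))


# Synthetic 1 kHz tone standing in for a speech recording (assumed 16 kHz).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_log_magnitude(tone)
```

The resulting 2-D array has one row per time frame and one column per frequency bin, and could be rendered as an image and passed to a CNN-style classifier of the kind the abstract describes.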
Venugopal Koikal Varma
Arun Jana
Arijit Samal
Varma et al. (Tue,) studied this question.
www.synapsesocial.com/papers/689521de9f4f1c896c427e1e — DOI: https://doi.org/10.1007/s42452-025-07260-2