November 30, 2024 | Open Access

Exploring the Application of the Whisper Model in Automatic Aphasia Speech Evaluation

Key Points

The fine-tuned Whisper model reached 70.76% accuracy in evaluating aphasic speech, up from 56.89% in the first trial.
Fine-tuning the Whisper model on a sample of the AphasiaBank dataset enhanced its prediction capabilities.
The second trial restructured the model's predictions with prompt and correction tokens to improve the accuracy of automatic aphasia evaluation.
These findings point to AI's potential in making speech evaluations more accessible amidst a shortage of speech-language pathologists.

Abstract

Aphasia is a communication and speech disorder that impairs an individual's ability to express themselves through writing and speech in everyday life. This paper explores the potential of automatic speech evaluation models, such as the Whisper model, to assess aphasia and potentially other speech impairments. Though effective, traditional methods for aphasia assessment are time-consuming and require specialized clinical expertise. To address these challenges, the study fine-tunes the Whisper model on the AphasiaBank dataset to create a more efficient and accessible evaluation tool. The first fine-tuning trial focused on the phonemic transcript generation part of the Whisper model and achieved an accuracy of only 56.89%. Minor token prediction errors and word omissions were the main reasons for the low prediction accuracy. The second trial focused on the model’s prediction structure, included prompt and correction tokens, and improved accuracy to 70.76%. This indicates that contextual information and correctness tokens can significantly enhance the model’s performance. Because only a sample of the AphasiaBank dataset was available for this paper, further research should train the model on the entire dataset. The results show that AI models such as the Whisper model have potential as alternative tools for aphasia testing and evaluation. Deployed as an app or tool, such a model could help offset the scarcity of speech-language pathologists (SLPs) and make assessment more accessible to individuals struggling with communication disorders.
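To make the second trial's idea more concrete, the following is a minimal sketch of how a prediction target carrying a prompt and a correctness marker might be serialized and scored. It is an illustration under stated assumptions, not the paper's implementation: the marker strings, the `Item` fields, and the helper names `build_target`, `parse_correctness`, and `accuracy` are all hypothetical, and the abstract does not specify how its accuracy figures were computed.

```python
from typing import List, NamedTuple, Optional

# Hypothetical marker strings; the paper's actual token vocabulary is not stated.
PROMPT_TOKEN = "<|prompt|>"
CORRECT_TOKEN = "<|correct|>"
INCORRECT_TOKEN = "<|incorrect|>"


class Item(NamedTuple):
    """One assessment item (field names are assumptions for illustration)."""
    prompt: str        # word the speaker was asked to produce
    response: str      # transcript of what the speaker said
    is_correct: bool   # reference judgement of the response


def build_target(item: Item) -> str:
    """Serialize an item as a text target: prompt, correctness marker, response."""
    marker = CORRECT_TOKEN if item.is_correct else INCORRECT_TOKEN
    return f"{PROMPT_TOKEN}{item.prompt}{marker}{item.response}"


def parse_correctness(decoded: str) -> Optional[bool]:
    """Recover the correctness label from decoded output, or None if no marker."""
    if CORRECT_TOKEN in decoded:
        return True
    if INCORRECT_TOKEN in decoded:
        return False
    return None


def accuracy(predictions: List[str], references: List[Item]) -> float:
    """Fraction of items whose predicted correctness label matches the reference."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    hits = sum(
        parse_correctness(pred) == ref.is_correct
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)
```

In this sketch a prediction with a missing marker counts as wrong, which is one simple way that the token omissions mentioned in the abstract could lower a score of this kind.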
