May 12, 2024 | Open Access

A Non-Intrusive Approach to Assessing Dysarthria Severity: Advancing Clinical Diagnosis

Ganjun Liu (Tianjin University), Xiaohui Hou (World Bank), Ge Meng (Tianjin University)

Abstract

AI-driven severity assessment techniques for dysarthria show promise in aiding speech-language pathologists with diagnosis and therapeutic follow-up. Existing solutions generally focus on the average intelligibility and hoarseness of an individual speaker's speech (i.e., speaker-level classification). This potentially ignores the slight variations in pronunciation attributable to the speaker's dysarthria, e.g., between /t/ and /d/. To address this issue, we rethink the inherent differences in dysarthric speech and propose a non-intrusive severity assessment approach called DysarNet. Specifically, we first design a prosodic emphasis module based on frame-level speech features to highlight fine-grained temporal changes, including pronunciation content, rhythm, and timing. Second, we design a multi-scale aggregation strategy to collect statistical cues on articulatory information at different scales, i.e., frame-level and utterance-level. In this way, multi-scale prosodic and articulatory cues directly assist the prediction network in assessing dysarthria severity from multiple views, and naturally yield speaker-independent generalization. Experimental results on the VCC 2018 and TORGO datasets show that DysarNet excels in assessing dysarthria severity.
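
The abstract does not specify how the multi-scale aggregation is computed. As a rough illustration of the general idea of collecting statistics at more than one time scale, the following minimal sketch pools a matrix of frame-level features into utterance-level mean and standard deviation, plus an averaged short-window standard deviation. The function name `multiscale_stats`, the `window` parameter, and the choice of statistics are assumptions made for this example and are not taken from DysarNet.

```python
import numpy as np

def multiscale_stats(frames, window=10):
    """Pool (T, D) frame-level features into one vector of length 3 * D.

    Hypothetical illustration only; not the aggregation used in DysarNet.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (T, D) array")
    if window < 1:
        raise ValueError("window must be at least 1")

    # Utterance-level cues: statistics over all frames.
    utt_mean = frames.mean(axis=0)
    utt_std = frames.std(axis=0)

    # Short-window cues: standard deviation inside each non-overlapping
    # chunk of up to `window` frames, averaged over chunks.
    starts = range(0, frames.shape[0], window)
    local_std = np.stack([frames[s:s + window].std(axis=0) for s in starts])
    local_mean_std = local_std.mean(axis=0)

    return np.concatenate([utt_mean, utt_std, local_mean_std])
```

For example, calling `multiscale_stats(features)` on a `(T, D)` array of per-frame features returns a fixed-length vector regardless of `T`, which is what lets utterances of different durations feed the same downstream predictor.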
