Oral proficiency is a core competency in foreign language learning, yet traditional foreign language classrooms often suffer from inadequate oral feedback, delayed evaluation, and subjective assessment bias. To address these issues, this study proposes a multimodal AI-enabled feedback mechanism that integrates automatic speech recognition (ASR) and sentiment analysis (SA) technologies. First, the ASR module extracts phonetic features (e.g., pronunciation accuracy, fluency) from learners’ oral outputs, while the SA module captures emotional cues (e.g., confidence, anxiety) from vocal prosody and textual semantics. Then, a weighted fusion algorithm integrates the two modalities to generate personalized and actionable feedback. To verify the effectiveness of the mechanism, an experiment was conducted with 86 foreign language learners divided into experimental and control groups. Objective evaluation indicators include ASR recognition accuracy, pronunciation error detection rate, and feedback response time; subjective indicators include learner satisfaction, perceived usefulness, and teacher evaluation consistency. Experimental results show that the proposed mechanism achieves an ASR recognition accuracy of 92.3%, a pronunciation error detection rate of 88.7%, and a feedback response time of less than two seconds. Compared with the traditional feedback method, the experimental group’s oral proficiency score increased by 15.6% on average, and the learner satisfaction rate reached 89.5%. This study provides a new technical solution for improving the efficiency and personalization of oral feedback in foreign language classrooms, and extends applied research on multimodal AI in language education.
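The weighted fusion step described above can be illustrated with a minimal sketch. The abstract does not give the fusion formula, the weights, or the feature scales, so everything here is an assumption made for illustration: the class and function names are hypothetical, all features are taken to be normalized to the range 0 to 1, each modality is summarized by a plain average, and the weights 0.7 and 0.3 are placeholders rather than values reported by the study.

```python
from dataclasses import dataclass


@dataclass
class AsrFeatures:
    """Hypothetical ASR-side features, each assumed normalized to [0, 1]."""
    pronunciation_accuracy: float
    fluency: float


@dataclass
class SaFeatures:
    """Hypothetical SA-side emotional cues, each assumed normalized to [0, 1]."""
    confidence: float
    anxiety: float


def _check_unit_interval(name: str, value: float) -> None:
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must lie in [0, 1], got {value}")


def fuse(asr: AsrFeatures, sa: SaFeatures,
         w_asr: float = 0.7, w_sa: float = 0.3) -> float:
    """Combine both modalities into one score via a convex weighted sum.

    The weights are illustrative placeholders (assumption); they must be
    non-negative and sum to 1 so the fused score stays in [0, 1].
    """
    if w_asr < 0 or w_sa < 0 or abs(w_asr + w_sa - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    for name, value in (
        ("pronunciation_accuracy", asr.pronunciation_accuracy),
        ("fluency", asr.fluency),
        ("confidence", sa.confidence),
        ("anxiety", sa.anxiety),
    ):
        _check_unit_interval(name, value)

    # Per-modality summaries: simple averages (assumption).
    asr_score = (asr.pronunciation_accuracy + asr.fluency) / 2
    # Anxiety is inverted so that a higher value always means "better".
    sa_score = (sa.confidence + (1.0 - sa.anxiety)) / 2
    return w_asr * asr_score + w_sa * sa_score


def weakest_dimension(asr: AsrFeatures, sa: SaFeatures) -> str:
    """Return the lowest-scoring dimension, a possible target for feedback."""
    dimensions = {
        "pronunciation_accuracy": asr.pronunciation_accuracy,
        "fluency": asr.fluency,
        "confidence": sa.confidence,
        "calmness": 1.0 - sa.anxiety,
    }
    return min(dimensions, key=dimensions.get)


if __name__ == "__main__":
    asr = AsrFeatures(pronunciation_accuracy=0.8, fluency=0.6)
    sa = SaFeatures(confidence=0.9, anxiety=0.2)
    print(f"fused score: {fuse(asr, sa):.3f}")
    print(f"focus feedback on: {weakest_dimension(asr, sa)}")
```

A convex combination keeps the fused score on the same scale as its inputs, which makes the weights easy to interpret; the mechanism evaluated in the study may weight or combine the modalities differently.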
Yang He (Sun,) studied this question.