What question did this study set out to answer?

The research aims to enhance oral proficiency feedback in foreign language learning using AI technologies.

April 5, 2026 | Open Access

Multimodal AI-enabled Feedback Mechanism for Oral Foreign Language Proficiency: Integrating Speech Recognition and Sentiment Analysis

Key Points

Developed a multimodal feedback mechanism integrating automatic speech recognition (ASR) and sentiment analysis (SA).
Implemented a weighted fusion algorithm to combine phonetic and emotional data.
Conducted an experiment with 86 foreign language learners using experimental and control groups.
Achieved ASR recognition accuracy of 92.3%.
Achieved a pronunciation error detection rate of 88.7%.
Reduced feedback response time to under two seconds.
Experimental group's oral proficiency score rose 15.6% on average compared with the traditional feedback method.
Learner satisfaction rate reached 89.5%.

Abstract

Oral proficiency is a core competency in foreign language learning, yet the traditional foreign language classroom often faces challenges such as inadequate oral feedback, delayed evaluation, and subjective assessment bias. To address these issues, this study proposes a multimodal AI-enabled feedback mechanism that integrates automatic speech recognition (ASR) and sentiment analysis (SA) technologies. First, the ASR module extracts phonetic features (e.g., pronunciation accuracy, fluency) from learners’ oral outputs, while the SA module captures emotional cues (e.g., confidence, anxiety) through vocal prosody and textual semantics. Then, a weighted fusion algorithm is designed to integrate the two modalities of information, generating personalized and actionable feedback. To verify the effectiveness of the mechanism, an experiment was conducted with 86 foreign language learners divided into experimental and control groups. Objective evaluation indicators include ASR recognition accuracy, pronunciation error detection rate, and feedback response time; subjective indicators include learner satisfaction, perceived usefulness, and teacher evaluation consistency. Experimental results show that the proposed mechanism achieves an ASR recognition accuracy of 92.3%, a pronunciation error detection rate of 88.7%, and a feedback response time of less than two seconds. Compared with the traditional feedback method, the experimental group’s oral proficiency score increased by 15.6% on average, and the learner satisfaction rate reached 89.5%. This study provides a new technical solution for improving the efficiency and personalization of oral feedback in foreign language classrooms, and enriches the application research of multimodal AI in language education.
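
The abstract does not give the form of the weighted fusion algorithm, so the following is only a minimal sketch of how phonetic and emotional scores might be combined into one value and turned into feedback. Everything in it is an assumption made for illustration: the names `OralScores`, `fuse`, and `feedback_messages`, the normalization of all scores to the range 0 to 1, the weights 0.7 and 0.3, and the thresholds that trigger a message. None of these are taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class OralScores:
    """Hypothetical per-utterance scores, each assumed to lie in [0, 1]."""
    pronunciation: float  # phonetic cue (assumed to come from the ASR module)
    fluency: float        # phonetic cue (assumed to come from the ASR module)
    confidence: float     # emotional cue (assumed to come from the SA module)
    anxiety: float        # emotional cue (assumed to come from the SA module)


def fuse(scores: OralScores, w_phonetic: float = 0.7, w_emotional: float = 0.3) -> float:
    """Combine the two modalities with a weighted sum.

    The weights are illustrative defaults, not values reported in the paper.
    """
    if abs(w_phonetic + w_emotional - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Average the cues inside each modality; anxiety is inverted so that
    # a higher value always means a better outcome.
    phonetic = (scores.pronunciation + scores.fluency) / 2
    emotional = (scores.confidence + (1 - scores.anxiety)) / 2
    return w_phonetic * phonetic + w_emotional * emotional


def feedback_messages(scores: OralScores) -> list[str]:
    """Map low phonetic scores or high anxiety to short feedback messages.

    The thresholds (0.7 and 0.6) are assumptions chosen for the example.
    """
    messages = []
    if scores.pronunciation < 0.7:
        messages.append("Review the sounds flagged as mispronounced.")
    if scores.fluency < 0.7:
        messages.append("Practice speaking in longer phrases without pauses.")
    if scores.anxiety > 0.6:
        messages.append("You sounded tense; try a slower pace on the next attempt.")
    return messages


if __name__ == "__main__":
    sample = OralScores(pronunciation=0.8, fluency=0.6, confidence=0.5, anxiety=0.5)
    print(round(fuse(sample), 2))
    print(feedback_messages(sample))
```

A fixed weighted sum is the simplest form such a fusion could take. The mechanism described in the paper may set its weights differently, for example per learner or per task.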
