What question did this study set out to answer?

The research aims to assess how a transformer-based chatbot influences English speaking skills in blended learning environments.

March 13, 2026 | Open Access

The impact of transformer-based chatbots on second language speaking competence in blended learning

Key Points

The research aims to assess how a transformer-based chatbot influences English speaking skills in blended learning environments.
Gathered data through speech recordings, interaction logs, and engagement monitoring.
Applied a Wiener filter for denoising and min-max scaling for normalizing metrics.
Utilized Mel-Frequency Cepstral Coefficients for voice analysis and TF-IDF for textual interactions.
Employed DSM-RoBERTa for speech recognition and dialogue generation.
Achieved 98.85% accuracy and 98.23% precision in assessing speaking skills.
Reported 23.2% learning efficiency and 88% user satisfaction.
Demonstrated significant improvements in pronunciation accuracy and fluency.
Ensured consistent performance across cross-validation folds.
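
Two of the preprocessing and feature steps listed above, min-max scaling and TF-IDF weighting, can be illustrated with a minimal standard-library sketch. The function names `min_max_scale` and `tf_idf`, the whitespace tokenizer, and the unsmoothed `tf * log(N / df)` weighting are assumptions made for illustration; the paper does not state which variant or library it used.

```python
import math
from collections import Counter


def min_max_scale(values):
    """Rescale numbers to the range [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: lowercase whitespace tokenization and no smoothing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical engagement scores and learner utterances, for illustration only.
scaled = min_max_scale([2, 4, 6])
term_weights = tf_idf(["hello there", "hello world"])
```

With this weighting, a term that occurs in every document (here "hello") receives a weight of zero, so only terms that distinguish one interaction from another contribute to the feature vector.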

Abstract

This study uses a transformer-based chatbot to improve English speaking skills in blended learning contexts. Learner data were gathered via speech recordings, interaction logs, and engagement monitoring. Preprocessing included denoising speech with a Wiener filter, imputing missing values in textual interaction metrics, and normalizing posture and behavioral cues with min-max scaling. For feature extraction, Mel-Frequency Cepstral Coefficients (MFCCs) were used for voice analysis, Term Frequency-Inverse Document Frequency (TF-IDF) for textual interactions, and ResNet-18 for behavioral and engagement metrics. The transformer-based Dynamic Slime Mold-mutated Robustly Optimized Bidirectional Encoder Representations from Transformers-Pretraining Approach (DSM-RoBERTa) was employed for effective speech recognition and context-sensitive dialogue generation. It leverages RoBERTa for enhanced language understanding and fine-tunes parameters for improved accuracy. The chatbot provided incremental practice at the phonetic, semantic, and freestyle levels, offering real-time feedback on pronunciation accuracy, fluency, and engagement while tracking learner progress. Model performance was assessed using accuracy, precision, recall, F1-score, word error rate (WER), learning efficiency, and user satisfaction. On a held-out test set of 2500 learners, DSM-RoBERTa achieved 98.85% accuracy, 98.23% precision, 97.15% recall, and a 98.31% F1-score, with a WER of 0.068, 23.2% learning efficiency, and 88% user satisfaction; results were consistent across cross-validation folds. These findings suggest that the DSM-RoBERTa framework could promote adaptive, context-aware, and progressive speaking practice, resulting in a scalable, immersive, and personalized language learning environment. The approach provides a dependable option for blended English as a Foreign Language (EFL) training, linking classroom learning with independent practice while improving.
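
The evaluation metrics named in the abstract can be made concrete with a short sketch. It assumes the standard definitions: WER as word-level edit distance divided by the number of reference words, and F1 as the harmonic mean of precision and recall. The function names and example sentences are hypothetical, and the paper does not specify how its scores were computed or averaged.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical transcript pair: one reference word is dropped by the recognizer.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
```

Under this definition, a WER of 0.068 corresponds to roughly seven word-level errors per 100 reference words.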
