To make pronunciation error detection more intelligent and feedback more precise in online oral English teaching, this study designs a system that combines speech recognition and machine learning. Its core detection module uses an MFCC-DBN model with feature fusion and builds an SVM-based multi-classifier. The experimental data come from the CSTR VCTK Corpus and the Speech Accent Archive and contain 1,610 expert-annotated phoneme samples. The model achieves high accuracy both for error types with sufficient samples and for error types with few samples. Compared with LDA-SVM and Wav2Vec2.0-SVM, it performs better in both accuracy and standard error. The results indicate that the fusion model is more robust and efficient when data are limited, offering a practical technical approach to improving learners' cross-cultural communication competence.
Qingdi Si studied this question.