This study proposes an intelligent grammar correction method that integrates multi-strategy Pinyin detection with hierarchical data augmentation to address common errors in the English writing of Chinese learners. A dual-strategy Pinyin detection algorithm combines syllable tree matching with linguistic rules to identify and preserve Pinyin segments. A hierarchical data augmentation approach uses rule-based and model-based back-translation to build diverse parallel corpora that target typical learner errors. The grammar correction model is based on the Transformer architecture and treats error correction as a sequence-to-sequence task. In the reported results, the Pinyin detector achieves 99.95% accuracy while processing 5,386 words per second with 13.02 MB of memory. The correction model attains an F₀.₅ score of 40.58 and 49.56% accuracy on CoNLL-2014. On CLEC subsets, it achieves 87.5%, 90.2%, and 85.9% accuracy for article, subject-verb agreement, and verb tense errors, respectively. The rate of false corrections applied to Pinyin dropped from 65% to 1.8%, a substantial improvement in handling the English writing of Chinese learners.
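The syllable tree matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the syllable inventory below is a hypothetical, heavily truncated sample, and the names `SYLLABLES`, `build_trie`, and `is_pinyin` are assumptions introduced only for illustration. The sketch stores syllables in a trie and checks whether a word can be split entirely into valid syllables.

```python
# Hypothetical, heavily truncated syllable inventory (an assumption for this
# sketch); a real detector would load the full table of valid Pinyin syllables.
SYLLABLES = {"bei", "jing", "shang", "hai", "ni", "hao",
             "zhong", "guo", "xi", "an"}

END = "$"  # marker key meaning "a syllable ends at this node"


def build_trie(syllables):
    """Build a nested-dict trie with one level per character."""
    root = {}
    for syllable in syllables:
        node = root
        for ch in syllable:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def is_pinyin(word, trie):
    """Return True if `word` is a concatenation of syllables in `trie`."""
    word = word.lower()
    if not word.isalpha():  # rejects empty strings, digits, punctuation
        return False
    n = len(word)
    # reachable[i] is True when word[:i] splits cleanly into syllables.
    reachable = [False] * (n + 1)
    reachable[0] = True
    for start in range(n):
        if not reachable[start]:
            continue
        node = trie
        for end in range(start, n):
            node = node.get(word[end])
            if node is None:  # no syllable continues with this character
                break
            if END in node:   # word[start:end + 1] is a complete syllable
                reachable[end + 1] = True
    return reachable[n]


TRIE = build_trie(SYLLABLES)
print(is_pinyin("Beijing", TRIE))  # True: "bei" + "jing"
print(is_pinyin("having", TRIE))   # False: no syllable split exists
```

Trie matching alone would also accept English words that happen to be valid syllables, such as "an" in this sample, which is presumably why the abstract pairs it with linguistic rules as a second strategy.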
Lingling Song (Thu,) studied this question.