December 1, 2014

Machine learning approaches to improving pronunciation error detection on an imbalanced corpus

Abstract

In this paper, we investigate phone-level pronunciation error detection as a binary classification problem whose performance is heavily affected by the imbalanced class distribution in a manually annotated data set of non-native English. To address the problems caused by this extreme class imbalance, we investigate cost-sensitive learning (weighting inversely proportional to class frequencies) and over-sampling with synthetic instances (SMOTE). Experiments with classifiers that use features based on acoustic phonetics and word identity demonstrate that these machine learning approaches improve performance over a baseline system trained on the extremely imbalanced data. In addition, several different types of classifiers are compared. Finally, the paper analyzes the robustness of classifier performance across different phones.
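The two rebalancing ideas named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the function names, the toy 9:1 label ratio, and the two-dimensional feature points are all hypothetical. The weighting formula `n_samples / (n_classes * class_count)` is one common way to weight inversely to class frequency, and `smote_like` is a simplified stand-in for SMOTE that interpolates between a minority sample and one of its nearest minority neighbours.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (a simplified, SMOTE-style over-sampler)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical toy labels: 0 = correct pronunciation, 1 = pronunciation error.
labels = [0] * 9 + [1] * 1
weights = inverse_frequency_weights(labels)

# Hypothetical minority-class feature vectors (two features each).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2)]
new_points = smote_like(minority, n_new=4)
```

With these toy labels the rare error class receives a much larger weight than the frequent class, so a cost-sensitive learner would penalize missed errors more heavily. The synthetic points always fall between existing minority samples, which is what distinguishes this kind of over-sampling from simply duplicating instances.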

Cite This Study

Yang et al. (2014) studied this question.

synapsesocial.com/papers/6a186a911ca866914fc99ca7 https://doi.org/10.1109/slt.2014.7078591