October 11, 2006

Generalized Statistical Modeling of Pronunciation Variations using Variable-length Phone Context

Abstract

Pronunciation variation modeling is one of the major issues in automatic transcription of spontaneous speech. We present statistical modeling of subword-based mapping between baseforms and surface forms using a large-scale spontaneous speech corpus (CSJ). Variation patterns of phone sequences are automatically extracted together with their contexts of up to two preceding and following phones, which are decided by their occurrence statistics. We then derive a set of rewrite rules with their probabilities and variable-length phone contexts. The model effectively predicts pronunciation variations depending on the phone context using a back-off scheme. Since it is based on phone sequences, the model is applicable to any lexicon to generate appropriate surface forms. The proposed method was evaluated on two transcription tasks whose domains are different from the training corpus (CSJ), and significant reduction of word error rates was achieved.
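The rule derivation and back-off lookup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`context_key`, `train_rules`, `predict`), the `min_count` threshold, the use of equal context widths on both sides, and the toy phone data are all assumptions made for illustration. The abstract states only that contexts extend up to two preceding and following phones, that they are selected by occurrence statistics, and that prediction uses a back-off scheme.

```python
from collections import defaultdict

MAX_CONTEXT = 2  # the abstract: up to two preceding and following phones


def context_key(left, base, right, k):
    """Build a rule key using k phones of context on each side (assumed symmetric)."""
    left_ctx = tuple(left[-k:]) if k else ()
    right_ctx = tuple(right[:k])
    return (left_ctx, tuple(base), right_ctx)


def train_rules(aligned, min_count=2):
    """Estimate P(surface | context, baseform) from aligned samples.

    aligned: iterable of (left, base, right, surface) phone tuples.
    min_count: hypothetical threshold standing in for the paper's
    occurrence-statistics criterion, which the abstract does not specify.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for left, base, right, surface in aligned:
        # A set avoids double counting when a short context yields the same key twice.
        keys = {context_key(left, base, right, k) for k in range(MAX_CONTEXT + 1)}
        for key in keys:
            counts[key][tuple(surface)] += 1

    rules = {}
    for key, surface_counts in counts.items():
        total = sum(surface_counts.values())
        if total >= min_count:  # keep only contexts observed often enough
            rules[key] = {s: c / total for s, c in surface_counts.items()}
    return rules


def predict(rules, left, base, right):
    """Back off from the longest context to the context-free rule."""
    for k in range(MAX_CONTEXT, -1, -1):
        key = context_key(left, base, right, k)
        if key in rules:
            return rules[key]
    # No rule applies: assume the baseform is pronounced unchanged.
    return {tuple(base): 1.0}


if __name__ == "__main__":
    # Toy alignments (made up): /u/ is deleted after /k/ and kept after /g/.
    samples = (
        [(("a", "k"), ("u",), ("t", "a"), ())] * 3
        + [(("a", "g"), ("u",), ("t", "a"), ("u",))] * 3
    )
    rules = train_rules(samples)
    # Unseen two-phone context, so the lookup backs off to one phone of context.
    print(predict(rules, ("o", "k"), ("u",), ("t", "o")))
```

Because the rules are keyed on phone sequences rather than words, the same `predict` lookup could in principle be applied to the baseforms of any lexicon, which is the property the abstract highlights.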

Authors

Yuya Akita

Tokyo Medical University

Tatsuya Kawahara

Kyoto University

Institutions

Kyoto University

Japan Science and Technology Agency
