Pronunciation variation modeling is one of the major issues in automatic transcription of spontaneous speech. We present statistical modeling of the subword-based mapping between baseforms and surface forms using a large-scale spontaneous speech corpus (CSJ). Variation patterns of phone sequences are automatically extracted together with their contexts of up to two preceding and two following phones, with the context length decided by occurrence statistics. We then derive a set of rewrite rules with probabilities and variable-length phone contexts. The model predicts pronunciation variations depending on the phone context, using a back-off scheme when a longer context is not covered. Since it is based on phone sequences, the model can be applied to any lexicon to generate appropriate surface forms. The proposed method was evaluated on two transcription tasks whose domains differ from that of the training corpus (CSJ), and it achieved a significant reduction in word error rate.
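The idea of probabilistic rewrite rules with variable-length contexts and back-off can be illustrated with a small sketch. This is not the authors' implementation: the rule table `RULES`, its probabilities, the back-off order `BACKOFF_ORDER`, and the function names `lookup` and `surface_forms` are all hypothetical, and the sketch rewrites single phones rather than phone sequences for brevity. It assumes back-off simply tries the widest context first (up to two phones on each side) and falls back to narrower ones.

```python
from itertools import product
from typing import Dict, List, Tuple

Context = Tuple[str, ...]
RuleKey = Tuple[Context, str, Context]

# Hypothetical toy rules: (left context, baseform phone, right context)
# -> {surface realization: probability}. "" stands for deletion.
RULES: Dict[RuleKey, Dict[str, float]] = {
    (("d", "e"), "s", ("u",)): {"s": 0.7, "": 0.3},
    ((), "u", ()): {"u": 0.9, "": 0.1},
}

# Assumed back-off order: (left width, right width), widest context first.
BACKOFF_ORDER = [(2, 2), (2, 1), (1, 2), (1, 1), (1, 0), (0, 1), (0, 0)]


def lookup(phones: List[str], i: int) -> Dict[str, float]:
    """Return the surface distribution for phones[i], backing off to
    shorter contexts until some rule matches."""
    for n_left, n_right in BACKOFF_ORDER:
        # Skip context widths that do not fit inside the phone sequence.
        if n_left > i or n_right > len(phones) - i - 1:
            continue
        left = tuple(phones[i - n_left:i])
        right = tuple(phones[i + 1:i + 1 + n_right])
        key = (left, phones[i], right)
        if key in RULES:
            return RULES[key]
    # No rule at any context width: keep the baseform phone unchanged.
    return {phones[i]: 1.0}


def surface_forms(phones: List[str]) -> Dict[Tuple[str, ...], float]:
    """Enumerate surface forms of a baseform with their probabilities,
    treating per-phone rewrites as independent given baseform context."""
    options = [sorted(lookup(phones, i).items()) for i in range(len(phones))]
    forms: Dict[Tuple[str, ...], float] = {}
    for combo in product(*options):
        surface = tuple(s for s, _ in combo if s)  # drop deleted phones
        prob = 1.0
        for _, p in combo:
            prob *= p
        forms[surface] = forms.get(surface, 0.0) + prob
    return forms


if __name__ == "__main__":
    ranked = sorted(surface_forms(["d", "e", "s", "u"]).items(),
                    key=lambda kv: -kv[1])
    for form, prob in ranked:
        print(" ".join(form), round(prob, 3))
```

Because the rules are keyed on phone contexts rather than on words, the same table can be applied to the baseform of any lexicon entry, which is the property the abstract highlights. A practical system would also prune low-probability surface forms before adding them to the recognition lexicon.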
Yuya Akita
Tokyo Medical University
Tatsuya Kawahara
Kyoto University
Japan Science and Technology Agency
Akita et al. studied this question.
synapsesocial.com/papers/6a185f0a6a9454a71265c282 — DOI: https://doi.org/10.1109/icassp.2005.1415207