This paper describes the design and use of a hierarchical tree structure of hidden Markov model (HMM) networks, built by dynamically clustering the speakers covered during training. During recognition, a speaker is assigned to a specific network of models through a series of decisions in the tree. Once the assignment is made, recognition is performed within that network on a one-model-per-word basis. On databases of over 500 speakers, with vocabulary sizes of 21, 30, and 36 words, the results show only a nonsignificant improvement in accuracy over two-models-per-word systems. However, recognition is twice as fast.
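The two-stage procedure summarized above (descend a tree to pick a speaker cluster, then score one model per word inside that cluster's network) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `ClusterNode`, `assign_network`, `recognize`, and `neg_sq_dist` are hypothetical, the greedy best-child descent is only one possible decision rule, and a negative squared distance stands in for the HMM likelihoods the real system would compute.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]
Scorer = Callable[[Features], float]


@dataclass
class ClusterNode:
    """One node of a hypothetical speaker-cluster tree."""
    name: str
    speaker_score: Scorer  # how well an utterance fits this speaker cluster
    children: List["ClusterNode"] = field(default_factory=list)
    # Only leaves carry word models: exactly one scorer per vocabulary word.
    word_models: Dict[str, Scorer] = field(default_factory=dict)


def assign_network(root: ClusterNode, features: Features) -> ClusterNode:
    """Walk from the root to a leaf, taking the best-scoring child each time."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: c.speaker_score(features))
    return node


def recognize(root: ClusterNode, features: Features) -> str:
    """Assign the speaker to a leaf network, then pick the best word there."""
    leaf = assign_network(root, features)
    return max(leaf.word_models, key=lambda w: leaf.word_models[w](features))


def neg_sq_dist(center: Features) -> Scorer:
    """Toy stand-in for an HMM log-likelihood (assumption, not the paper's model)."""
    def score(features: Features) -> float:
        return -sum((x - c) ** 2 for x, c in zip(features, center))
    return score


# Toy tree with two speaker clusters and a two-word vocabulary.
low = ClusterNode(
    "low", neg_sq_dist([0.0, 0.0]),
    word_models={"yes": neg_sq_dist([0.0, 1.0]), "no": neg_sq_dist([1.0, 0.0])},
)
high = ClusterNode(
    "high", neg_sq_dist([10.0, 10.0]),
    word_models={"yes": neg_sq_dist([10.0, 11.0]), "no": neg_sq_dist([11.0, 10.0])},
)
root = ClusterNode("root", neg_sq_dist([5.0, 5.0]), children=[low, high])

print(assign_network(root, [9.8, 10.9]).name, recognize(root, [9.8, 10.9]))
```

In this sketch only the word models of the selected leaf are scored, so the per-utterance cost is one model per word plus the tree decisions. That is one plausible reading of why such a system could run faster than a flat two-models-per-word system, though the paper's actual decision rule and cost accounting are not given in the abstract.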
Mathan et al. studied this question.