In speech recognition, it is crucial to locate the endpoints of the input utterance precisely so that the segment passed to the recognizer is free of non-speech regions. This paper proposes a novel approach that finds robust features for endpoint detection in a noisy in-car environment. The proposed method integrates two widely used features, energy and entropy, into a new feature that retains the advantages of each while compensating for the drawbacks of the other. Monitoring the transitions of this new feature allows endpoints to be located more precisely. Experiments in a real noisy environment, inside a Honda Civic with background radio music and free conversation, show an accuracy improvement of more than 10% over a purely energy-based algorithm.
Huang et al. (Thu,) studied this question.
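
To make the idea of a combined energy and entropy feature concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the formula that merges the two quantities, so the product used here (frame energy times the gap between maximum and observed spectral entropy) is an assumption, as are the function names `frame_energy`, `spectral_entropy`, `energy_entropy_feature`, and `detect_endpoints`, the fixed threshold, and the non-overlapping framing. Simple threshold crossing also stands in for the transition monitoring the abstract mentions.

```python
import cmath
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum.

    Uses a naive DFT over bins 1..n//2 (DC is skipped), so it is only
    practical for short frames.
    """
    n = len(frame)
    power = []
    for k in range(1, n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    total = sum(power)
    if total <= 0.0:
        # Assumption: treat an all-zero frame as a flat (maximum-entropy) spectrum.
        return math.log(len(power))
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def energy_entropy_feature(frame):
    """Hypothetical combined feature: large when a frame is loud AND spectrally structured."""
    max_entropy = math.log(len(frame) // 2)
    structure = max(max_entropy - spectral_entropy(frame), 0.0)
    return math.sqrt(frame_energy(frame) * structure)


def detect_endpoints(signal, frame_len, threshold):
    """Return (start_sample, end_sample) spanning the first to last frame
    whose feature exceeds the threshold, or None if no frame does."""
    n_frames = len(signal) // frame_len
    active = [
        i
        for i in range(n_frames)
        if energy_entropy_feature(signal[i * frame_len:(i + 1) * frame_len]) > threshold
    ]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len
```

In this sketch a loud frame with a flat, noise-like spectrum scores low because its entropy is close to the maximum, and a quiet but structured frame scores low because its energy is small, which is one way to read the abstract's claim that each feature compensates for the other's weakness. A real system would likely use an FFT, overlapping windows, and a noise-adaptive threshold.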