We investigate the use of prosody for the detection of frustration and annoyance in natural human-computer dialog. In addition to prosodic features, we examine the contribution of language model information and speaking style. Results show that a prosodic model can predict whether an utterance is neutral versus annoyed or frustrated with an accuracy on par with that of human interlabeler agreement. Accuracy increases when discriminating only frustrated from other utterances, and when using only those utterances on which labelers originally agreed. Furthermore, prosodic model accuracy degrades only slightly when using recognized versus true words. Language model features, even if based on true words, are relatively poor predictors of frustration. Finally, we find that hyperarticulation is not a good predictor of emotion; the two phenomena often occur independently.
Ang et al. studied this question.