Key points are not available for this paper at this time.
The lack of suitable training and testing data is currently a major roadblock in applying machine-learning techniques to dialogue management.Stochastic modelling of real users has been suggested as a solution to this problem, but to date few of the proposed models have been quantitatively evaluated on real data.Indeed, there are no established criteria for such an evaluation.This paper presents a systematic approach to testing user simulations and assesses the most prominent domain-independent techniques using a large DARPA Communicator corpus of human-computer dialogues.We show that while recent advances have led to significant improvements in simulation quality, simple statistical metrics are still sufficient to discern synthetic from real dialogues.
Schatzmann et al. (Sat,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: