In the development of goal-oriented dialogue systems, neural network topic modeling and clustering methods have traditionally been used to extract user intents and blocks of operator response scenarios. The emergence of generative large language models makes it possible to radically change this approach and to generate dialogue scenarios in the form of a graph that preserves context. In this article, we analyzed seven popular large language models on prepared test prompts in Russian and English for intent mining and named entity recognition. The study investigated the effectiveness of two methods for constructing dialogues in goal-oriented dialogue systems: a heuristic-based approach with additional training on labeled data and a prompt-based approach without such training. The primary objective was to evaluate how incorporating labeled dialogue data affects the quality of the constructed dialogues, with a focus on dialogue context. The study emphasized that dialogue systems need to take the dialogue context into account when constructing goal-oriented dialogues. The two approaches were compared on the MultiWOZ 2.2 and MANTiS dialogue corpora using a locally deployed LLaMA model. Without training on labeled dialogues, the LLaMA model achieved a BERTScore of 0.75 on MultiWOZ and 0.72 on MANTiS; with training on labeled dialogues, it achieved 0.85 on MultiWOZ and 0.82 on MANTiS. This finding has practical implications for the development of more effective customer service dialogue systems that can engage users in more productive and meaningful machine-to-human interactions.
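For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a BERTScore-style F1 is computed from token embeddings by greedy cosine matching. It is an illustration only, not the evaluation code used in the study: the function names (`bertscore_f1` and the helpers) are hypothetical, the toy two-dimensional vectors stand in for contextual embeddings that a real BERT-family model would produce, and refinements such as IDF weighting and baseline rescaling are omitted.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cosine_matrix(ref_emb, cand_emb):
    """Return sim[i][j] = cosine(reference token i, candidate token j)."""
    ref_n = [_normalize(v) for v in ref_emb]
    cand_n = [_normalize(v) for v in cand_emb]
    return [[sum(a * b for a, b in zip(r, c)) for c in cand_n] for r in ref_n]


def bertscore_f1(ref_emb, cand_emb):
    """Greedy-matching F1 over token embeddings (toy stand-ins, an assumption)."""
    sim = _cosine_matrix(ref_emb, cand_emb)
    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(max(row) for row in sim) / len(ref_emb)
    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(
        max(sim[i][j] for i in range(len(ref_emb)))
        for j in range(len(cand_emb))
    ) / len(cand_emb)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy usage: the candidate covers one of two reference "tokens".
reference = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0]]
score = bertscore_f1(reference, candidate)
```

In this toy case precision is 1.0 and recall is 0.5, so the F1 is about 0.67. In practice the embeddings would come from a pretrained encoder, and the score would be averaged over all generated responses in the corpus.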
Legashev et al. (Wed,) studied this question.