The paper describes how a speech synthesis system is modeled from available acoustic data, which serve as machine-learning training material, to obtain a model that matches the natural characteristics of speech. We describe the stages of creating a textual and phonetic–acoustic database adapted for training an automatic speech synthesis system, and present the architecture of the resulting speech synthesis software, which consists of several functional modules. We report on the preparation of an experimental training database, the training procedure, the configuration of the neural network parameters, and the results of the training experiment. We also consider the problem of resolving graphical homonymy when transcribing Chechen texts and ways to address it.
E. S. Izrailova studied this question.