July 1, 2024 | Open Access

Cultures of the AI paralinguistic in voice cloning tools

Abstract

With AI-based voice cloning tools becoming more accessible to designers, we deem it imperative to understand their paralinguistic capabilities, limitations and cultures. Paralinguistics as a field of study is concerned with how you say something rather than what you say, and new AI-based statistical voice synthesis tools differ significantly from previous methods. As such, they require asking novel questions and provoking new thoughts. This paper contributes by analyzing and evaluating various voice cloning platforms by looking at how they describe their own ability to produce three different paralinguistic elements: laughter, stuttering and pacing. We focus on text-to-speech and hybrid approaches to voice cloning, and follow up our analyses by attempting to produce these three paralinguistic elements using the voice cloning platform ElevenLabs’ voice synthesis tools. Conclusively, we draw on our results to pose questions for further investigation into what kinds of AI paralinguistic cultures can and should be designed.

Cite This Study

Ada et al. (July 1, 2024) studied this question.

synapsesocial.com/papers/68e61f4bb6db6435875b1660 https://doi.org/10.1145/3656156.3663708
