What question did this study set out to answer?

The aim is to develop a system that generates and streams speech with minimal delay using voice samples.

April 12, 2026 · Open Access

Real-Time Voice Cloning and Streaming System

Key Points

Developed a real-time voice cloning system
Utilized voice encoding techniques and speech synthesis models
Implemented low-latency streaming with WebSocket communication
Tested on standard personal computers
Achieved reduced latency in audio playback
Enabled simultaneous speech generation and streaming
Enhanced real-time interaction capabilities
Suitable for various applications like virtual assistants and accessibility tools

Abstract

The rapid advancement of artificial intelligence and speech processing technologies has significantly enhanced human-computer interaction. However, traditional voice cloning and text-to-speech systems often rely on high-cost infrastructure and generate complete audio before playback, leading to increased latency. This paper presents a Real-Time Voice Cloning and Streaming System designed to generate and stream speech simultaneously with minimal delay. The system operates efficiently on standard personal computers and processes text along with a reference voice sample to produce speech incrementally. The proposed system integrates advanced speech synthesis models, voice encoding techniques, and a low-latency streaming pipeline using WebSocket-based communication. This enables continuous and smooth audio playback without pre-generating the entire audio. The system offers reduced latency, improved efficiency, and enhanced real-time interaction capabilities. It is suitable for applications such as virtual assistants, conversational agents, accessibility tools, and interactive platforms.

Keywords: Voice Cloning, Real-Time Streaming, Text-to-Speech, Low Latency, AI.
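The incremental generate-and-stream idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `split_text`, `fake_synthesize`, `stream_speech`, the 16 kHz sample rate, and the single-number voice embedding are all hypothetical stand-ins. The synthesis function only emits a short tone in place of a real speech model, and `send` stands in for whatever binary-send call a WebSocket connection would provide.

```python
import asyncio
import math
import struct
from typing import Awaitable, Callable, List

SAMPLE_RATE = 16_000   # assumed value, not taken from the paper
MS_PER_WORD = 50       # assumed audio length per word for the placeholder tone


def split_text(text: str, max_words: int = 4) -> List[str]:
    """Split text into short word groups so synthesis can start early."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fake_synthesize(segment: str, voice_embedding: List[float]) -> bytes:
    """Placeholder for the speech model: returns 16-bit little-endian PCM.

    A real system would condition a TTS model on the speaker embedding;
    here the embedding only shifts the pitch of a sine tone.
    """
    n = SAMPLE_RATE * MS_PER_WORD * len(segment.split()) // 1000
    freq = 200.0 + 100.0 * voice_embedding[0]
    samples = [int(8000 * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)) for t in range(n)]
    return struct.pack(f"<{n}h", *samples)


async def stream_speech(
    text: str,
    voice_embedding: List[float],
    send: Callable[[bytes], Awaitable[None]],
) -> int:
    """Generate audio chunk by chunk and send each one as soon as it is ready.

    Returns the number of chunks sent.
    """
    # A small bounded queue lets synthesis of the next chunk overlap with sending.
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)

    async def producer() -> None:
        for segment in split_text(text):
            # Run the blocking synthesis call in a worker thread.
            chunk = await asyncio.to_thread(fake_synthesize, segment, voice_embedding)
            await queue.put(chunk)
        await queue.put(None)  # sentinel: no more audio

    async def consumer() -> int:
        sent = 0
        while (chunk := await queue.get()) is not None:
            await send(chunk)
            sent += 1
        return sent

    _, sent = await asyncio.gather(producer(), consumer())
    return sent
```

In this sketch the first chunk can be sent after only the first few words have been synthesized, which is the property the abstract attributes to streaming without pre-generating the entire audio. The bounded queue is one possible design choice; it keeps the producer from running far ahead of a slow client.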

Authors

Masroor Hussain

Lalithaditya S

S Kranthi Varma

Institutions

Aditya Birla (India)
