The Sonic Real-Time Enterprise Voice Agent is a conversational AI system that enables low-latency voice interaction with enterprise platforms such as ServiceNow and Salesforce. The system follows a decoupled microservice architecture: FastAPI manages business logic and enterprise integration, Pipecat handles real-time audio processing, and LiveKit provides secure WebRTC-based streaming for bidirectional communication. Experimental results demonstrate response latency below 500 milliseconds, stable multi-session handling, and secure role-based access control (RBAC) suitable for enterprise deployment.
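As an illustration of the RBAC and latency claims above, the following is a minimal sketch, not the system's actual implementation. It assumes a simple role-to-permission map that the business-logic service might consult before allowing an enterprise operation, and a helper that checks a measured response time against the reported 500 ms target. The role names, permission strings, and the functions `is_allowed` and `within_budget` are all hypothetical; the source does not specify them.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission map; the actual roles are not given in the source.
ROLE_PERMISSIONS = {
    "agent": {"servicenow:read", "salesforce:read"},
    "admin": {
        "servicenow:read",
        "servicenow:write",
        "salesforce:read",
        "salesforce:write",
    },
}

# Latency target taken from the reported sub-500 ms result.
LATENCY_BUDGET_MS = 500.0


@dataclass(frozen=True)
class User:
    name: str
    role: str


def is_allowed(user: User, permission: str) -> bool:
    """Return True only if the user's role grants the permission.

    Unknown roles receive no permissions (deny by default).
    """
    return permission in ROLE_PERMISSIONS.get(user.role, set())


def within_budget(latency_ms: float, budget_ms: float = LATENCY_BUDGET_MS) -> bool:
    """Return True if a measured response latency is strictly under the budget."""
    return latency_ms < budget_ms
```

In a deployment along the lines described, a check like `is_allowed` would presumably run in the business-logic layer before any call to ServiceNow or Salesforce, so that the audio pipeline never needs to hold enterprise credentials.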
Kumar et al. (Sun,) studied this question.