Speech-visual integration multimodal multi-agents framework for mobile operation | Synapse