The growing ubiquity of intelligent devices, coupled with the increasing computational demands of modern machine learning models, has led to a critical need for collaborative frameworks that effectively distribute learning and inference tasks across heterogeneous computing environments. Collaborative learning between on-device small models and cloud-based large models has emerged as a promising paradigm that seeks to balance the trade-offs between computational efficiency, latency, privacy, and model accuracy. By leveraging the strengths of both local devices-such as smartphones, wearables, and IoT sensors-and remote cloud infrastructure, collaborative learning enables the deployment of sophisticated AI capabilities in real-world, resource-constrained settings without compromising user experience or data sovereignty. This survey presents a comprehensive exploration of collaborative learning systems, focusing on the architectural designs, algorithmic strategies, and practical implementations that facilitate efficient synergy between on-device and cloud-based intelligence. We begin by discussing the foundational motivations for collaborative learning, including the limitations of pure cloud or edge-based approaches, and highlight the key design considerations such as model partitioning, communication overhead, energy constraints, and privacy risks. We systematically examine state-of-the-art techniques, including split learning, federated inference, knowledge distillation, and dynamic inference pipelines, and evaluate their effectiveness across diverse application domains such as healthcare, autonomous vehicles, augmented reality, and smart homes. Further, we delve into the mathematical underpinnings that govern collaborative optimization, including formal models for latency-accuracy trade-offs, communication cost minimization, and privacy-preserving inference. We explore how these models inform the development of adaptive and context-aware learning strategies that respond dynamically to changing network conditions, user preferences, and device capabilities. In addition, we survey system-level innovations, including lightweight runtime environments, efficient model compression schemes, and secure communication protocols that underpin scalable deployment. Through detailed case studies, we illustrate the practical relevance and transformative impact of collaborative learning, while also identifying persistent challenges and open research problems. These include issues related to lifelong learning, personalization, model synchronization, standardization, and robust evaluation benchmarks. We conclude by outlining future directions for advancing the field, including the integration of reinforcement learning for real-time orchestration, the use of generative AI for model adaptation, and the development of unified frameworks that ensure transparency, accountability, and inclusivity in collaborative AI systems. In synthesizing the current landscape and projecting future trajectories, this survey aims to provide researchers, practitioners, and system designers with a deep and nuanced understanding of collaborative learning frameworks, ultimately guiding the development of next-generation AI systems that are efficient, scalable, and aligned with human-centric values.
Chen et al. (Tue,) studied this question.