Edge Artificial Intelligence (AI) allows machine learning models to run directly on resource-constrained IoT and edge devices instead of relying solely on cloud resources. This shift reduces latency and improves privacy but imposes strict constraints on memory, energy, and connectivity across heterogeneous sensor nodes. Deploying AI on constrained devices also requires addressing fragmented toolchains and inconsistent evaluation practices. This survey provides a structured review of deployment methods for edge AI, including model compilation, intermediate representations, hardware-aware optimization, scheduling strategies, and connectivity-aware execution. It also examines post-deployment aspects, including lifecycle management, secure over-the-air updates, and benchmarking frameworks that jointly assess latency, accuracy, and energy consumption. Use cases in healthcare, smart cities, autonomous systems, manufacturing, and agriculture illustrate practical applications. Across these areas, the survey identifies common challenges in portability, reproducibility, adaptivity, and security, along with emerging directions such as dynamic orchestration across heterogeneous devices and modular update mechanisms. By combining perspectives from compiler design, system optimization, and lifecycle management, this survey highlights how deployment pipelines are evolving into key enablers of reliable and scalable intelligence at the network edge.
Passarotto et al. (Thu,) studied this question.