System-on-Chip (SoC) platforms that combine programmable logic (PL) with a processing system (PS) built around general-purpose processors are increasingly used for low-latency and real-time applications on edge computing platforms, thanks to their computing capabilities, low cost, and modest energy consumption. However, their latency performance is often not fully understood in practical deployments. To address this gap, this paper presents an experimental study of latency and throughput on a Zynq-7030 SoC, explicitly targeting PS–PL communication implemented through AXI DMA. The analysis is based on measurements taken on real hardware. By evaluating different architectural and software design choices, the paper provides actionable guidance for designing low-latency PS–PS or PS–PL communications on SoC devices. The results show that latency and throughput can be either protocol-limited or hardware-limited, depending on the selected communication architecture. Finally, we present and evaluate an industrial case study that demonstrates the performance achievable with SoC devices. Although the experimental evaluation is performed on a specific hardware platform, the methodology and conclusions apply more broadly to heterogeneous SoCs that rely on shared-memory DMA and on Linux-based and/or real-time execution environments.
Cuccagna et al. studied this question.