Graph neural networks (GNNs) have shown excellent performance in a wide range of applications such as recommendation, risk control, and drug discovery. As the volume of graph data grows, distributed GNN systems have become essential for efficient GNN training. However, existing distributed GNN training systems suffer from several performance issues, including high network communication cost, low CPU utilization, and poor end-to-end performance. In this paper, we propose ByteGNN, which addresses the limitations of existing distributed GNN systems with three key designs: (1) an abstraction of mini-batch graph sampling to support high parallelism, (2) a two-level scheduling strategy to improve resource utilization and reduce the end-to-end GNN training time, and (3) a graph partitioning algorithm tailored for GNN workloads. Our experiments show that ByteGNN outperforms state-of-the-art distributed GNN systems with up to 3.5–23.8 times faster end-to-end execution, 2–6 times higher CPU utilization, and around half the network communication cost.
Chenguang Zheng
Chinese University of Hong Kong
Hongzhi Chen
Northeastern University
Yuxuan Cheng
Northwestern Polytechnical University
Proceedings of the VLDB Endowment
Peking University
Chinese University of Hong Kong
Zheng et al. studied this question.
synapsesocial.com/papers/69de8cce57c7c8340a558aea — DOI: https://doi.org/10.14778/3514061.3514069