May 30, 2024 · Open Access

In situ neighborhood sampling for large-scale GNN training

Abstract

Graph Neural Network (GNN) training algorithms commonly perform neighborhood sampling to construct fixed-size mini-batches for weight aggregation on GPUs. State-of-the-art disk-based GNN frameworks compute sampling on the CPU, transferring edge partitions from disk to memory for every mini-batch. We argue that this design wastes significant PCIe bandwidth, as entire neighborhoods are transferred to main memory only to be discarded after sampling. In this paper, we take the first step towards an inherently different approach that harnesses near-storage compute technology to achieve efficient large-scale GNN training. We target a single machine with one or more SmartSSD devices and develop a high-throughput, epoch-wide sampling FPGA kernel that enables pipelining across epochs. When compared to a baseline random-access sampling kernel, our solution achieves up to 4.26× lower sampling time per epoch.
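
To make the bandwidth argument concrete, the following is a minimal host-side sketch of fixed-fanout neighborhood sampling over a CSR adjacency list, together with a rough estimate of how much of the data read is thrown away. It is an illustration only, not the paper's FPGA kernel: the names `sample_neighbors`, `transfer_waste`, `indptr`, `indices`, and `fanout` are assumptions introduced here, and the estimate counts only the neighbor lists of the seed vertices, whereas reading whole edge partitions would move even more data.

```python
import random


def sample_neighbors(indptr, indices, seeds, fanout, rng):
    """Sample up to `fanout` neighbors per seed vertex from a CSR adjacency.

    indptr[v]:indptr[v + 1] delimits the neighbor list of vertex v in `indices`.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    sampled = {}
    for v in seeds:
        neighborhood = indices[indptr[v]:indptr[v + 1]]
        if len(neighborhood) <= fanout:
            # Small neighborhoods are kept whole.
            sampled[v] = list(neighborhood)
        else:
            # Larger ones are sampled uniformly without replacement.
            sampled[v] = rng.sample(neighborhood, fanout)
    return sampled


def transfer_waste(indptr, seeds, fanout):
    """Fraction of neighbor entries read for `seeds` that sampling discards."""
    read = sum(indptr[v + 1] - indptr[v] for v in seeds)
    kept = sum(min(fanout, indptr[v + 1] - indptr[v]) for v in seeds)
    return 0.0 if read == 0 else 1.0 - kept / read
```

With a fanout much smaller than the typical vertex degree, most of what `transfer_waste` counts as read is never used, which is the inefficiency that sampling next to the storage device is meant to avoid.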

Cite this study

Song et al. (2024) studied this question.

www.synapsesocial.com/papers/68e67a8cb6db643587604495 — DOI: https://doi.org/10.1145/3662010.3663443

Authors

Yuhang Song

Po Hao Chen

Yuchen Lu

Institutions

John Brown University

Also consider

Synapse has enriched 5 closely related papers on similar research questions. Consider them for comparative context:

SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures · 2022 · 2 citations
Advances in neural information processing systems 7 · 1996 · 14,389 citations
Large-Scale Learnable Graph Convolutional Networks · 2018 · 457 citations
Fundamentals of Brain Network Analysis · 2016 · 903 citations
One trillion edges · 2015 · 404 citations
