The bandwidth improvement provided by high-bandwidth memory (HBM) and the capability of FPGAs to customize the processing and memory hierarchy result in a considerable performance increase for memory-intensive workloads such as graph processing, sorting, machine learning, and database analytics. Modern systems integrating 3D-stacked DRAM can be leveraged to realize the Near-Memory Computing (NMC) paradigm by offloading some computations to accelerators placed near the HBM. Matrix-vector multiplication (MVM) kernels, which are memory-bound, can benefit significantly from being executed on FPGA-HBM platforms. MVM kernels can be broadly categorized into two types: dense (General Matrix-Vector Multiplication, GEMV) and sparse (Sparse Matrix-Vector Multiplication, SpMV). Recent literature has predominantly focused on optimizing SpMV for FPGA-HBM, leaving a unified solution for both kernel types relatively unexplored. In this work, we introduce an end-to-end framework for compiling MVM kernels for FPGA-HBM acceleration. It consists of a software component and a hardware component. The software component introduces the MATIO compiler, a novel toolflow for detecting MVM and matrix multiplication (MM) kernels in C or C++ code and replacing MVM kernels with a call to our FPGA accelerator. MATIO detects 90% of MVM and MM kernels in real-world benchmarks collected from GitHub. Additionally, it is at least 45x faster than state-of-the-art detection methods. On the hardware side, we introduce VecMADS, a novel FPGA architecture designed to efficiently handle both GEMV and SpMV operations. Our architecture leverages the high memory bandwidth of HBM to overcome memory bottlenecks, providing a comprehensive solution for accelerating matrix-vector multiplication on FPGAs. Evaluation results show that VecMADS delivers 1.5x higher throughput and 4.8x higher energy efficiency than the cuSPARSE library on a GPU.
On dense benchmarks, VecMADS achieves 1.26x higher throughput than the hipBLAS library running on a GPU.
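To make the two kernel classes concrete, the following is a minimal sketch of how GEMV and SpMV are commonly written as plain C loop nests. It is illustrative only: the function names `gemv` and `spmv_csr` are hypothetical, the sparse matrix is assumed to be stored in compressed sparse row (CSR) format, and nothing here is taken from the patterns MATIO actually matches or from the VecMADS interface.

```c
#include <stddef.h>

/* Dense GEMV: y = A * x, with A stored row-major as rows x cols.
 * Hypothetical example kernel, not the accelerator's API. */
void gemv(size_t rows, size_t cols,
          const double *a, const double *x, double *y)
{
    for (size_t i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < cols; ++j)
            acc += a[i * cols + j] * x[j];   /* every element of A is read */
        y[i] = acc;
    }
}

/* Sparse SpMV: y = A * x, with A assumed to be in CSR format.
 * row_ptr has rows + 1 entries; col_idx and val hold the nonzeros. */
void spmv_csr(size_t rows,
              const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            acc += val[k] * x[col_idx[k]];   /* indirect access into x */
        y[i] = acc;
    }
}
```

Both loops perform one multiply-add per matrix element loaded, which is why such kernels tend to be limited by memory bandwidth rather than compute. The sparse variant additionally reads `x` through the index array `col_idx`, so its access pattern is irregular where the dense one is contiguous.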
Iskandar et al. (Mon,) studied this question.