• DFX: a low-latency multi-FPGA appliance for accelerating transformer-based text generation
  – DFX is a multi-FPGA appliance that accelerates transformer-based text generation
  – DFX adopts model parallelism to efficiently process large-scale language models
  – The Xilinx Alveo U280 data center accelerator card provides high performance at low cost
  – FPGA-to-FPGA communication is enabled by QSFP cables at 100 Gb/s
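To make the model-parallelism point concrete, here is a minimal sketch of one common form of it: splitting a layer's weight matrix column-wise so that each device computes a slice of the output, followed by a gather step over the device-to-device links. The partitioning scheme, the function names (`matvec`, `split_columns`, `parallel_matvec`), and the `num_devices` parameter are illustrative assumptions; the summary above does not say how DFX actually partitions the model.

```python
# Illustrative sketch only: column-wise model parallelism for one linear layer.
# The partitioning scheme and all names here are assumptions, not taken from DFX.

def matvec(x, w):
    """Compute y = x @ w for a vector x (length d_in) and a matrix w (d_in x d_out)."""
    d_out = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(d_out)]


def split_columns(w, num_devices):
    """Split w column-wise into num_devices contiguous shards of equal width."""
    d_out = len(w[0])
    if d_out % num_devices != 0:
        raise ValueError("d_out must be divisible by num_devices")
    width = d_out // num_devices
    return [[row[k * width:(k + 1) * width] for row in w]
            for k in range(num_devices)]


def parallel_matvec(x, w, num_devices):
    """Simulate each device computing its output slice, then gathering the slices."""
    shards = split_columns(w, num_devices)
    # Each hypothetical device holds one shard and needs only the shared input x.
    partials = [matvec(x, shard) for shard in shards]
    # Gather step: concatenate the per-device slices in device order.
    return [value for part in partials for value in part]


x = [1, 2]
w = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
y = parallel_matvec(x, w, num_devices=2)
```

In this scheme each device stores only `1 / num_devices` of the weights, and the only data exchanged between devices is the gathered output vector, which is why link bandwidth between cards matters for latency.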
Hong et al. studied this question.