Echo state networks (ESNs) are a variant of reservoir computing (RC) that offer accuracy comparable to standard recurrent neural networks at lower training cost, lower computational effort, and higher inference speed. These characteristics make ESNs highly suitable for resource-constrained edge implementations. In this article, we present an FPGA-based streamlined dataflow architecture for ESN inference. The accelerator quantizes all layers of the ESN and follows a direct logic implementation style that fully unrolls all computations. We introduce two variants of the accelerator: one that maps neurons to DSP blocks in the FPGA and one that maps neurons solely to LUTs. We further describe a tool flow that sets up, optimizes, and trains an ESN model for a given dataset and then automatically generates the accelerator designs to be loaded onto an FPGA. We evaluate our accelerators on a number of time-series prediction and classification tasks and compare the errors and accuracies, respectively, of a 32-bit floating-point software baseline and of our accelerators at different levels of quantization. We then compare our accelerators with prior work on FPGA-based ESN implementations and with embedded GPU and CPU platforms. Our experiments show that our accelerators are resource-intensive but excel in latency, throughput, and energy efficiency. For an ESN model with 200 reservoir neurons, we achieve a latency of \(9.5\,\mathrm{ns}\), a throughput of \(100\) megasamples per second, and a power-delay product of \(72\,\mathrm{nWs}\). This outperforms all previous FPGA work and, compared to embedded GPU and CPU platforms, represents improvements of several orders of magnitude.
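To make the computation concrete, the following is a minimal software sketch of ESN inference with quantized weights. It is not the accelerator design or the tool flow described above. It assumes the textbook ESN update x(t+1) = tanh(W_in u(t) + W_res x(t)) with a linear readout y(t) = W_out x(t), and it uses a simple uniform "fake quantization" of weights onto a signed grid. The function names (`fake_quantize`, `make_esn`, `esn_step`, `readout`), the weight ranges, and the bit widths are illustrative assumptions and are not taken from the article.

```python
import math
import random

def fake_quantize(w, bits):
    """Clamp w to [-1, 1] and snap it onto a signed uniform grid (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1
    w = max(-1.0, min(1.0, w))
    return round(w * levels) / levels

def make_esn(n_in, n_res, bits, seed=0):
    """Draw fixed random input and reservoir weights, then quantize them.

    The weight ranges are arbitrary illustrative choices; a real model would
    tune them (e.g. via the spectral radius of the reservoir matrix).
    """
    rng = random.Random(seed)
    w_in = [[fake_quantize(rng.uniform(-1.0, 1.0), bits) for _ in range(n_in)]
            for _ in range(n_res)]
    w_res = [[fake_quantize(rng.uniform(-0.5, 0.5), bits) for _ in range(n_res)]
             for _ in range(n_res)]
    return w_in, w_res

def esn_step(w_in, w_res, x, u):
    """One reservoir update: x_new = tanh(W_in u + W_res x)."""
    new_x = []
    for i in range(len(x)):
        pre = sum(w * v for w, v in zip(w_in[i], u))
        pre += sum(w * v for w, v in zip(w_res[i], x))
        new_x.append(math.tanh(pre))
    return new_x

def readout(w_out, x):
    """Linear readout y = W_out x; in an ESN only W_out is trained."""
    return [sum(w * v for w, v in zip(row, x)) for row in w_out]

# Usage: run a short input sequence through a small 4-bit reservoir.
w_in, w_res = make_esn(n_in=1, n_res=8, bits=4)
state = [0.0] * 8
for sample in [0.1, 0.5, -0.3]:
    state = esn_step(w_in, w_res, state, [sample])
w_out = [[fake_quantize(0.25, 4)] * 8]  # placeholder readout weights, not trained
prediction = readout(w_out, state)
```

Because the input and reservoir weights stay fixed and only the readout is trained, every multiply-accumulate in `esn_step` involves a constant known at design time, which is the kind of structure that a fully unrolled hardware implementation can exploit.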
Jafari et al. studied this question.