Fast prototyping of Quantized neural networks on an FPGA edge computing device with Brevitas and FINN

Abstract

In this paper, we propose a solution for fast prototyping of deep learning neural network models on edge computing devices such as FPGAs, aimed at researchers with limited knowledge of hardware description languages like VHDL. We use Xilinx's Brevitas tool for quantization and the FINN framework for deployment and inference on a PYNQ-Z2 board. The paper also surveys presently available methods for FPGA prototyping and shows how Brevitas and FINN can be used for more efficient DNN inference on small-scale edge devices such as FPGAs by leveraging (1) Quantization Aware Training (QAT) and Post Training Quantization (PTQ), (2) network streamlining and transformations, (3) dataflow partitioning of the NN model using the FINN compiler, (4) DMA, FIFO and IP generation for the hardware build, and (5) inference on the FPGA using the PYNQ Python driver. The weights and activations of a custom model were quantized from floating point to 8, 4 and 2 bits, for which accuracy drops of 0.1%, 0.8% and 7.6% were observed, respectively.
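
To give a feel for why accuracy falls as the bit width shrinks, the following is a minimal sketch of symmetric uniform quantization applied to synthetic weights. It is not the Brevitas or FINN implementation: the function names `fake_quantize` and `mean_abs_error`, the per-tensor max-based scale, and the Gaussian test weights are all assumptions made for illustration.

```python
import random


def fake_quantize(values, bit_width):
    """Quantize floats to signed integers of the given width, then map back to floats.

    Assumption: one symmetric scale per tensor, derived from the largest magnitude.
    """
    # Symmetric signed range, e.g. [-127, 127] for 8 bits and [-1, 1] for 2 bits.
    q_max = 2 ** (bit_width - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    quantized = []
    for v in values:
        q = round(v / scale)              # nearest integer level
        q = max(-q_max, min(q_max, q))    # clamp to the representable range
        quantized.append(q * scale)       # dequantize back to a float
    return quantized


def mean_abs_error(reference, approximation):
    """Average absolute difference between two equally long lists."""
    return sum(abs(r - a) for r, a in zip(reference, approximation)) / len(reference)


# Hypothetical stand-in for trained weights: 1000 samples from a unit Gaussian.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

errors = {bits: mean_abs_error(weights, fake_quantize(weights, bits)) for bits in (8, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit mean absolute error: {err:.4f}")
```

At 2 bits only three levels remain, so most weights collapse to zero and the reconstruction error grows sharply, which is consistent with the larger accuracy drop reported for the 2-bit model. Quantization-aware training as provided by Brevitas is meant to reduce this loss by exposing the network to quantization during training.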

Cite This Study

Chawda et al. studied this question.

synapsesocial.com/papers/68e61c93b6db6435875aead1 https://doi.org/10.1109/icufn61752.2024.10625618
