Hardware architecture design for deep neural networks (DNNs) has improved significantly in recent years. However, the hardware implementation of the widely used softmax function, which involves expensive division and exponentiation units, has received little attention. This paper presents an efficient hardware implementation of the softmax function. Mathematical transformations and linear fitting are used to simplify the function. Multiple algorithmic strength reduction strategies and fast addition methods are employed to optimize the architecture. With these techniques, complicated logic units such as multipliers are eliminated and memory consumption is greatly reduced, while the accuracy loss is negligible. The proposed design is coded in a hardware description language (HDL) and synthesized in TSMC 28-nm CMOS technology. Synthesis results show that the architecture achieves a throughput of 6.976 G/s for 8-bit input data. It achieves a power efficiency of 463.04 Gb/(mm²·mW) and occupies only 0.015 mm² of area. To the best of our knowledge, this is the first work on efficient hardware implementation of softmax in the open literature.
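As a rough illustration of how mathematical transformations and linear fitting can remove the divider and the exponentiation unit, the sketch below rewrites softmax in base 2, replaces the division with a subtraction in the log domain, and approximates 2^v and log2(m) with straight lines. This is a floating-point software model under assumed simplifications, not the architecture described in the paper: the function names (`pow2_linear`, `log2_linear`, `softmax_approx`) and the particular fits (1 + v and m - 1) are hypothetical choices made for this example.

```python
import math

# Constant that converts e^x into 2^(x * log2 e). In fixed-point hardware a
# constant like this could be applied with shifts and adds (assumption).
LOG2_E = math.log2(math.e)

def pow2_linear(t):
    """Approximate 2**t by splitting t into integer and fractional parts."""
    u = math.floor(t)           # integer part: a plain shift in hardware
    v = t - u                   # fractional part, in [0, 1)
    return (1.0 + v) * 2.0 ** u # linear fit 2**v ~ 1 + v (assumed fit)

def log2_linear(s):
    """Approximate log2(s) for s > 0 using log2(m) ~ m - 1 on [1, 2)."""
    m, e = math.frexp(s)        # s = m * 2**e with m in [0.5, 1)
    return (2.0 * m - 1.0) + (e - 1)

def softmax_approx(xs):
    """Division-free softmax model: exp(x_i - max) / sum becomes 2**(t_i - log2(sum))."""
    x_max = max(xs)
    t = [(x - x_max) * LOG2_E for x in xs]        # base-2 exponents, all <= 0
    total = sum(pow2_linear(ti) for ti in t)      # always >= 1
    log_total = log2_linear(total)
    return [pow2_linear(ti - log_total) for ti in t]

def softmax_exact(xs):
    """Reference softmax used only to compare against the approximation."""
    x_max = max(xs)
    exps = [math.exp(x - x_max) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]
    print("approx:", softmax_approx(scores))
    print("exact: ", softmax_exact(scores))
```

Because `pow2_linear` is monotonic, the ranking of the outputs matches the exact softmax even though the individual values carry a few percent of error; an actual design would use fixed-point arithmetic and a fit tuned to its accuracy target.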
Meiqi Wang
Sun Yat-sen University
Siyuan Lu
Nanjing University
Danyang Zhu
Northwestern University
Nanjing University
Wang et al. studied this question.
synapsesocial.com/papers/6a15953e814bf8ec9a4ecd7d — DOI: https://doi.org/10.1109/apccas.2018.8605654