High-resolution pixel processing (PP) tasks such as demosaicing, denoising, and super-resolution benefit strongly from Convolutional Neural Network (CNN) approaches, yet they pose different architectural challenges than typical classification CNNs, which prevents real-time execution on existing state-of-the-art CNN processors. The 12nm DepFiN processor is the first processor optimized for a wide range of pixel processing CNNs. It innovates 1) at system level, by shifting the memory/IO trade-off through line-based extreme layer fusion (depth-first CNN computation); 2) at architecture level, by achieving near 100% utilization at 3.8 TOPS, even on traditionally challenging depthwise (DW)-pointwise (PW) and ShiftNet layers, through a dataflow optimized for high-resolution PP; 3) at gate level, by drastically reducing switching activity through improved register file (REGF) and Multiply-Accumulate (MAC) interfacing. DepFiN brings high-resolution pixel processing to mobile platforms by achieving up to 3.8 TOPS and 20 TOPS/W, e.g. enabling 2x super-resolution with the FSRCNN network while limiting off-chip IO to 3.2M features/inference.
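The idea behind line-based layer fusion can be illustrated with a minimal sketch. It is not DepFiN's actual dataflow: it assumes a toy single-channel network of two 3x3 "valid" convolutions, and the function names (`conv_row`, `layer_by_layer`, `depth_first`) are hypothetical. It only shows that streaming the input one line at a time, while keeping a three-row window of each feature map, yields the same output as materializing the full intermediate feature map.

```python
from collections import deque


def conv_row(rows, kernel):
    """Valid 3x3 convolution producing ONE output row from three input rows."""
    width = len(rows[0])
    return [
        sum(kernel[dy][dx] * rows[dy][x + dx] for dy in range(3) for dx in range(3))
        for x in range(width - 2)
    ]


def layer_by_layer(image, k1, k2):
    """Baseline: compute and store the whole intermediate feature map first."""
    mid = [conv_row(image[y:y + 3], k1) for y in range(len(image) - 2)]
    out = [conv_row(mid[y:y + 3], k2) for y in range(len(mid) - 2)]
    return out, len(mid)  # second value: intermediate rows held at once


def depth_first(image, k1, k2):
    """Line-based fusion: hold only a 3-row window of each feature map."""
    in_buf = deque(maxlen=3)   # last three input lines
    mid_buf = deque(maxlen=3)  # last three intermediate lines
    out = []
    for row in image:  # stream the input one line at a time
        in_buf.append(row)
        if len(in_buf) < 3:
            continue  # not enough lines yet for a 3x3 window
        mid_buf.append(conv_row(list(in_buf), k1))
        if len(mid_buf) == 3:
            out.append(conv_row(list(mid_buf), k2))
    return out, mid_buf.maxlen  # second value: intermediate rows held at once


if __name__ == "__main__":
    # Illustrative 8x8 integer image and kernels (assumed values).
    image = [[(3 * y + 5 * x) % 7 for x in range(8)] for y in range(8)]
    k1 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    k2 = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
    ref, ref_rows = layer_by_layer(image, k1, k2)
    fused, fused_rows = depth_first(image, k1, k2)
    print(ref == fused, ref_rows, fused_rows)
```

In this toy setting the intermediate storage of the fused version stays at three lines regardless of image height, whereas the baseline grows with the number of rows. This is the kind of memory/IO trade-off that depth-first computation shifts.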
Goetschalckx et al. studied this question.