February 20, 2022

A 512Gb In-Memory-Computing 3D-NAND Flash Supporting Similar-Vector-Matching Operations on Edge-AI Devices

Abstract

Similar-vector-matching (SVM) applications for unstructured vectors generated by machine-learning methods, such as face search and audio texturing from a dataset for access control systems, frequently run on edge devices, as depicted in Fig. 7.5.1. The SVM operation [1–3] typically comprises four steps: (1) in the offline phase, the extracted raw vectors (V_RAW) are obtained from machine-learning approaches and stored in non-volatile NAND Flash; (2) in the online phase, a processor requests V_RAW data from edge storage; (3) the entire V_RAW dataset is moved from storage to the processor; (4) the processor scores the similarity between an input query and each candidate V_RAW and provides a best match. However, moving this much data across the memory hierarchy consumes a large amount of energy (E₌₄₌) and results in a long search latency (tₒₑ) for SVM operations. The entire V_RAW dataset also includes a large amount of invalid data. Reducing data movement lowers E₌₄₌ and tₒₑ; this requires edge storage with nonvolatile computing-in-memory (nvCIM) support for similarity computation (vector-vector multiplication (VVM) for cosine similarity), so that the V_RAW dataset is reduced to a small candidate set. However, there are challenges in leveraging 3D NAND for VVM operations: (1) low readout accuracy when a large amount of current is summed across cells that use a wide range of V_T levels (e.g., the 1st to 4th V_T levels of a TLC cell) and (2) the large readout power consumption required to achieve a constant settling time against a wide range of summation currents for the possible data patterns.
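
The scoring step described in the abstract can be illustrated with a minimal software sketch. It assumes small floating-point vectors held in ordinary Python lists; the function names (cosine_similarity, best_match, top_k_candidates) and the sample data are hypothetical and do not come from the paper. The sketch shows only the arithmetic of cosine similarity computed through a vector-vector multiplication, and how a filter could pass on a small candidate set instead of the whole dataset. It does not model current summation, V_T levels, or any other aspect of the 3D-NAND hardware.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity: a dot product (the VVM) divided by both vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen here for zero vectors (an assumption).
    return dot / (norm_a * norm_b)


def best_match(query, stored_vectors):
    """Baseline flow: score every stored vector, return (index, score) of the best."""
    scores = [cosine_similarity(query, v) for v in stored_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def top_k_candidates(query, stored_vectors, k):
    """Hypothetical filter: return the indices of the k highest-scoring vectors."""
    scores = [cosine_similarity(query, v) for v in stored_vectors]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Made-up example data standing in for stored raw vectors and an input query.
stored = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.7]]
query = [0.7, 0.7, 0.1]

index, score = best_match(query, stored)
candidates = top_k_candidates(query, stored, k=2)
print(index, round(score, 3), candidates)
```

In the baseline flow, every stored vector would cross the memory hierarchy before best_match could run on the processor. If a filter like top_k_candidates ran inside the storage device, only the k candidate vectors would need to be transferred, which is the data-movement reduction the abstract motivates.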
