Neuromorphic speech recognition systems that use spiking neural networks (SNNs) and memristors are progressing in hardware development. Conventional manual preprocessing of audio signals is giving way to event-based recognition with convolutional SNNs. Although these systems achieve high classification accuracy, efficiently extracting spatiotemporal features from audio events remains a substantial challenge. In this study, we introduce dynamic time-surface neurons (DTSNs) using volatile memristors featuring an adjustable temporal kernel decay, enabled by series-connected transistors with an Au/LiCoO₂/Au configuration. DTSNs act as feature descriptors, enhancing spatiotemporal feature extraction from event audio data. A fully connected two-layer SNN classifier, incorporating a 1T1R nonvolatile memristor array, is trained to recognize the spatiotemporal features of the audio data. Our findings show classification accuracies of up to 95.91%, substantial improvements in computational efficiency, and increased noise resilience, confirming the promise of our memristor-based speech recognition system for practical applications.
Wu et al. studied this question.
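The time-surface idea behind the DTSNs can be illustrated in software. The sketch below is not the authors' hardware implementation: it assumes an exponential decay kernel, and the function name `time_surface`, the time constant `tau`, and the spike times are hypothetical values chosen for illustration. Each channel's value decays with the time elapsed since its most recent event, and changing `tau` plays the role of the adjustable temporal kernel decay.

```python
import math

def time_surface(last_spike_times, t_now, tau):
    """Return one decayed value per channel (assumed exponential kernel).

    last_spike_times: most recent event time per channel, or None if the
                      channel has not fired yet (hypothetical input format).
    t_now:            time at which the surface is read out.
    tau:              decay time constant; larger tau means slower decay.
    """
    surface = []
    for t_last in last_spike_times:
        if t_last is None or t_last > t_now:
            # No past event on this channel: it contributes nothing.
            surface.append(0.0)
        else:
            surface.append(math.exp(-(t_now - t_last) / tau))
    return surface

# Hypothetical example: four frequency channels, times in milliseconds.
last_spikes = [9.0, 5.0, None, 10.0]
fast = time_surface(last_spikes, t_now=10.0, tau=2.0)   # short memory
slow = time_surface(last_spikes, t_now=10.0, tau=20.0)  # long memory
```

With the short time constant, older events fade quickly and only recent activity stands out; with the long one, the surface retains more temporal context. The resulting vector is the kind of spatiotemporal feature a downstream classifier could consume.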