Video action recognition using 3D Convolutional Neural Networks (CNNs) has become an increasingly popular strategy in recent years with the evolution of machine learning and computer vision. However, the high memory and computation requirements of these networks motivate the use of low-power, memory-saving neural networks to perform video action recognition efficiently. The spike-based information processing of bio-inspired spiking convolutional neural networks plays an essential role in energy-efficient, memory-saving computation for video classification and action recognition, and allows on-chip real-time processing. This paper proposes a novel 3D Convolutional Spiking Neural Network (CSNN) architecture with modulating spike-timing-dependent plasticity (STDP) supervised learning via global error feedback for human action recognition in video data. The proposed model includes two 3D convolutional layers, followed by two spiking neuron layers modeled using Leaky Integrate and Fire (LIF) neurons, for feature extraction from video data. Using the modulating STDP learning rule with global error feedback, the model can recognize human actions from video data while allowing online parallel computation. The proposed network was evaluated on two datasets, one 3D image dataset (synthesized 3D MNIST) and one video dataset (the UCF 101 human action recognition dataset), and achieved 71.6% and 63.7% recognition accuracy, respectively.
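The Leaky Integrate and Fire (LIF) neuron named in the abstract can be illustrated with a minimal discrete-time sketch in Python. This is not the paper's implementation: the function name `simulate_lif`, the forward-Euler update, and all parameter values (`tau`, `v_thresh`, `v_reset`, `dt`) are assumptions chosen only to show the leak, integrate, and fire-then-reset behaviour of a single neuron.

```python
def simulate_lif(input_current, tau=10.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate one LIF neuron with forward-Euler updates.

    All names and default values here are hypothetical, for illustration only.
    Returns (membrane potential trace, list of time steps at which it spiked).
    """
    v = v_reset
    trace = []
    spikes = []
    for t, i_in in enumerate(input_current):
        # Leak toward the resting potential (0) and integrate the input.
        v += dt * (-v / tau + i_in)
        if v >= v_thresh:
            # Threshold crossed: record a spike and reset the potential.
            spikes.append(t)
            v = v_reset
        trace.append(v)
    return trace, spikes


# Usage: drive the neuron with a constant input for 20 time steps.
trace, spikes = simulate_lif([0.3] * 20)
print("spike times:", spikes)
```

With a constant input the neuron fires periodically; a weaker input lengthens the interval between spikes, and an input small enough that the leak balances it before threshold produces no spikes at all.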
Nawarathne et al. (Mon,) studied this question.