December 10, 2025 | Open Access

ASFNOformer—A Superior Frequency Domain Token Mixer in Spiking Transformer

Key Points

This work aims to enhance the integration of frequency analysis in spiking neural networks for improved computational efficiency.
Proposed the Adaptive Spiking Fourier Neural Operator Transformer (ASFNOformer) architecture for SNNs.
Applied the Fast Fourier Transform (FFT) across spatial and temporal dimensions.
Incorporated a Multi-Layer Perceptron (MLP) with a block-diagonal weight matrix.
Optimized Leaky Integrate-and-Fire (LIF) neurons with Learnable Weight Parameters (LWP-LIF).
Reduced parameter count by approximately 25%.
Achieved comparable accuracy to mainstream models on static datasets.
Showcased superior performance on neuromorphic datasets by effectively capturing frequency features.
Confirmed model generalizability through ablation studies.
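
For readers unfamiliar with Leaky Integrate-and-Fire neurons, the following is a minimal sketch of the textbook discrete-time LIF update with a hard reset. It is not the paper's LWP-LIF variant: this summary does not state which weights are made learnable, so the `leak`, `threshold`, and `v_reset` arguments are illustrative assumptions, exposed only to show the kind of quantity that could in principle be trained.

```python
def lif_forward(inputs, leak=0.5, threshold=1.0, v_reset=0.0):
    """Run one LIF neuron over a sequence of input currents.

    Returns a list of binary spikes (0 or 1), one per time step.
    All parameter names and defaults are assumptions of this sketch.
    """
    v = v_reset
    spikes = []
    for current in inputs:
        # Leaky integration: the membrane potential decays, then adds the input.
        v = leak * v + current
        if v >= threshold:
            spikes.append(1)
            v = v_reset  # hard reset after firing
        else:
            spikes.append(0)
    return spikes


spikes = lif_forward([0.6, 0.6, 0.0, 1.2, 0.3])
print(spikes)  # [0, 0, 0, 1, 0]
```

With the default leak of 0.5, the two 0.6 inputs decay before they can accumulate past the threshold, so only the single large input at step four produces a spike.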

Abstract

As the third generation of neural networks, Spiking Neural Networks (SNNs) simulate the event-driven processing of the brain, offering better energy efficiency and biological interpretability than traditional deep learning. Combining the architectural strengths of Transformers with SNNs has recently demonstrated high accuracy and significant potential. SNNs process binary spikes and rich temporal information, which lowers computational complexity and makes them particularly suitable for neuromorphic datasets. However, neuromorphic data typically involve dynamic edges and high-frequency pixel intensity changes. Capturing this frequency information is challenging for traditional spatial methods but is critical for event-driven vision. To address this, we investigate the integration of the Fast Fourier Transform (FFT) into SNNs and propose the Adaptive Spiking Fourier Neural Operator Transformer (ASFNOformer). This architecture adapts the Adaptive Fourier Neural Operator (AFNO), originally validated in Artificial Neural Networks (ANNs), to the spiking domain. Unlike standard AFNOs, our module applies the FFT across both the spatial (H, W) and temporal (T) dimensions, followed by a Multi-Layer Perceptron (MLP) with a block-diagonal weight matrix. This design captures both the spatial features and the temporal dynamics inherent in event streams. Furthermore, we incorporate Leaky Integrate-and-Fire (LIF) neurons optimized with Learnable Weight Parameters (LWP-LIF) to enhance temporal feature extraction and adaptivity. Experimental results on standard benchmarks indicate that our method reduces the parameter count by approximately 25%. In terms of recognition accuracy, ASFNOformer is comparable to mainstream models on static datasets and demonstrates superior performance on neuromorphic datasets by efficiently capturing frequency features. Notably, ablation studies confirm the model’s generalizability, and when using QKformer as a baseline, our method achieves state-of-the-art (SOTA) performance on the CIFAR10-DVS dataset. This work advances frequency-domain analysis in SNNs, paving the way for efficient deployment on neuromorphic hardware.
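
The frequency-domain mixing described above (an FFT over T, H, and W, then an MLP with block-diagonal weights, then an inverse FFT) can be sketched with NumPy as follows. This is an illustrative AFNO-style sketch under stated assumptions, not the authors' implementation: the function names, tensor layout `(T, H, W, C)`, complex-valued weights, the split ReLU on real and imaginary parts, and taking the real part of the output are all choices made here for illustration, and the spiking neurons are omitted.

```python
import numpy as np


def block_diag_mlp(z, w1, w2):
    """Two-layer MLP on the channel axis with block-diagonal weights.

    z:  complex array, shape (..., num_blocks, block_size)
    w1: complex array, shape (num_blocks, block_size, hidden)
    w2: complex array, shape (num_blocks, hidden, block_size)
    Each block of channels is mixed only with itself.
    """
    h = np.einsum("...bi,bio->...bo", z, w1)
    # Assumption of this sketch: ReLU applied to real and imaginary parts separately.
    h = np.maximum(h.real, 0.0) + 1j * np.maximum(h.imag, 0.0)
    return np.einsum("...bi,bio->...bo", h, w2)


def fourier_token_mixer(x, w1, w2):
    """Mix tokens of a real array x with shape (T, H, W, C) in the frequency domain."""
    T, H, W, C = x.shape
    num_blocks, block_size, _ = w1.shape
    assert num_blocks * block_size == C, "channels must split evenly into blocks"
    # FFT over the temporal (T) and spatial (H, W) axes; channels are untouched.
    X = np.fft.fftn(x, axes=(0, 1, 2))
    X = X.reshape(T, H, W, num_blocks, block_size)
    Y = block_diag_mlp(X, w1, w2).reshape(T, H, W, C)
    # Back to the original domain; keeping only the real part is a simplification.
    return np.fft.ifftn(Y, axes=(0, 1, 2)).real


rng = np.random.default_rng(0)
T, H, W, C, num_blocks = 4, 8, 8, 16, 4
bs = C // num_blocks
# Binary, spike-like input (hypothetical data for the demo).
x = (rng.random((T, H, W, C)) < 0.2).astype(np.float64)
w1 = 0.1 * (rng.standard_normal((num_blocks, bs, bs))
            + 1j * rng.standard_normal((num_blocks, bs, bs)))
w2 = 0.1 * (rng.standard_normal((num_blocks, bs, bs))
            + 1j * rng.standard_normal((num_blocks, bs, bs)))
out = fourier_token_mixer(x, w1, w2)
print(out.shape)  # (4, 8, 8, 16)
```

In this sketch each block-diagonal layer stores `num_blocks * block_size * hidden` weights rather than the `C * hidden` weights of a dense layer over all channels, which is the general reason block-diagonal weights are cheaper. Whether this accounts for the parameter reduction reported above is not stated in this summary.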
