We present FourierNAT, a novel non-autoregressive transformer (NAT) architecture that uses Fourier-based mixing in the decoder to generate output sequences in parallel. Traditional NAT approaches often struggle to capture global dependencies; our method instead applies a discrete Fourier transform with learned frequency-domain gating to mix token embeddings across the entire sequence dimension. This design propagates context efficiently without explicit autoregressive steps. Empirically, FourierNAT achieves competitive results on the WMT14 En-De and CNN/DailyMail benchmarks, indicating that frequency-domain operations can mitigate the coherence gaps often associated with NAT generation. Our approach underscores the potential of spectral-domain operations to accelerate and improve parallel text generation.
Andrew Kiruluta (Thu,) studied this question.
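The mixing step described in the abstract (a discrete Fourier transform along the sequence dimension, followed by frequency-domain gating) can be illustrated with a minimal NumPy sketch. This is not the FourierNAT implementation: the function name `fourier_mix`, the gate shape (one real multiplier per frequency bin and channel), and the choice to keep only the real part of the inverse transform are all assumptions made for illustration. In a trained model the gate would be learned; here it is set by hand.

```python
import numpy as np


def fourier_mix(x, gate):
    """Mix token embeddings along the sequence axis in the frequency domain.

    x:    real array of shape (seq_len, d_model), one embedding per position
    gate: array of shape (seq_len, d_model); a hypothetical stand-in for the
          learned multiplier applied to each frequency bin and channel
    """
    spectrum = np.fft.fft(x, axis=0)     # DFT over the sequence dimension
    gated = spectrum * gate              # element-wise frequency-domain gating
    mixed = np.fft.ifft(gated, axis=0)   # back to the token domain
    return mixed.real                    # assumption: discard the imaginary part


rng = np.random.default_rng(0)
seq_len, d_model = 8, 4
x = rng.standard_normal((seq_len, d_model))

# An all-ones gate passes every frequency unchanged, so the input is recovered.
identity_out = fourier_mix(x, np.ones((seq_len, d_model)))

# Keeping only the zero-frequency bin gives every position the sequence mean,
# showing that a single gate entry affects all tokens at once.
dc_gate = np.zeros((seq_len, d_model))
dc_gate[0] = 1.0
dc_out = fourier_mix(x, dc_gate)
```

Because each frequency bin depends on every position, a gate on one bin changes all output tokens in a single parallel step, which is the global context propagation the abstract refers to.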