What type of study is this?

This is an experimental study.

October 10, 2025 · Open Access

Self-Speculative Masked Diffusions

Key Points

The method achieves a roughly 2x reduction in network forward passes compared to standard masked diffusion models.
Switching the final transformer attention mask from non-causal to causal yields non-factorized predictions over masked positions in a single forward pass.
A model-integrated speculative sampling mechanism generates draft tokens and validates them in parallel.
The approach is demonstrated on GPT2-scale text modeling and protein sequence generation.
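
To make the attention-mask point concrete, here is a minimal NumPy sketch of the difference between a non-causal (full) mask and a causal (lower-triangular) mask. The function names, the toy sequence length, and the random scores are illustrative assumptions; the paper's actual architecture, and how it orders masked positions, is not reproduced here.

```python
import numpy as np

def non_causal_mask(n: int) -> np.ndarray:
    """Every position may attend to every other position."""
    return np.ones((n, n), dtype=bool)

def causal_mask(n: int) -> np.ndarray:
    """Position i may attend only to positions j <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

def masked_softmax(scores: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Row-wise softmax; entries disallowed by the mask get zero weight."""
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy attention scores for a sequence of 4 positions (hypothetical values).
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))

attn_full = masked_softmax(scores, non_causal_mask(4))
attn_causal = masked_softmax(scores, causal_mask(4))
```

Under the causal mask, every weight above the diagonal is zero, so the output at position i depends only on positions up to i. That ordering is what would allow a prediction for a later position to condition on earlier draft tokens, rather than treating all masked positions independently.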

Abstract

We present self-speculative masked diffusions, a new class of masked diffusion generative models for discrete data that require significantly fewer function evaluations to generate samples. Standard masked diffusion models predict factorized logits over currently masked positions. A number of masked positions are then sampled; however, the factorization approximation means that sampling too many positions in one go leads to poor sample quality. As a result, many simulation steps, and therefore many neural network function evaluations, are required to generate high-quality data. We reduce the computational burden by generating non-factorized predictions over masked positions. This is achieved by modifying the final transformer attention mask from non-causal to causal, enabling draft token generation and parallel validation via a novel, model-integrated speculative sampling mechanism. This results in a non-factorized predictive distribution over masked positions in a single forward pass. We apply our method to GPT2-scale text modelling and protein sequence generation, finding that we can achieve a ~2x reduction in the required number of network forward passes relative to standard masked diffusion models.
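
The draft-then-validate idea can be illustrated with the generic speculative sampling acceptance rule from the language-model literature: a draft token x drawn from a draft distribution q is accepted with probability min(1, p(x)/q(x)), and on rejection a replacement is drawn from the normalized residual max(0, p − q). The sketch below is a toy under stated assumptions, not the paper's model-integrated mechanism: `speculative_step` is a hypothetical helper, and `q` and `p` are fixed arrays standing in for a draft distribution and a target distribution at a single position.

```python
import numpy as np

def speculative_step(draft_probs, target_probs, rng):
    """Return (token, accepted), with token distributed as target_probs.

    Generic speculative sampling rule for one position; the distributions
    here are hypothetical stand-ins, not outputs of a real model.
    """
    q = np.asarray(draft_probs, dtype=float)
    p = np.asarray(target_probs, dtype=float)
    x = rng.choice(len(q), p=q)  # draft token proposed from q
    if rng.random() < min(1.0, p[x] / q[x]):
        return int(x), True  # draft accepted as-is
    # Draft rejected: resample from the normalized residual max(0, p - q).
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(p), p=residual)), False

# Empirical check on a toy 3-token vocabulary (assumed values).
rng = np.random.default_rng(0)
q = np.array([0.5, 0.3, 0.2])  # draft distribution
p = np.array([0.2, 0.3, 0.5])  # target distribution
samples = [speculative_step(q, p, rng) for _ in range(20000)]
tokens = np.array([t for t, _ in samples])
freq = np.bincount(tokens, minlength=3) / len(tokens)
accept_rate = np.mean([a for _, a in samples])
```

In this toy setting the empirical token frequencies should track `p`, and the acceptance rate should be close to the overlap between the two distributions, sum(min(p, q)) = 0.7. The closer the draft is to the target, the more draft tokens survive validation, which is the general reason a good draft can reduce the number of expensive forward passes.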

Authors

Andrew Campbell

Valentin De Bortoli

Jiaxin Shi
