What does this research mean for the field?

The HiCache hierarchical timestep-aware caching framework significantly accelerates diffusion transformers and avoids severe performance degradation at cache ratios exceeding 50%, outperforming existing learning-based caching methods. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

The study set out to improve caching strategies for diffusion transformers so that image generation can be accelerated further.

April 3, 2026 | Open Access

HiCache: hierarchical timestep-aware caching for diffusion transformer acceleration

Key Points

Aim: improve caching strategies for diffusion transformers to accelerate image generation.
Proposed HiCache, a unified hierarchical timestep-aware caching framework.
Introduced timestep block cascade learning (TBCL), which partitions the diffusion time steps hierarchically into coarse-grained parent blocks and fine-grained child blocks.
Developed semantic-guided cache loss (SGCL), a semantic-aware dynamic gating mechanism that keeps training and inference consistent and adds minimal inference overhead.
At the same inference speed, HiCache significantly outperformed advanced diffusion samplers and previous learning-based caching methods.
At cache ratios above 50%, HiCache avoided the severe performance degradation observed in previous methods.

Abstract

Diffusion transformers have achieved remarkable success in image generation but incur high sampling costs. Existing caching strategies typically fix the cache ratio at a low and limited value, failing to fully exploit the acceleration potential. To address this limitation, we propose HiCache, a unified hierarchical timestep-aware caching framework. We first propose timestep block cascade learning (TBCL), which partitions the diffusion time steps hierarchically into coarse-grained parent blocks and fine-grained child blocks. This hierarchical strategy allows cache constraints to be inherited across blocks, significantly increasing the cache ratio. Based on this, we propose semantic-guided cache loss (SGCL), a semantic-aware dynamic gating mechanism. This design maintains consistency between the training and inference processes and introduces minimal additional computational overhead during inference. Experimental results demonstrate that HiCache significantly outperforms advanced diffusion samplers and previous learning-based caching methods at the same inference speed. Moreover, when using cache ratios that exceed 50%, HiCache avoids the severe performance degradation observed in previous methods.
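
The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names: splitting the diffusion time steps into parent blocks with nested child blocks, and measuring a cache ratio. The function names (partition_timesteps, cache_ratio), the even-split rule, and the definition of cache ratio as the fraction of (time step, layer) computations served from cache are all assumptions made for illustration, not the paper's method.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # half-open range [start, stop) of time step indices


def _split(start: int, stop: int, parts: int) -> List[Block]:
    """Evenly split [start, stop) into `parts` contiguous blocks."""
    size = stop - start
    bounds = [start + (size * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


def partition_timesteps(
    num_steps: int, num_parents: int, children_per_parent: int
) -> Dict[Block, List[Block]]:
    """Hypothetical two-level partition: coarse parent blocks, each
    subdivided into fine child blocks (even split is an assumption)."""
    return {
        parent: _split(parent[0], parent[1], children_per_parent)
        for parent in _split(0, num_steps, num_parents)
    }


def cache_ratio(gates: List[List[bool]]) -> float:
    """Assumed definition: fraction of (time step, layer) computations
    that reuse a cached output. gates[t][l] is True when layer l is
    served from cache at time step t."""
    total = sum(len(row) for row in gates)
    cached = sum(sum(row) for row in gates)
    return cached / total


if __name__ == "__main__":
    # 20 sampling steps, 2 parent blocks, 2 child blocks per parent.
    for parent, children in partition_timesteps(20, 2, 2).items():
        print(parent, children)
    # Two time steps, two layers: three of four computations cached.
    print(cache_ratio([[True, False], [True, True]]))
```

In this sketch a child block lies entirely inside its parent's range, which is one simple way a constraint set at the parent level could be passed down to its children; how HiCache actually inherits cache constraints across blocks is described in the full paper, not here.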
