April 29, 2024

Credible Diffusion: Improving Diffusion Models Interpretability with Transformer Embeddings

Abstract

Diffusion models have recently emerged as an innovative topic in computer vision, providing outstanding results in generative modeling. This paper introduces a novel approach to enhancing the interpretability and accountability of diffusion models in generative image tasks. By integrating a transformer-based encoder-decoder mechanism, we propose a methodology that employs deterministic degradation operators, derived from dataset labels or associated textual content, as an alternative to traditional random Gaussian noise. This method enables precise attribution of the generated images to their sources within the training data. Through extensive experiments on a subset of the Fashion-MNIST dataset, we demonstrate the model's capability to perfectly reconstruct the textual citations while achieving close approximation in image reconstruction. Despite the observed limitations in diversity, our findings indicate a significant potential for controlled image synthesis based on textual descriptions. This work lays the foundation for advancing the interpretability of generative AI models, paving the way for more transparent and accountable generative applications.
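
The abstract does not specify the form of the deterministic degradation operators, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a linear blend from the image toward a fixed pattern that depends solely on the label text, in place of sampled Gaussian noise. The names `label_pattern` and `degrade`, the hash-based seeding, and the linear schedule are all hypothetical choices made for illustration.

```python
import hashlib
import random


def label_pattern(label: str, n_pixels: int) -> list[float]:
    """Return a fixed pattern in [0, 1) that depends only on the label text.

    Assumption: the paper derives its operators from labels or text in some
    way; hashing the label to seed a PRNG is just one illustrative option.
    """
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)  # seeded, so the same label gives the same pattern
    return [rng.random() for _ in range(n_pixels)]


def degrade(image: list[float], label: str, t: int, num_steps: int) -> list[float]:
    """Deterministically degrade a flattened image toward its label pattern.

    t = 0 returns the image unchanged; t = num_steps returns the pattern.
    Assumption: a linear schedule; the actual schedule is not given.
    """
    if not 0 <= t <= num_steps:
        raise ValueError("t must lie in [0, num_steps]")
    alpha = t / num_steps
    pattern = label_pattern(label, len(image))
    return [(1.0 - alpha) * x + alpha * p for x, p in zip(image, pattern)]


# Usage: a flat 28x28 grayscale image, the size used by Fashion-MNIST.
image = [0.5] * (28 * 28)
halfway = degrade(image, "sneaker", t=5, num_steps=10)
```

Because no randomness is sampled at degradation time, the fully degraded state is a function of the label alone, which is what would let a generated image be traced back to the label or text that produced its starting point.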
