What question did this study set out to answer?

This research aims to improve the interpretability of CLIP through a gradient-guided attribution method.

May 6, 2026 | Open Access

Class-disentangled attribution with gradient guidance for CLIP

Key Points

The study develops a gradient-guided, class-aware semantic disentanglement attribution method for CLIP.
The method explicitly disentangles the class-related semantics aggregated in the global token during attribution.
A gradient guidance strategy balances channel-level importance and steers spatial attribution toward regions that are discriminative for the target semantics.
The method produces more stable and faithful visual explanations.
Experiments on ImageNet and MSCOCO 2014 show improved explanation fidelity and reliability over existing methods.

Abstract

Large-scale vision–language pre-trained models such as CLIP play a central role in modern multimodal artificial intelligence. However, their cross-modal decision process remains difficult to interpret, which limits reliable deployment in practical applications. Existing explanation methods for CLIP often exhibit semantic entanglement and inaccurate spatial localization in cross-modal attribution. This paper presents a gradient-guided class-aware semantic disentanglement attribution method for CLIP. The proposed method explicitly disentangles class-related semantics aggregated in the global token during attribution. This design suppresses irrelevant semantic interference and produces visual explanations with improved semantic consistency, clearer structural organization, and more accurate spatial localization. We further introduce a novel gradient guidance strategy that balances importance assignment at the channel level and guides spatial attribution toward regions that are discriminative for the target semantics. As a result, the proposed approach generates more stable and faithful visual explanations. Extensive qualitative and quantitative experiments on ImageNet and MSCOCO 2014 demonstrate that the proposed method consistently outperforms existing approaches in explanation fidelity and reliability.
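The abstract does not specify the algorithm itself, so the following is not the paper's method. It is a generic Grad-CAM-style sketch of what gradient-based channel weighting can look like for an image-text similarity score, under simplifying assumptions: patch features are mean-pooled as a stand-in for the global token, the score is a cosine similarity, and gradients are approximated by finite differences. All function names (`cosine_score`, `numeric_grad`, `grad_cam_map`) and the toy data are hypothetical.

```python
import math


def cosine_score(patches, text):
    """Cosine similarity between mean-pooled patch features and a text embedding.

    Mean pooling is an assumed stand-in for CLIP's global token.
    """
    n, dim = len(patches), len(text)
    pooled = [sum(p[c] for p in patches) / n for c in range(dim)]
    dot = sum(a * b for a, b in zip(pooled, text))
    norm = math.sqrt(sum(a * a for a in pooled)) * math.sqrt(sum(b * b for b in text))
    return dot / norm


def numeric_grad(patches, text, eps=1e-5):
    """Central finite-difference gradient of the score w.r.t. every patch feature."""
    grads = []
    for i, patch in enumerate(patches):
        row = []
        for c in range(len(patch)):
            plus = [list(p) for p in patches]
            minus = [list(p) for p in patches]
            plus[i][c] += eps
            minus[i][c] -= eps
            diff = cosine_score(plus, text) - cosine_score(minus, text)
            row.append(diff / (2 * eps))
        grads.append(row)
    return grads


def grad_cam_map(patches, text):
    """Grad-CAM-style attribution: one non-negative value per patch."""
    grads = numeric_grad(patches, text)
    n, dim = len(patches), len(text)
    # Channel importance: gradient averaged over all spatial positions.
    weights = [sum(g[c] for g in grads) / n for c in range(dim)]
    # Spatial map: ReLU of the channel-weighted sum of each patch's features.
    return [max(0.0, sum(w * f for w, f in zip(weights, p))) for p in patches]


# Toy data (hypothetical): 4 patches with 3 channels each.
# Patches 0 and 2 align with the text direction, patches 1 and 3 do not.
patches = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1], [0.9, 0.1, 0.0], [0.0, 0.8, 0.3]]
text = [1.0, 0.0, 0.0]
heat = grad_cam_map(patches, text)
```

In this toy setup, the patches aligned with the text embedding receive positive attribution and the others are clipped to zero. A real CLIP pipeline would obtain the gradients by automatic differentiation through the image encoder rather than by finite differences.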
