June 1, 2022 | Open Access

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation

Abstract

This paper proposes a new transformer-based framework to learn class-specific object localization maps as pseudo labels for weakly supervised semantic segmentation (WSSS). Inspired by the fact that the attended regions of the one-class token in the standard vision transformer can be leveraged to form a class-agnostic localization map, we investigate whether the transformer model can also effectively capture class-specific attention for more discriminative object localization by learning multiple class tokens within the transformer. To this end, we propose a Multi-class Token Transformer, termed MCTformer, which uses multiple class tokens to learn interactions between the class tokens and the patch tokens. The proposed MCTformer can successfully produce class-discriminative object localization maps from the class-to-patch attentions corresponding to different class tokens. We also propose to use a patch-level pairwise affinity, which is extracted from the patch-to-patch transformer attention, to further refine the localization maps. Moreover, the proposed framework is shown to fully complement the Class Activation Mapping (CAM) method, leading to remarkably superior WSSS results on the PASCAL VOC and MS COCO datasets. These results underline the importance of the class token for WSSS. Code: https://github.com/xulianuwa/MCTformer
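The two attention blocks named in the abstract, class-to-patch and patch-to-patch, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single attention head in a single layer, a token sequence ordered as class tokens followed by patch tokens, and a refinement step that simply propagates each class map through the patch-to-patch attention. The function name `class_localization_maps` and all of its parameters are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def class_localization_maps(q, k, num_classes, grid_hw):
    """Sketch: per-class maps from one self-attention matrix.

    q, k: (num_classes + h*w, dim) query/key matrices. The first
          `num_classes` rows are assumed to be class tokens and the
          remaining rows patch tokens (an assumption of this sketch).
    Returns (raw, refined), each of shape (num_classes, h, w).
    """
    h, w = grid_hw
    dim = q.shape[-1]

    # Standard scaled dot-product attention over all tokens.
    attn = softmax(q @ k.T / np.sqrt(dim), axis=-1)

    # Class-to-patch block: how strongly each class token attends to each patch.
    class_to_patch = attn[:num_classes, num_classes:]   # (C, N)

    # Patch-to-patch block, used here as a pairwise patch affinity.
    patch_to_patch = attn[num_classes:, num_classes:]   # (N, N)

    # Assumed refinement: refined[c, i] = sum_j affinity[i, j] * map[c, j].
    refined = class_to_patch @ patch_to_patch.T          # (C, N)

    return (class_to_patch.reshape(num_classes, h, w),
            refined.reshape(num_classes, h, w))


# Usage with random stand-in features: 3 classes, a 4x4 patch grid, dim 8.
rng = np.random.default_rng(0)
tokens = 3 + 4 * 4
q = rng.normal(size=(tokens, 8))
k = rng.normal(size=(tokens, 8))
raw_maps, refined_maps = class_localization_maps(q, k, num_classes=3, grid_hw=(4, 4))
print(raw_maps.shape, refined_maps.shape)
```

In a trained model, `q` and `k` would come from a transformer layer rather than random numbers, and maps would typically be aggregated over heads and layers; this sketch only shows how the attention matrix is sliced into the two blocks.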

Authors

Lian Xu

The University of Western Australia

Wanli Ouyang

Australian National University

Mohammed Bennamoun

The University of Western Australia

Institutions

The University of Sydney

The University of Western Australia

Hong Kong University of Science and Technology
