Visual Prompt Tuning for Weakly Supervised Phrase Grounding

March 18, 2024 | Open Access

Abstract

Previous work on weakly supervised phrase grounding (WSG) relies heavily on object detectors to provide RoIs for localization. However, such methods cannot be applied effectively to real-world scenarios, largely because the detectors are trained on a limited set of categories. In this paper, we propose a refinement-based approach to WSG that fine-tunes a detector-free phrase grounding model with a visual prompt. This visual prompt is extracted from the text-related representations in CLIP. Furthermore, we combine the visual prompt with learnable features and then fine-tune the grounding network. Our experimental results show that the method significantly outperforms state-of-the-art methods on the WSG task, demonstrating its effectiveness.
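The abstract does not say how the visual prompt is extracted or how it is combined with the learnable features, so the following is only a rough illustration of the general idea, not the authors' method. It assumes the prompt is a per-patch relevance map obtained by cosine similarity between image-patch embeddings and a phrase embedding, and that combination is a simple weighted sum. The function names (`cosine`, `visual_prompt`, `combine`), the `alpha` weight, and the toy vectors are all hypothetical; real CLIP embeddings would replace the hand-written lists.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def visual_prompt(patch_embeddings, text_embedding):
    """Assumed stand-in for a visual prompt: min-max normalized
    patch-to-phrase similarity, one value per image patch."""
    sims = [cosine(p, text_embedding) for p in patch_embeddings]
    lo, hi = min(sims), max(sims)
    if hi == lo:
        # All patches equally similar: no localization signal.
        return [0.0 for _ in sims]
    return [(s - lo) / (hi - lo) for s in sims]


def combine(prompt, learnable, alpha=1.0):
    """Assumed combination rule: weighted prompt plus learnable features."""
    return [alpha * p + w for p, w in zip(prompt, learnable)]


# Toy example: three "patches" in a 2-D embedding space and one phrase vector.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
phrase = [1.0, 0.0]
prompt = visual_prompt(patches, phrase)
features = combine(prompt, learnable=[0.1, 0.1, 0.1])
```

In a fine-tuning setup, the `learnable` values would be trainable parameters updated together with the grounding network, while the prompt itself stays fixed as a prior from the pretrained vision-language model.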

Cite This Study

Lin et al. (2024) studied this question.

synapsesocial.com/papers/68e73894b6db6435876b2055
https://doi.org/10.1109/icassp48485.2024.10445738