April 8, 2024

Efficient Semi-Supervised Object Segmentation for Long-Term Videos Using Adaptive Memory Network

Shan Zhong (Tongji University), Guoqiang Li (Henan Cancer Hospital), Wenhao Ying (Fudan University)

Abstract

Video object segmentation uses the annotated mask of the first frame to segment the target objects consistently and precisely in subsequent frames. Memory-based methods have recently attracted significant attention because of their substantial performance gains. However, these approaches rely on a fixed global memory strategy, which limits segmentation accuracy and speed on longer videos. To alleviate this limitation, we propose a novel semi-supervised video object segmentation model based on an adaptive memory network. The model adaptively extracts object features by focusing on the object area while filtering out background noise. An identification mechanism is applied to distinguish each object in multi-object scenarios. To reduce storage consumption without losing salient object information, outdated features in the memory pool are compressed into salient features by a self-attention mechanism. Furthermore, we introduce a local matching module that refines object features by fusing contextual information from historical frames. Experiments demonstrate the efficiency of our approach: it substantially improves both the speed and the precision of segmentation for long-term videos while maintaining comparable performance on short videos. The source code is available at https://github.com/GQLi-cong/AMN.
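
The memory-compression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (the linked repository is the reference for that). The names `AdaptiveMemory`, `compress`, `recent_size`, and `salient_size` are hypothetical, and the sketch assumes plain scaled dot-product self-attention with no learned projections. Recent features are kept as-is; features that age out are folded into a small, fixed-size set of salient vectors, so memory stays bounded however long the video runs.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def compress(features, k):
    """Compress n feature vectors into at most k vectors (assumed scheme)."""
    n = len(features)
    if n <= k:
        return [list(f) for f in features]
    dim = len(features[0])
    scale = math.sqrt(dim)
    # attn[i][j]: how strongly feature i attends to feature j.
    attn = [softmax([dot(q, key) / scale for key in features]) for q in features]
    # Saliency of feature j = total attention it receives from all features.
    saliency = [sum(attn[i][j] for i in range(n)) for j in range(n)]
    top = sorted(range(n), key=lambda j: saliency[j], reverse=True)[:k]
    top.sort()  # preserve the original (temporal) order of the kept slots
    # Each kept slot becomes the attention-weighted mix of all features.
    return [
        [sum(attn[i][j] * features[j][d] for j in range(n)) for d in range(dim)]
        for i in top
    ]


class AdaptiveMemory:
    """Hypothetical bounded memory pool: recent features plus salient summary."""

    def __init__(self, recent_size, salient_size):
        self.recent_size = recent_size
        self.salient_size = salient_size
        self.recent = []   # uncompressed features of the latest frames
        self.salient = []  # compressed summary of outdated features

    def add(self, feature):
        self.recent.append(list(feature))
        if len(self.recent) > self.recent_size:
            outdated = self.recent.pop(0)
            self.salient = compress(self.salient + [outdated], self.salient_size)

    def __len__(self):
        return len(self.recent) + len(self.salient)
```

With this design the pool never holds more than `recent_size + salient_size` vectors, in contrast to a fixed global memory that grows with video length.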
