Key points are not available for this paper at this time.
Diffusion frameworks have achieved comparable performance with previous state-of-the-art image generation models. This paper proposes DiffusionInst, a novel framework representing instances as vectors and formulates instance segmentation as a noise-to-vector denoising process. The model is trained to reverse the noisy groundtruth mask without any inductive bias from RPN. It takes a randomly generated vector as input and outputs mask with multi-step denoising during inference. Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance. Our code is available at https://github.com/chenhaoxing/DiffusionInst.
Gu et al. (Mon,) studied this question.