Abstract
Partially manipulated images pose a growing threat to the reliability of online content. The rapid spread of diffusion-based inpainting tools has made such manipulations increasingly easy to create. As a result, the multimedia forensics community is at a disadvantage relative to attackers: developing effective localization techniques often requires large datasets, and building them is resource-intensive because of the human effort involved. In this paper, we present Beyond the Brush++ (BtB++), a fully automated pipeline for generating large-scale datasets of realistic inpainted images. Our experiments demonstrate that BtB++ is flexible and easily integrates different models and configurations, offering the adaptability required to keep pace with evolving models and application scenarios. Moreover, an automatic filtering mechanism provides quality control by discarding low-quality generated images. As an initial assessment of the proposed filtering strategy, we also conducted a small-scale human evaluation that studies how well human perceptual judgments align with the automatic metrics used for filtering.
Bertazzini et al. (Sat,) studied this question.