Image harmonization is an essential technique in computer vision that aims to generate visually consistent composite images by making the foreground compatible with the background. However, current methods primarily apply a global transformation, overlooking the fact that different regions of a real image can vary significantly in appearance while remaining consistent within each local region. Their representation ability is also limited because they use fixed background statistics (e.g., mean and standard deviation) for foreground normalization. Hence, we propose a hierarchical dynamics appearance translation strategy that adjusts the foreground appearance based on the corresponding background, adapting the model features and parameters from a local to a global view. To enhance the representation ability for targets, we employ a mixed attention mechanism for local dynamics, which adaptively modifies the features of different channels and positions. Additionally, we apply dynamic region-aware convolution guided by the foreground mask for global dynamics, which learns adaptive representations of the foreground and background and their correlations for global harmonization. To further improve the harmonization result, we integrate adversarial and perceptual losses into model training. Experiments show that our method significantly reduces the number of parameters and achieves state-of-the-art performance compared with previous methods.
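To make the baseline criticized in the abstract concrete, the following is a minimal sketch of foreground normalization with fixed background statistics: the masked foreground features are standardized per channel and then re-scaled to the background's mean and standard deviation. It is not the proposed hierarchical dynamics method. The function names (masked_stats, background_guided_normalize), the (C, H, W) feature layout, and the eps value are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def masked_stats(feat, mask, eps=1e-5):
    """Per-channel mean and std of feat (C, H, W) over pixels where mask (H, W) is 1."""
    m = mask[None, :, :].astype(feat.dtype)
    count = m.sum() + eps  # eps guards against an empty region
    mean = (feat * m).sum(axis=(1, 2), keepdims=True) / count
    var = (((feat - mean) ** 2) * m).sum(axis=(1, 2), keepdims=True) / count
    return mean, np.sqrt(var + eps)


def background_guided_normalize(feat, fg_mask):
    """Align foreground feature statistics to the background's (assumed baseline).

    feat:    array of shape (C, H, W)
    fg_mask: array of shape (H, W), 1 for foreground pixels, 0 for background
    """
    fg = fg_mask.astype(feat.dtype)
    bg = 1.0 - fg
    fg_mean, fg_std = masked_stats(feat, fg)
    bg_mean, bg_std = masked_stats(feat, bg)
    # Standardize with foreground statistics, then re-scale with background ones.
    aligned = (feat - fg_mean) / fg_std * bg_std + bg_mean
    m = fg[None, :, :]
    # Only foreground pixels are replaced; the background is left untouched.
    return aligned * m + feat * (1.0 - m)
```

Because a single mean and standard deviation per channel describe the whole background, every foreground pixel receives the same affine shift regardless of which local background region it sits next to. That is the limitation the abstract points to when motivating local-to-global adaptation.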
Ju et al. studied this question.