This study presents a texture-aware image synthesis framework for generating material-consistent façades with adversarial learning. The proposed architecture incorporates a mask-guided channel-wise attention mechanism that adaptively fuses segmentation information with texture statistics, reconciling structural guidance with textural fidelity. Three internal variants, namely Vanilla GAN, Wasserstein GAN (WGAN), and WGAN-GP, were compared against leading baselines, including TextureGAN and Pix2Pix. The evaluation used a multi-metric framework comprising SSIM, FID, KID, LPIPS, and DISTS, in conjunction with a VGG-19-based perceptual loss. Experimental results reveal a notable divergence between pixel-wise accuracy and perceptual realism: although the established baselines attained higher PSNR values, the proposed Vanilla GAN and WGAN models exhibited better perceptual fidelity, achieving the lowest LPIPS and DISTS scores. The WGAN-GP model, although theoretically stable, produced smoother but less complex textures due to the regularization imposed by the gradient penalty term. Ablation studies further confirmed that the attention mechanism consistently improved structural alignment and texture sharpness across all architectures. The study therefore suggests that Vanilla GAN and WGAN architectures, enhanced by attention-based fusion, offer the best balance between realism and structural fidelity for high-frequency texture synthesis.
Şanlıalp et al. studied this question.
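
The abstract does not specify how the mask-guided channel-wise attention is computed, so the following is only a minimal sketch of one plausible reading: each feature channel is summarized by its average activation inside the segmentation mask (a simple per-channel statistic), and that statistic is passed through a sigmoid to produce a gate that rescales the channel. The function name `masked_channel_attention`, the use of masked average pooling, and the absence of learned layers are all assumptions made for illustration, not the authors' implementation.

```python
import math


def masked_channel_attention(features, mask, eps=1e-8):
    """Rescale each channel by a gate derived from its masked mean activation.

    Hypothetical illustration, not the paper's method.
    features: list of C channels, each an H x W nested list of floats.
    mask: H x W nested list with values in [0, 1] (e.g. a segmentation mask).
    Returns (gated_features, gates).
    """
    mask_sum = sum(sum(row) for row in mask)
    gates = []
    for channel in features:
        # Masked average pooling: per-channel statistic restricted to the mask.
        weighted = sum(
            v * m
            for f_row, m_row in zip(channel, mask)
            for v, m in zip(f_row, m_row)
        )
        pooled = weighted / (mask_sum + eps)
        # Sigmoid gate in (0, 1); a trained model would typically apply
        # learned layers before this step (assumption).
        gates.append(1.0 / (1.0 + math.exp(-pooled)))
    # Channel-wise rescaling of the feature maps.
    gated = [
        [[v * g for v in row] for row in channel]
        for channel, g in zip(features, gates)
    ]
    return gated, gates


# Toy input: two 2x2 channels; the mask selects only the top-left pixel.
features = [
    [[2.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, -2.0]],
]
mask = [[1.0, 0.0], [0.0, 0.0]]
gated, gates = masked_channel_attention(features, mask)
```

In this toy input the second channel has no activation inside the mask, so its gate is the neutral value 0.5, while the first channel, which is active inside the mask, receives a larger gate. A real model would learn this mapping rather than apply a fixed sigmoid.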