The layout of an image and the spatial distribution of its objects directly affect where the audience looks and how effectively the message is conveyed. A deliberately structured layout can focus the viewer’s attention on the advertised product or on the advertisement’s intended message. Existing work on position-controllable text-to-image generation has made substantial progress on simple images, but the quality of the generated images often degrades for complex scenes. As a result, such models may fail to convey the intended message accurately when used to generate advertisements. To address these limitations, we propose LAYOBJ-GAN, a novel two-stage framework for layout-controllable advertisement image synthesis. Unlike prior work, our method explicitly models background layouts jointly with object layouts during the text-to-layout generation stage, enabling comprehensive spatial planning of complex scenes. Technically, we introduce a Transformer-based sequence-to-sequence layout generator that learns long-range dependencies between textual descriptions and both background and object regions, which has not been explored in previous advertisement-oriented text-to-image frameworks. In the layout-to-image stage, we further propose a fine-grained text–layout interaction normalization module (TL-Norm) that enables object knowledge transfer from a pre-trained category-to-image model, allowing object appearance to be adaptively modulated by textual context and layout constraints. Extensive experiments on MS-COCO and a high-definition advertisement dataset (AsHQ-10K) demonstrate that LAYOBJ-GAN significantly outperforms seven state-of-the-art methods in image quality, layout controllability, and semantic object accuracy. These results confirm the effectiveness of explicitly modeling background layouts and transferring object-level generative knowledge for complex advertisement image synthesis.
Xu et al. studied this question.