What question did this study set out to answer?

This research aims to improve AI image generation for artistic creation by giving more precise control over artistic style and preserving fidelity in high-resolution output.

March 28, 2026 | Open Access

Deep learning image generation technology for enhancing the presentation effect of image art based on artificial intelligence

Key points

The study develops the StyleDiffusion-HD framework, which integrates a Latent Diffusion Model with Style Injection Attention.
It introduces a Super-Resolution module based on Flow Matching that raises image resolution while maintaining style consistency.
The framework is evaluated on multi-source, high-quality artistic datasets across multiple dimensions, including generation quality, style consistency, image-text alignment, and subjective aesthetics.
The proposed method outperforms mainstream models on the Fréchet Inception Distance, CLIP Score, and Style Loss metrics.
It achieves high scores in subjective assessments by experts and the general public.
The results indicate improved artistic presentation and style consistency in the generated images.

Abstract

Image Generation Technology (IGT) based on Artificial Intelligence (AI) and Deep Learning (DL) has demonstrated enormous potential in artistic creation. However, it still falls short in the precise control of artistic style and in the fidelity of high-resolution output. To address these issues, namely inaccurate style control, limited resolution, and loss of artistic texture during Super-Resolution (SR) processing, this study proposes a framework named StyleDiffusion-HD. The framework integrates a Latent Diffusion Model (LDM) based on Style Injection Attention (SIA) to achieve precise bimodal control through text and visual style. It also introduces an SR module based on Flow Matching (FM), which increases image resolution while maintaining style consistency. Experiments are conducted on multi-source, high-quality artistic datasets, with evaluation along multiple dimensions, including generation quality, style consistency, image-text alignment, and subjective aesthetics. The results show that the proposed method outperforms mainstream models on objective metrics, including Fréchet Inception Distance (FID), CLIP Score (CS), and Style Loss (SL), and achieves high scores in subjective evaluations by experts and the general public, supporting its effectiveness and practicality in improving the artistic presentation of images. This study provides a feasible technical path for addressing key challenges in current AI art generation and offers a practical reference for the development of high-quality AI-assisted artistic creation.
