Low-light image enhancement (LLIE) improves image quality in support of downstream tasks such as autonomous driving and mobile photography. Despite notable advances by traditional Retinex-based methods, existing approaches still struggle to maintain globally consistent illumination and to suppress sensor noise under extremely dark conditions. To overcome these limitations, we propose a noise-resilient LLIE framework that integrates a CLIP-guided loss (CLIP-LLA) and a pixel-reordering subsampling (PRS) scheme into the Retinexformer backbone. The CLIP-LLA loss exploits the semantic prior of a large-scale vision–language model to align enhanced outputs with the manifold of well-illuminated natural images, leading to faithful global tone rendering and perceptual realism. In parallel, the PRS-based multi-scale training strategy regularizes the network by augmenting structural diversity, improving denoising capability without architectural modification or added inference cost. Extensive experiments on both sRGB and RAW benchmarks validate the effectiveness of our design. The proposed method achieves consistent improvements over state-of-the-art techniques, including a +2.73 dB PSNR gain on the SMID dataset and superior perceptual scores, while maintaining computational efficiency. These results demonstrate that fusing foundation-model priors with transformer-based Retinex frameworks offers a practical and scalable pathway toward perceptually faithful low-light image enhancement.
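The abstract does not spell out how pixel-reordering subsampling works, so the following is only an illustrative sketch under one assumption: that PRS resembles strided (space-to-depth style) sampling, in which a full-resolution image is split into several lower-resolution sub-images without discarding any pixel. The function name `pixel_reorder_subsample` and the `factor` parameter are hypothetical and are not taken from the paper.

```python
def pixel_reorder_subsample(image, factor=2):
    """Split a 2D image (a list of rows) into factor*factor sub-images.

    ASSUMPTION: this strided split is one plausible reading of
    "pixel-reordering subsampling"; the paper's actual scheme may differ.
    Sub-image (dy, dx) keeps every factor-th pixel starting at row dy
    and column dx, so each pixel lands in exactly one sub-image.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by factor")

    sub_images = []
    for dy in range(factor):
        for dx in range(factor):
            # Strided slicing over rows, then over columns within each row.
            sub = [row[dx::factor] for row in image[dy::factor]]
            sub_images.append(sub)
    return sub_images


# A 4x4 toy image whose pixel values equal their raster index.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
subs = pixel_reorder_subsample(image, factor=2)
```

Under this reading, each sub-image is a half-resolution view of the same scene with a slightly different sampling phase, which is one way a training pipeline could obtain multi-scale inputs while leaving the network and the inference path unchanged.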
Seongjong Song (Mon,) studied this question.