March 3, 2026

Domain-Complementary Prior With Fine-Grained Feedback for Scene Text Image Super-Resolution

Key Points

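Super-resolution of scene text images is a preprocessing step that can substantially improve downstream recognition accuracy on low-quality images.
Experiments on the MDRText and TextZoom datasets showed superior super-resolution performance and improved accuracy in subsequent recognition tasks.
The proposed Fine-Grained Feedback Domain-Complementary Network (FDNet) uses a fine-grained feedback mechanism to selectively refine landmark images and a domain-trace prior interaction generator to integrate domain-specific traces with a text prior.
Because existing datasets often cover limited scene scales and too few challenging scenarios, the authors introduce MDRText, a multi-scale and diverse dataset for text image recognition and super-resolution.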

Abstract

Enhancing the resolution of scene text images is a critical preprocessing step that can substantially improve the accuracy of downstream text recognition in low-quality images. Existing methods primarily rely on auxiliary text features to guide the super-resolution process. However, these features often lack rich low-level information, making them insufficient for faithfully reconstructing both the global structure and fine-grained details of text. Moreover, previous methods often learn suboptimal feature representations from the original low-quality landmark images, which cannot provide precise guidance for super-resolution. In this study, we propose a Fine-Grained Feedback Domain-Complementary Network (FDNet) for scene text image super-resolution. Specifically, we first employ a fine-grained feedback mechanism to selectively refine landmark images, thereby enhancing feature representations. Then, we introduce a novel domain-trace prior interaction generator, which integrates domain-specific traces with a text prior to comprehensively complement the clear edges and structural coverage of the text. Finally, motivated by the limitations of existing datasets, which often exhibit limited scene scales and insufficient challenging scenarios, we introduce a new dataset, MDRText, which features multi-scale and diverse characteristics and is designed to support challenging text image recognition and super-resolution tasks. Extensive experiments on the MDRText and TextZoom datasets demonstrate that our method achieves superior performance in scene text image super-resolution and further improves the accuracy of subsequent recognition tasks.
