Enhancing the resolution of scene text images is a critical preprocessing step that can substantially improve the accuracy of downstream text recognition in low-quality images. Existing methods primarily rely on auxiliary text features to guide the super-resolution process. However, these features often lack rich low-level information, making them insufficient for faithfully reconstructing both the global structure and fine-grained details of text. Moreover, previous methods often learn suboptimal feature representations from the original low-quality landmark images, which cannot provide precise guidance for super-resolution. In this study, we propose a Fine-Grained Feedback Domain-Complementary Network (FDNet) for scene text image super-resolution. Specifically, we first employ a fine-grained feedback mechanism to selectively refine landmark images, thereby enhancing feature representations. Then, we introduce a novel domain-trace prior interaction generator, which integrates domain-specific traces with a text prior to comprehensively complement the clear edges and structural coverage of the text. Finally, motivated by the limitations of existing datasets, which often exhibit limited scene scales and insufficient challenging scenarios, we introduce a new dataset, MDRText. The proposed dataset, MDRText, features multi-scale and diverse characteristics and is designed to support challenging text image recognition and super-resolution tasks. Extensive experiments on the MDRText and TextZoom datasets demonstrate that our method achieves superior performance in scene text image super-resolution and further improves the accuracy of subsequent recognition tasks.
Zhang et al. (Thu,) studied this question.