What question did this study set out to answer?

This research aims to improve the recognition of Persian text in real-world images using advanced transformer-based techniques.

June 14, 2026 | Open Access

A Transformer-Based Approach with Contextual Position Encoding for Robust Persian Text Recognition in the Wild

Key Points

Extended the vanilla transformer architecture to recognize arbitrarily shaped Persian text.
Applied Contextual Position Encoding to capture intricate details and orientations of Persian characters.
Evaluated multiple deep-learning models on a specialized Persian scene text recognition dataset.
Achieved higher word recognition accuracy than existing recognition methods.
Demonstrated effective handling of oriented and spaced Persian characters in images captured in the wild.

Abstract

The Persian language presents unique challenges for scene text recognition due to its distinctive script. Despite advances in AI, recognition of non-Latin scripts such as Persian remains difficult. In this paper, we extend the vanilla transformer architecture to recognize arbitrarily shaped Persian text instances. We apply Contextual Position Encoding (CPE) to the baseline transformer architecture to improve the recognition of Persian script in wild images, especially for oriented and spaced characters. CPE uses position information to generate contrastive data pairs that help capture Persian characters written in different directions. Moreover, we evaluate several state-of-the-art deep-learning models on our challenging Persian scene text recognition dataset and develop a transformer-based architecture to enhance recognition accuracy. Our proposed scene text recognition architecture achieves higher word recognition accuracy than existing methods on a real-world Persian text dataset.
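The abstract reports results as word recognition accuracy. As a minimal sketch, assuming the common scene text recognition convention that a word counts as correct only when the predicted string matches the ground truth exactly, the metric could be computed as follows. The function name and the sample words are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
def word_recognition_accuracy(predictions, ground_truths):
    """Return the fraction of words whose prediction matches the label exactly.

    Assumption: exact string match per word, with no case folding or
    character normalization. The paper's own protocol may differ.
    """
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        return 0.0  # convention chosen here for an empty evaluation set
    correct = sum(p == g for p, g in zip(predictions, ground_truths))
    return correct / len(ground_truths)


# Hypothetical example: two of three words are recognized correctly.
preds = ["سلام", "کتاب", "تهران"]
truths = ["سلام", "کتاب", "شیراز"]
accuracy = word_recognition_accuracy(preds, truths)
print(f"{accuracy:.3f}")
```

Because the match is exact, a single wrong character makes the whole word count as an error, which is why this metric is stricter than character-level accuracy.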

Raisi et al. studied this question.

synapsesocial.com/papers/6a2e4429b1cc60ccdea89fb4 https://doi.org/10.22044/jadm.2024.14669.2569
