What question did this study set out to answer?

Whether LLM-Guided Sequence Reconstruction (LLM-GSR) can reduce the character error rate of degraded OCR output.

April 28, 2026 | Open Access

LLM-Guided Sequence Reconstruction for Degraded OCR Output: Closing the Latency Gap with Parallel Inference

Key Points

Aimed to reduce the character error rate (CER) of degraded OCR output using LLM-Guided Sequence Reconstruction (LLM-GSR).
Proposed LLM-Guided Sequence Reconstruction architecture for OCR output correction.
Validated on a banking document corpus of 11,368 pages.
Measured character error rate, latency overhead, and throughput under parallel inference.
Reported a reduction in character error rate; the size of the reduction is not specified.
Demonstrated efficient latency performance with high throughput.
Outperformed traditional fuzzy matching approaches on domain-specific terminology.

Abstract

OCR engines operating at reduced resolution produce degraded text with systematic character-level errors — broken words, misrecognized characters, and corrupted numerical sequences. Current post-processing approaches rely on fuzzy matching algorithms that achieve moderate quality at near-zero latency but fail on domain-specific terminology and structured sequences absent from predefined vocabularies. Prior work established that a Ray-based parallel OCR pipeline achieves 69.9× speedup on 11,368 banking document pages but produces a Character Error Rate of 24.78% at 100 DPI — a quality gap that fuzzy matching cannot close. This paper proposes LLM-Guided Sequence Reconstruction (LLM-GSR), a post-processing architecture that replaces dictionary-based correction with a large language model operating as a sequence predictor over degraded OCR output. The key insight is that OCR degradation produces incomplete sequences that a language model can reconstruct through next-token prediction conditioned on domain context — precisely the task for which language models are optimized. We formalize the boundary of applicability through a Reconstruction Precondition grounded in information theory, and validate the architecture on the same 11,368-page banking corpus, measuring CER reduction, latency overhead, and throughput under parallel inference.
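The abstract reports quality as Character Error Rate but does not spell out how it is computed. A minimal sketch, assuming the common definition (Levenshtein edit distance between the OCR output and the ground-truth text, divided by the ground-truth length), could look like the following. The function names `levenshtein` and `cer` and the sample strings are illustrative and do not come from the study.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # drop a reference character
                curr[j - 1] + 1,                  # drop a hypothesis character
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate under the assumed definition:
    edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference text must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one misrecognized character in a 7-character word.
    print(f"CER = {cer('account', 'acc0unt'):.4f}")
```

Under this definition, a CER of 24.78% means roughly one character edit for every four ground-truth characters. Because insertions count as errors, the value can exceed 1.0 when the OCR output is much longer than the reference.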
