What question did this study set out to answer?

The study asks whether Kuzushiji transcription can be improved by fusing optical character recognition (OCR) with automatic speech recognition (ASR) of the transcriber’s read speech.

April 10, 2026 · Open Access

Kuzushiji Transcription Using Hiragana-Level Fusion of Optical Character Recognition and Read-Speech Automatic Speech Recognition

Key Points

Aims to improve Kuzushiji transcription by fusing optical character recognition (OCR) with read-speech automatic speech recognition (ASR).
Proposes a transcription framework that combines OCR and ASR at the hiragana level.
Uses the transcriber’s read speech to guide selection among OCR hypotheses.
Scores each beam-search hypothesis by its phonetic similarity to the ASR output.
Achieves a lower character error rate than conventional OCR-only transcription.
Requires no additional model training.

Abstract

This letter proposes a new Kuzushiji transcription framework that integrates optical character recognition (OCR) with read-speech automatic speech recognition (ASR) via hiragana-level fusion, without requiring additional model training. The framework uses the transcriber’s read speech as an additional modality to guide the selection of OCR hypotheses during beam search. Each OCR candidate is scored by its phonetic similarity, at the hiragana-sequence level, to the ASR output for the corresponding read speech. Evaluation results show that the proposed framework reduces the character error rate compared with conventional OCR-only Kuzushiji transcription.
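The scoring idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it rescores a finished list of OCR candidates rather than fusing scores inside the beam search, it assumes each candidate already comes with a hiragana reading, and it uses normalized edit distance as a stand-in for the phonetic similarity measure, which the abstract does not specify. The function names, the `weight` parameter, and the example scores are all hypothetical.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two hiragana strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(hyp_kana: str, asr_kana: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical hiragana sequences.

    Assumption: normalized edit distance stands in for the paper's measure.
    """
    longest = max(len(hyp_kana), len(asr_kana))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(hyp_kana, asr_kana) / longest


def rescore(candidates: List[Tuple[str, str, float]],
            asr_kana: str,
            weight: float = 1.0) -> List[Tuple[str, float]]:
    """Rank OCR candidates by OCR log-score plus weighted similarity.

    Each candidate is (text, hiragana_reading, ocr_log_score).
    The additive combination and `weight` are illustrative assumptions.
    """
    scored = [
        (text, ocr_score + weight * phonetic_similarity(reading, asr_kana))
        for text, reading, ocr_score in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical case: OCR slightly prefers a misread character (ろ for る),
    # while the ASR output of the read speech supports the other candidate.
    candidates = [
        ("はろのよ", "はろのよ", -1.2),
        ("はるのよ", "はるのよ", -1.3),
    ]
    for text, score in rescore(candidates, asr_kana="はるのよ"):
        print(text, round(score, 3))
```

In this toy case the OCR score alone would pick the misread candidate, and the similarity term moves the candidate that matches the spoken reading to the top. How the two scores are actually weighted and combined during beam search is described in the full paper, not here.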
