November 9, 2025 | Open Access

Open Source State-Of-the-Art Solution for Romanian Speech Recognition

Key Points

Our Romanian automatic speech recognition (ASR) system achieves state-of-the-art results on all Romanian evaluation benchmarks, covering read, spontaneous, and domain-specific speech.
It reduces word error rate (WER) by up to 27% relative to the previous best-performing systems.
Built on the FastConformer architecture with a hybrid CTC/TDT decoder, the system was evaluated with several decoding strategies: greedy, ALSD, and CTC beam search with a 6-gram token-level language model.
Its decoding efficiency makes it practical for low-latency ASR, including deployment in real-time applications.
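
To make the "up to 27%" figure concrete, the sketch below shows how word error rate and a relative WER reduction are typically computed. It is a minimal illustration, not the authors' evaluation code: the function names `word_error_rate` and `relative_wer_reduction` are hypothetical, the text is assumed to be already normalized and whitespace-tokenized, and the numbers in the usage note are made up for illustration rather than taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

For example, with an illustrative baseline WER of 10.0% and a new WER of 7.3%, `relative_wer_reduction(10.0, 7.3)` gives roughly 0.27, that is, a 27% relative reduction, even though the absolute difference is only 2.7 percentage points.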

Abstract

In this work, we present a new state-of-the-art Romanian Automatic Speech Recognition (ASR) system based on NVIDIA's FastConformer architecture, explored here for the first time in the context of Romanian. We train our model on a large corpus of mostly weakly supervised transcriptions, totaling over 2,600 hours of speech. Leveraging a hybrid decoder with both Connectionist Temporal Classification (CTC) and Token-Duration Transducer (TDT) branches, we evaluate a range of decoding strategies, including greedy decoding, alignment-length synchronous decoding (ALSD), and CTC beam search with a 6-gram token-level language model. Our system achieves state-of-the-art performance across all Romanian evaluation benchmarks, including read, spontaneous, and domain-specific speech, with up to a 27% relative word error rate (WER) reduction compared to previous best-performing systems. In addition to improved transcription accuracy, our approach demonstrates practical decoding efficiency, making it suitable for both research and deployment in low-latency ASR applications.
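
Of the decoding strategies named in the abstract, greedy decoding on the CTC branch is the simplest to illustrate. The sketch below shows the standard CTC greedy rule (take the best token per frame, merge consecutive repeats, drop blanks). It is a generic illustration under stated assumptions, not the authors' implementation: the function name `ctc_greedy_decode`, the plain-list input format, and the choice of blank index are all hypothetical.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int
) -> List[int]:
    """Greedy CTC decoding over per-frame token scores.

    frame_scores[t][k] is the score (e.g. log-probability) of token k
    at frame t. Returns the decoded token ids.
    """
    decoded: List[int] = []
    previous = None  # best token of the previous frame
    for scores in frame_scores:
        # Pick the highest-scoring token for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the token changes and is not the blank symbol;
        # a blank between two identical tokens keeps both of them.
        if best != previous and best != blank_id:
            decoded.append(best)
        previous = best
    return decoded
```

With blank index 0 and per-frame best tokens `1, 1, 0, 1, 2, 2, 0`, the function returns `[1, 1, 2]`: the first two frames merge into a single `1`, the blank separates it from the next `1`, and the repeated `2` collapses. Beam search with a language model replaces the per-frame argmax with a search over scored prefixes, which this sketch does not cover.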

Authors

Gabriel Pîrlogeanu

Universitatea Națională de Știință și Tehnologie Politehnica București

Alexandru-Lucian Georgescu

Universitatea Națională de Știință și Tehnologie Politehnica București

Horia Cucu

Universitatea Națională de Știință și Tehnologie Politehnica București
