This paper describes the advances made in IBM's Arabic broadcast news transcription system, which was fielded in the 2006 GALE ASR and machine translation evaluation. These advances were instrumental in lowering the word error rate by 42% relative over the course of one year. They include training on additional LDC data; large-scale discriminative training on 1800 hours of unsupervised data; automatic vowelization using a flat-start approach; a large vocabulary of 617K words and 2 million pronunciations; and a system architecture based on cross-adaptation between unvowelized and vowelized acoustic models.
Hagen Soltau
Google (United States)
George Saon
Northwestern University
Brian Kingsbury
IBM (United States)
IBM Research - Thomas J. Watson Research Center
Soltau et al. studied this question.
synapsesocial.com/papers/6a1c6b202cc291e7bf2fc09e — DOI: https://doi.org/10.1109/icassp.2007.366921