Hilfe Wegweiser Datenschutzhinweis Impressum Kontakt





A two-pass strategy for handling OOVs in a large vocabulary recognition task

Es sind keine MPG-Autoren in der Publikation vorhanden
Externe Ressourcen
Es sind keine Externen Ressourcen verfügbar
Volltexte (frei zugänglich)

(Verlagsversion), 120KB

Ergänzendes Material (frei zugänglich)
Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Scharenborg, O., & Seneff, S. (2005). A two-pass strategy for handling OOVs in a large vocabulary recognition task. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, (pp. 1669-1672). ISCA Archive.

This paper addresses the issue of large-vocabulary recognition in a specific word class. We propose a two-pass strategy in which only major cities are explicitly represented in the first stage lexicon. An unknown word model encoded as a phone loop is used to detect OOV city names (referred to as rare city names). After which SpeM, a tool that can extract words and word-initial cohorts from phone graphs on the basis of a large fallback lexicon, provides an N-best list of promising city names on the basis of the phone sequences generated in the first stage. This N-best list is then inserted into the second stage lexicon for a subsequent recognition pass. Experiments were conducted on a set of spontaneous telephone-quality utterances each containing one rare city name. We tested the size of the N-best list and three types of language models (LMs). The experiments showed that SpeM was able to include nearly 85% of the correct city names into an N-best list of 3000 city names when a unigram LM, which also boosted the unigram scores of a city name in a given state, was used.