Preparing a corpus of Dutch spontaneous dialogues for automatic phonetic 
analysis

Schuppler, B.; Ernestus, Mirjam; Scharenborg, Odette; Boves, L.

Item

ITEM ACTIONSEXPORT

Add to Basket

Local TagsRelease HistoryDetailsSummary

Released

Conference Paper

Preparing a corpus of Dutch spontaneous dialogues for automatic phonetic analysis

MPS-Authors

/persons/resource/persons1469

Ernestus, Mirjam
Language Comprehension Group, MPI for Psycholinguistics, Max Planck Society;

External Resource

No external resources are shared

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

Fulltext (public)

56ECEFD4d01.pdf
(Publisher version), 169KB

Supplementary Material (public)

There is no public supplementary material available

Citation

Schuppler, B., Ernestus, M., Scharenborg, O., & Boves, L. (2008). Preparing a corpus of Dutch spontaneous dialogues for automatic phonetic analysis. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1638-1641). ISCA Archive.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0012-D200-5

Abstract

This paper presents the steps needed to make a corpus of Dutch spontaneous dialogues accessible for automatic phonetic research aimed at increasing our understanding of reduction phenomena and the role of fine phonetic detail. Since the corpus was not created with automatic processing in mind, it needed to be reshaped. The first part of this paper describes the actions needed for this reshaping in some detail. The second part reports the results of a preliminary analysis of the reduction phenomena in the corpus. For this purpose a phonemic transcription of the corpus was created by means of a forced alignment, first with a lexicon of canonical pronunciations and then with multiple pronunciation variants per word. In this study pronunciation variants were generated by applying a large set of phonetic processes that have been implicated in reduction to the canonical pronunciations of the words. This relatively straightforward procedure allows us to produce plausible pronunciation variants and to verify and extend the results of previous reduction studies reported in the literature.