Not All Pauses are the Same: Multidimensional Classification of Pauses for the Annotation of Russian Spoken Corpora
Abstract:
We describe Russian spoken corpora that include pause annotation and provide an overview of studies of pauses. We define a pause as an element of intonation, expressed at the phonetic level by a short break in speech or by a tone change that is perceived as a break. We consider the existing classifications of pauses and propose a new one. It includes three independent levels: 1) the physical nature of the pause (with two sublevels: one that describes pauses in terms of their temporal nature and filling, and one at which we determine what exactly the pauses are filled with), 2) position within the grammatical and semantic structure of the discourse (pauses that occur at the boundaries of elementary discourse units or within them), 3) function (emphatic, rhythmic, situational, respiratory, and hesitation pauses). We assume that such a classification is the most convenient for annotating speech, since it covers several aspects of the description of pauses. Annotators seem to face the greatest difficulties when assigning a pause to a particular type at the last level, since it is not always easy to understand the speaker's intention. In the final part of the paper, we provide an overview of psycholinguistic studies of pauses and describe a specific research question that we plan to study using our classification of pauses.
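To make the three independent levels concrete, the following is a minimal sketch of how a single pause annotation might be represented in code. It is an illustration only, not the annotation format used in the corpora: the names `PauseAnnotation`, `Position`, `Function`, `start_s`, `end_s`, `filled`, and `filler` are hypothetical, and reducing the first sublevel to a filled/unfilled flag is a simplifying assumption. Only the level 2 positions and the level 3 function labels are taken from the classification above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Position(Enum):
    """Level 2: position relative to elementary discourse units (EDUs)."""
    EDU_BOUNDARY = "edu_boundary"
    WITHIN_EDU = "within_edu"


class Function(Enum):
    """Level 3: the five functions named in the classification."""
    EMPHATIC = "emphatic"
    RHYTHMIC = "rhythmic"
    SITUATIONAL = "situational"
    RESPIRATORY = "respiratory"
    HESITATION = "hesitation"


@dataclass(frozen=True)
class PauseAnnotation:
    """One annotated pause; all field names are hypothetical."""
    start_s: float            # pause onset, in seconds
    end_s: float              # pause offset, in seconds
    filled: bool              # level 1, first sublevel (assumed binary here)
    filler: Optional[str]     # level 1, second sublevel: what fills the pause
    position: Position        # level 2
    function: Function        # level 3

    def __post_init__(self) -> None:
        # Basic consistency checks across the fields.
        if self.end_s < self.start_s:
            raise ValueError("end_s must not precede start_s")
        if not self.filled and self.filler is not None:
            raise ValueError("an unfilled pause cannot have a filler")

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


# Example: a filled hesitation pause inside an elementary discourse unit.
pause = PauseAnnotation(
    start_s=12.0,
    end_s=12.5,
    filled=True,
    filler="uh",
    position=Position.WITHIN_EDU,
    function=Function.HESITATION,
)
print(pause.position.value, pause.function.value, pause.duration_s)
```

Keeping the three levels as separate fields mirrors their independence: any position can in principle combine with any function, and the only cross-field check in this sketch is internal to level 1.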
For citation:
Acknowledgements:
We thank our colleague Vladislav Ivanovich Zubov for valuable comments at the stage of developing the classification of pauses, as well as an anonymous reviewer for suggestions on finalizing the first version of the article.