How frequent are interjections?

The occurrence of interjections in 10-min excerpts of informal dyadic conversations in six spoken languages. Every panel shows the turns of a dyadic exchange; colored dots indicate turns that belong to the top 10 most common one-word standalone turn formats in the language. These excerpts cannot support strong comparative or typological inferences; they are only meant to give an impression of the prevalence of interjections across unrelated languages.

Dingemanse, M. (2024). Interjections at the heart of language. Annual Review of Linguistics, 10, 257–277. doi: 10.1146/annurev-linguistics-031422-124743
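The selection criterion in the caption (turns matching one of the top 10 most common one-word standalone turn formats) can be sketched roughly as follows. This is a minimal illustration, not the procedure used for the figure: the function names, the lowercasing step, and the toy transcript are all assumptions.

```python
from collections import Counter

def top_one_word_formats(turns, n=10):
    """Return the n most common one-word standalone turns (lowercased)."""
    one_word = [t.strip().lower() for t in turns if len(t.split()) == 1]
    return [form for form, _ in Counter(one_word).most_common(n)]

def mark_interjection_turns(turns, n=10):
    """Flag each turn that matches one of the top-n one-word formats."""
    top = set(top_one_word_formats(turns, n))
    return [t.strip().lower() in top for t in turns]

# Hypothetical toy transcript; real input would be the turns of a corpus excerpt.
turns = ["so I went there", "mm", "and it was closed", "oh",
         "yeah", "mm", "really", "yeah", "mm"]
flags = mark_interjection_turns(turns, n=2)   # n=2 only because the toy data is tiny
share = sum(flags) / len(turns)               # proportion of flagged turns
```

The flagged turns correspond to the colored dots in each panel; `share` gives a rough impression of prevalence for one excerpt.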

Sampling response tokens

A. Overview of included languages with dataset size in hours and the top 3 sequentially identified response tokens as transcribed in the corpus. B. Location of the largest speech community. C. Assessing the impact of sparse data on UMAP projections using three samples of Dutch response tokens. Comparing the full dataset (a) with randomly sampled subsets of decreasing size (b, c) suggests that the structure is isomorphic across scales and that clustering solutions remain interpretable for samples as small as 150 tokens.

Liesenfeld, A., & Dingemanse, M. (2022). Bottom-up discovery of structure and variation in response tokens (‘backchannels’) across diverse languages. Proceedings of Interspeech 2022, 1126–1130. doi: 10.21437/Interspeech.2022-11288
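The sparse-data check in panel C rests on drawing random subsets of decreasing size from the full set of tokens. A minimal sketch of that sampling step is given below, assuming a simple list of token identifiers; the function name, the placeholder data, and the intermediate size of 500 are hypothetical (only the 150-token size comes from the caption), and the UMAP projection itself is not shown.

```python
import random

def subsample(tokens, sizes, seed=0):
    """Draw one random subset (without replacement) per requested size."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {size: rng.sample(tokens, size) for size in sizes if size <= len(tokens)}

# Placeholder identifiers standing in for the full set of response tokens.
tokens = [f"token_{i}" for i in range(1000)]
subsets = subsample(tokens, sizes=[500, 150])
```

Each subset would then be projected separately and compared with the projection of the full dataset.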

Cultural evolution of continuers

Continuers (frequent standalone utterances like mm-hm that people often use in succession) differ in interesting ways from other common elements, such as top tokens (the most common words in a corpus) and discontinuers (frequent standalone utterances that people do not produce in successive streaks). A. Length of tokens for continuers, discontinuers and top tokens in 32 languages. B. Frequencies of major sound classes across types. Vowel nuclei occur across types, but continuers stand out for their preference for nasals. C. Random forest analysis of 118 continuer forms in 32 spoken languages showing the top 10 most predictive phonemes (out of 29 attested).

Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mmhm). The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE), 61–67.
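The distinction between continuers and discontinuers turns on whether a frequent standalone form tends to recur in successive turns by the same speaker. One way this could be operationalised is sketched below. The thresholds `min_count` and `streak_share`, the one-word restriction, and the toy exchange are assumptions made for illustration, not the criteria used in the paper.

```python
from collections import Counter, defaultdict

def classify_standalone_forms(turns, min_count=3, streak_share=0.5):
    """Label frequent one-word turns as 'continuer' or 'discontinuer'.

    turns: list of (speaker, text) pairs in conversational order.
    A form counts as part of a streak when the same speaker's previous
    or next turn is the identical form.
    """
    by_speaker = defaultdict(list)
    for speaker, text in turns:
        by_speaker[speaker].append(text.strip().lower())

    total, in_streak = Counter(), Counter()
    for utterances in by_speaker.values():
        for i, form in enumerate(utterances):
            if len(form.split()) != 1:
                continue  # only one-word standalone turns are considered
            total[form] += 1
            prev_same = i > 0 and utterances[i - 1] == form
            next_same = i + 1 < len(utterances) and utterances[i + 1] == form
            if prev_same or next_same:
                in_streak[form] += 1

    labels = {}
    for form, count in total.items():
        if count < min_count:
            continue  # too rare to count as a frequent form
        share = in_streak[form] / count
        labels[form] = "continuer" if share >= streak_share else "discontinuer"
    return labels

# Hypothetical telling by speaker A with responses by speaker B.
a_turns = ["so we drove up north", "and the car broke down", "in the middle of nowhere",
           "we had to walk", "for two hours", "then it rained",
           "but we made it", "just in time"]
b_turns = ["mm", "mm", "oh", "mm", "mm", "oh", "mm", "oh"]
turns = [pair for a, b in zip(a_turns, b_turns) for pair in (("A", a), ("B", b))]
labels = classify_standalone_forms(turns)
```

In the toy data, "mm" mostly occurs in successive streaks while "oh" never does, so the two forms receive different labels.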

Mhmm over time

Even apparently universal patterns (like the use of ‘mhm’ during tellings) can show important cross-cultural differences. A. Continuers (marked ○) are among the most frequent recipient behaviours in both English and Korean, shown here in four 80-second stretches of tellings. B. However, the relative frequency of continuers is about twice as high in Korean, based on 100 random samples of 80-second segments in each language: on average, 21% of turns are continuers in Korean, against 9% of turns in English (the measure is expressed as a proportion of turns to control for differences in speech rate).

Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 5614–5633. doi: 10.18653/v1/2022.acl-long.385
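The measure in panel B (the average share of turns that are continuers across 100 random 80-second segments) could be computed along the following lines. This is a sketch under stated assumptions: turns are represented as (start time in seconds, text) pairs, continuers are matched against a given set of forms, and the function name and toy data are hypothetical.

```python
import random

def continuer_share(turns, continuers, window=80.0, n_samples=100, seed=0):
    """Mean proportion of turns that are continuers over random time windows.

    turns: list of (start_time_seconds, text) pairs.
    continuers: set of lowercased forms treated as continuers.
    """
    rng = random.Random(seed)
    last_start = max(start for start, _ in turns)
    shares = []
    for _ in range(n_samples):
        # Pick a window start so that the window fits before the last turn.
        w0 = rng.uniform(0.0, max(0.0, last_start - window))
        in_window = [text for start, text in turns if w0 <= start < w0 + window]
        if not in_window:
            continue  # skip windows without any turns
        hits = sum(text.strip().lower() in continuers for text in in_window)
        shares.append(hits / len(in_window))
    return sum(shares) / len(shares) if shares else float("nan")

# Toy data: one turn every 2 seconds, every fifth turn is a continuer.
turns = [(2.0 * i, "mhm" if i % 5 == 0 else "some talk") for i in range(300)]
mean_share = continuer_share(turns, continuers={"mhm"})
```

Dividing by the number of turns in each window, rather than by time, is what makes the measure comparable across languages with different speech rates.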

Properties and formats of repair

Using elementary properties of interactional resources, we can capture commonalities and differences between repair formats in principled and precise ways. For instance, to capture the distinctions between four repair initiation formats in English (as presented in Sidnell 2010), we can use the following three properties: Question (is there a content question word?), Repetition (does the repair initiator repeat some material from the prior turn?) and Confirmation (does the repair initiator make confirmation relevant in next turn?).

Dingemanse, M., & Enfield, N. J. (2015). Other-initiated repair across languages: towards a typology of conversational structures. Open Linguistics, 1, 98–118. doi: 10.2478/opli-2014-0007
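The three properties named above (Question, Repetition, Confirmation) lend themselves to a small feature-matrix representation. The sketch below is illustrative only: the class and function names are hypothetical, and the example formats and their property values are assumptions chosen for illustration, not values reproduced from the paper or from Sidnell (2010).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RepairFormat:
    example: str        # illustrative repair initiator
    question: bool      # contains a content question word?
    repetition: bool    # repeats material from the prior turn?
    confirmation: bool  # makes confirmation relevant in next turn?

PROPERTIES = ("question", "repetition", "confirmation")

def differing_properties(a, b):
    """List the properties on which two repair formats differ."""
    return [p for p in PROPERTIES if getattr(a, p) != getattr(b, p)]

# Assumed, illustrative values for four English formats.
huh = RepairFormat("Huh?", question=False, repetition=False, confirmation=False)
who = RepairFormat("Who?", question=True, repetition=False, confirmation=False)
she_what = RepairFormat("She what?", question=True, repetition=True, confirmation=False)
repeat = RepairFormat("Her sister?", question=False, repetition=True, confirmation=True)

formats = [huh, who, she_what, repeat]
profiles = {(f.question, f.repetition, f.confirmation) for f in formats}
```

With these assumed values, each format has a distinct profile, and `differing_properties(huh, who)` isolates the single property that separates the two.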

‘Huh?’ around the world

A word like huh? (used to initiate repair when, for example, one has not clearly heard what someone just said) is found in roughly the same form and function in conversational corpora from 31 spoken languages from across the globe. The first ten (1–10, in bold in the figure) are examined in phonetic detail and found to be about as similar to each other as variants of the word dog across English varieties. Languages 11–20 are from [14], 21–31 from sources cited. Locations are approximate. 1. Cha'palaa ʔa:↘; 2. Icelandic ha; 3. Spanish e↗; 4. Siwu ã:↗; 5. Dutch h↗; 6. Italian ɛ:↗; 7. Russian a:↗; 8. Lao hã:↗; 9. Mandarin Chinese ã:↗; 10. Murrinh-Patha a:↗; 11. ǂĀkhoe Haiǁom hɛ↗; 12. Chintang hã↗; 13. Duna ɛ̃:↗; 14. English hã↗; 15. French ɛ̃:↗; 16. Hungarian hm↗/ha↗; 17. Kri ha:↗; 18. Tzeltal hai↗; 19. Yélî Dnye ɛ̃:↗; 20. Yurakaré æ↗; 21. Lahu hãi [38]; 22. Tai/Lue há↗ [92]; 23. Japanese e↗ [93]; 24. Korean e↗ [94]; 25. German hɛ̃ [95]; 26. Norwegian hæ↗ [96]; 27. Herero e↗ [97]; 28. Kikongo e↗ [98]; 29. Tzotzil e↗ [99]; 30. Bequia Creole ha:↗ [100]; 31. Zapotec aj↗ [101].

Dingemanse, M., Torreira, F., & Enfield, N. J. (2013). Is “Huh?” a universal word? Conversational infrastructure and the convergent evolution of linguistic items. PLOS ONE, 8(11), e78273. doi: 10.1371/journal.pone.0078273