A global comparative study of 8 languages on 5 continents finds that people overwhelmingly like to help one another, independent of differences in language, culture or environment. This is a surprising finding from the perspective of anthropological and economic research, which has tended to foreground differences in how people work together and share resources.

Rossi, G., Dingemanse, M., Floyd, S., Baranova, J., Blythe, J., Kendrick, K. H., Zinken, J., & Enfield, N. J. (2023). Shared cross-cultural principles underlie human prosocial behavior at the smallest scale. Scientific Reports, 13(1), 6057.

Sampling response tokens

A. Overview of included languages with dataset size in hours and the top 3 sequentially identified response tokens as transcribed in the corpus. B. Location of the largest speech community. C. Assessing the impact of sparse data on UMAP projections using three samples of Dutch response tokens. Comparing the full dataset (a) with randomly sampled subsets of decreasing size (b, c) suggests isomorphism across scales and interpretable clustering solutions for samples as small as 150 tokens.

Liesenfeld, A., & Dingemanse, M. (2022). Bottom-up discovery of structure and variation in response tokens (‘backchannels’) across diverse languages. Proceedings of Interspeech 2022, 1126–1130.

A survey of conversational corpora

Under the auspices of various language documentation projects, language resources have been collected in more and more communities across the world, and these often include at least some conversational data. Such corpora harbour important insights for language science and technology. This map plots >60 corpora found to offer at least some conversational data.

Liesenfeld, A., & Dingemanse, M. (2022). Building and curating conversational corpora for diversity-aware language science and technology. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), 1178–1192.

/r/ for rough in Indo-European

A. Across the Indo-European language family, the proportion of rough words with /r/ is much higher than the proportion of smooth words with /r/. B. Each dot represents a language (size of the circle = number of words); whiskers show 95% Bayesian credible intervals from the mixed-effects Bayesian logistic regression analysis, which indicates that rough words have a much higher proportion of /r/ (posterior mean = 63%) than smooth words (posterior mean = 35%).

Winter, B., Sóskuthy, M., Perlman, M., & Dingemanse, M. (2022). Trilled /r/ is associated with roughness, linking sound and touch across spoken languages. Scientific Reports, 12(1), 1035.

Rolling /r/ around the world

Map accompanying news coverage of our study of the link between /r/ and rough textures. The red data points represent languages that often feature /r/ in words for rough textures but not in words for smooth textures. Blue data points, which are much rarer, are cases where the pattern is reversed. The map shows that languages overwhelmingly prefer to express rough meanings with /r/ sounds (if they have them).

Winter, B., Sóskuthy, M., Perlman, M., & Dingemanse, M. (2022). Trilled /r/ is associated with roughness, linking sound and touch across spoken languages. Scientific Reports, 12(1), 1035.

Language of perception

Languages (and researchers) contributing to a large comparative study of the differential coding of perception across cultures. Locations indicate field sites where data were collected.

Majid, A., Roberts, S. G., Cilissen, L., Emmorey, K., Nicodemus, B., O’Grady, L., Woll, B., LeLan, B., de Sousa, H., Cansler, B. L., Shayan, S., de Vos, C., Senft, G., Enfield, N. J., Razak, R. A., Fedden, S., Tufvesson, S., Dingemanse, M., Ozturk, O., … Levinson, S. C. (2018). Differential coding of perception in the world’s languages. Proceedings of the National Academy of Sciences, 115(45), 11369–11376.

‘Huh?’ around the world

A word like huh? (used to initiate repair when, for example, one has not clearly heard what someone just said) is found in roughly the same form and function in conversational corpora from 31 spoken languages from across the globe. The ten in bold are examined in phonetic detail and found to be about as similar to each other as variants of the word dog across English varieties. Languages 11–20 are from [14], 21–31 from sources cited. Locations are approximate. 1. Cha'palaa 2. Icelandic ha 3. Spanish e↗ 4. Siwu ã:↗ 5. Dutch h↗ 6. Italian ε:↗ 7. Russian a:↗ 8. Lao hã:↗ 9. Mandarin Chinese ã:↗ 10. Murrinh-Patha a:↗ 11. ǂĀkhoe Haiǁom hε↗ 12. Chintang hã↗ 13. Duna 14. English hã↗ 15. French 16. Hungarian hm↗/ha↗ 17. Kri ha:↗ 18. Tzeltal hai↗ 19. Yélî Dnye 20. Yurakaré æ↗ 21. Lahu hãi [38] 22. Tai/Lue hy ˘↗/há↗ [92] 23. Japanese e↗ [93] 24. Korean e↗ [94] 25. German h [95] 26. Norwegian h↗ [96] 27. Herero e↗ [97] 28. Kikongo e↗ [98] 29. Tzotzil e↗ [99] 30. Bequia Creole ha:↗ [100] 31. Zapotec aj↗ [101].

Dingemanse, M., Torreira, F., & Enfield, N. J. (2013). Is “Huh?” a universal word? Conversational infrastructure and the convergent evolution of linguistic items. PLOS ONE, 8(11), e78273.