A global comparative study of 8 languages on 5 continents finds that people overwhelmingly like to help one another, independent of differences in language, culture or environment. This is a surprising finding from the perspective of anthropological and economic research, which has tended to foreground differences in how people work together and share resources.
A. Overview of included languages with dataset size in hours and top 3 sequentially identified response tokens as transcribed in the corpus. B. Location of largest speech community. C. Assessing the impact of sparse data on UMAP projections using three samples of Dutch response tokens. A look at the full dataset (a) and randomly sampled subsets of decreasing size (b, c) suggests isomorphism across scales and interpretability of clustering solutions for samples as small as 150 tokens.
Under the auspices of various language documentation projects, language resources have been collected in more and more communities across the world, and these often include at least some conversational data. Such corpora harbour important insights for language science and technology. This map plots >60 corpora found to offer at least some conversational data.
A. Across the Indo-European language family, the proportion of rough words with /r/ is much higher than the proportion of smooth words with /r/. B. Each dot represents a language (circle size = number of words); whiskers show 95% Bayesian credible intervals from the mixed-effects Bayesian logistic regression analysis, which indicates that rough words have a much higher proportion of /r/ (posterior mean = 63%) than smooth words (posterior mean = 35%).
Map accompanying news coverage of our study of the link between /r/ and rough textures. The red data points represent languages that often feature /r/ in words for rough textures but not in words for smooth textures. Blue data points, much rarer, are cases where the pattern is reversed. The map shows that languages overwhelmingly prefer to express rough meanings with /r/ sounds (if they have them).
Languages (and researchers) contributing to a large comparative study of the differential coding of perception across cultures. Locations indicate field sites where data were collected.
A word like huh? (used to initiate repair when, for example, one has not clearly heard what someone just said) is found in roughly the same form and function in conversational corpora from 31 spoken languages from across the globe. The ten in bold are examined in phonetic detail and found to be about as similar to each other as variants of the word dog across English varieties. Languages 11–20 are from [14], 21–31 from sources cited. Locations are approximate. 1. Cha‘palaa ʔa:↘ 2. Icelandic ha 3. Spanish e↗ 4. Siwu ã:↗ 5. Dutch h↗ 6. Italian ε:↗ 7. Russian a:↗ 8. Lao hã:↗ 9. Mandarin Chinese ã:↗ 10. Murrinh-Patha a:↗ 11. ‡Âkhoe Hai//om hε↗ 12. Chintang hã↗ 13. Duna ɛ̃:↗ 14. English hã↗ 15. French ɛ̃:↗ 16. Hungarian hm↗/ha↗ 17. Kri ha:↗ 18. Tzeltal hai↗ 19. Yélî Dnye ɛ̃:↗ 20. Yurakaré æ↗ 21. Lahu hãi [38] 22. Tai/Lue há↗ [92] 23. Japanese e↗ [93] 24. Korean e↗ [94] 25. German hɛ̃ [95] 26. Norwegian hæ↗ [96] 27. Herero e↗ [97] 28. Kikongo e↗ [98] 29. Tzotzil e↗ [99] 30. Bequia Creole ha:↗ [100] 31. Zapotec aj↗ [101].