Illustration of how colour space is mapped onto vowel space, based on the findings for more than 1,100 participants in Cuskley, Dingemanse et al. 2019. Red usually goes with back vowels like /a/, while light hues like yellow and green go with front vowels like /i/ and darker hues go with /u/ and /o/. None of this is deterministic: associations vary across people, and the mapping shown here is only one of the most common solutions. Made by MD for the classroom materials in Van Leeuwen & Dingemanse 2022.

Plots of where in a phonetic possibility space different words end up after 10,000 rounds of interaction, across 20 independent simulation runs (each cloud of 100 exemplar dots/triangles represents a single word at round 10,000 of a single simulation run). Blue, yellow, green and orange are regular words; purple is the continuer word. On each independent simulation run, all words are initialised at randomly selected positions in the space. A shows a selection of 6 separate simulation runs chosen for illustrative purposes (showing how regular words end up in different positions); B shows the end-state of all 20 simulation runs overlaid. Parameter settings: (i) minimal effort bias 4 times as strong for the continuer word (G=1250) as for regular vocabulary words (G=5000), and (ii) the bias for reuse of features (i.e. segment-similarity bias) is not applied to the continuer category.
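The starting state described in this caption can be sketched in a few lines of Python. This is only an illustration of the setup, not the simulation code behind the figure: the two-dimensional 0 to 100 space, the Gaussian jitter (`spread`) and the function name `initialise_run` are all assumptions made for the sketch. Only the counts (20 runs, 5 words, 100 exemplars per word) and the random initial positions come from the caption.

```python
import random

# Four regular words plus the continuer word, named after their plot colours.
WORDS = ["blue", "yellow", "green", "orange", "continuer"]


def initialise_run(word_labels, n_exemplars=100, space=(0.0, 100.0),
                   spread=2.0, rng=None):
    """Place each word at a random position and scatter its exemplars there.

    ASSUMPTIONS (not from the source): a 2-D space with both axes running
    from 0 to 100, and Gaussian jitter of width `spread` around each centre.
    """
    if rng is None:
        rng = random.Random()
    lo, hi = space
    clouds = {}
    for label in word_labels:
        # Randomly selected starting position for this word.
        cx, cy = rng.uniform(lo, hi), rng.uniform(lo, hi)
        # One cloud of exemplars per word, clipped to stay inside the space.
        clouds[label] = [
            (min(hi, max(lo, rng.gauss(cx, spread))),
             min(hi, max(lo, rng.gauss(cy, spread))))
            for _ in range(n_exemplars)
        ]
    return clouds


# 20 independent runs, each with its own seed and its own random start.
runs = [initialise_run(WORDS, rng=random.Random(seed)) for seed in range(20)]
```

Each entry of `runs` maps a word label to its 100 exemplar coordinates, which is the shape of data that a scatter plot of one panel would need.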
(A) Candidate continuer forms in 10 unrelated languages, (B) shown in their natural sequential ecology (annotations as in the original data), (C) with spectrograms and pitch traces of representative tokens made using the Parselmouth interface to Praat (Jadoul et al., 2018; Boersma & Weenink, 2013).
Continuers (frequent standalone utterances like mm-hm that people often use in succession) differ in interesting ways from other common elements, like top tokens (the most common words in a corpus) and discontinuers (frequent standalone utterances that people do not produce in successive streaks). A. Length of tokens for continuers, discontinuers and top tokens in 32 languages. B. Frequencies of major sound classes across types. Vowel nuclei occur across types, but continuers stand out for their preference for nasals. C. Random forest analysis of 118 continuer forms in 32 spoken languages showing the top 10 most predictive phonemes (out of 29 attested).
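The distinction between continuers and discontinuers drawn in this caption can be made concrete with a small sketch. It is a toy operationalisation, not the procedure used for the figure: the function name `classify_standalone_forms`, the thresholds `min_count` and `min_streak_share`, and the reading of "in succession" as "in the same speaker's adjacent turns" are all assumptions, and the conversation is invented for illustration.

```python
from collections import Counter, defaultdict


def classify_standalone_forms(turns, min_count=3, min_streak_share=0.5):
    """Label frequent forms as 'continuer' or 'discontinuer'.

    turns: list of (speaker, form) pairs in conversational order.
    ASSUMPTION: a token counts as part of a streak when the same speaker's
    previous or next turn consists of the same form.
    """
    # Collect each speaker's own turns in order.
    by_speaker = defaultdict(list)
    for speaker, form in turns:
        by_speaker[speaker].append(form)

    total, in_streak = Counter(), Counter()
    for forms in by_speaker.values():
        for i, form in enumerate(forms):
            total[form] += 1
            prev_same = i > 0 and forms[i - 1] == form
            next_same = i + 1 < len(forms) and forms[i + 1] == form
            if prev_same or next_same:
                in_streak[form] += 1

    labels = {}
    for form, n in total.items():
        if n < min_count:
            continue  # too rare to count as a frequent form
        share = in_streak[form] / n
        labels[form] = "continuer" if share >= min_streak_share else "discontinuer"
    return labels


# Invented toy conversation: A tells a story, B responds.
b_responses = ["mm-hm", "mm-hm", "oh", "mm-hm", "mm-hm", "oh", "mm-hm", "oh"]
turns = []
for i, response in enumerate(b_responses):
    turns.append(("A", f"story part {i}"))
    turns.append(("B", response))

labels = classify_standalone_forms(turns)
```

In this toy data, four of the five mm-hm tokens sit next to another mm-hm by the same speaker, so the form is labelled a continuer, while oh never repeats in adjacent turns and is labelled a discontinuer. The storyteller's turns each occur once and fall below the frequency threshold.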
Response tokens like English mhmm, uhuhh, yeah or Catalan mm, sí, vale are tricky to study in the wild: their phonetic realizations can be quite different from how they are transcribed. Here we use UMAP, a method for dimensionality reduction used in bioacoustics and other fields, to explore the shape of inventories of response tokens in 16 languages. Every point represents a single response token; the closer two points are the more similar they are acoustically. Spectrograms drawn around the rim of the plots provide a direct view of the acoustic structure of tokens and enable quick sanity checks.
L: The vowel space with colour associations by a synaesthete. R: The same vowels displayed according to tongue position when produced. Visualization: Christine Cuskley & Mark Dingemanse. For an interactive version of this visual, see here.
Pitch trace of a Japanese utterance starting with two tokens of the ideophone zabɯ:n ‘splash’, showing how they are produced in the upper part of the speaker’s pitch range, and how their articulation is drawn out relative to other non-ideophonic elements in the utterance. This illustrates the special treatment that ideophones often get in everyday speech, which makes them stand out from the surrounding material.
Diagram of attested cross-modal mappings to linguistic sound, represented on a typical vowel space. (Figure by first author Gwilym Lockwood.)