Unlocking the role of the superior temporal gyrus for speech sound categorization
The ability to encode, almost effortlessly, the phonemic content of running speech is a remarkable capacity of the human brain. This capacity is underscored by the brain’s maintenance of phonemic stability despite pronounced variability in the spectral and temporal characteristics of a given phoneme and despite a phoneme’s frequent acoustic overlap with other speech sounds. For instance, vowels map onto distinct regions of acoustic space when the second formant frequency (F2) is plotted against the first formant frequency (F1). However, F1 and F2 values for a given vowel vary widely across speakers, and the distributions of F1 and F2 values for one vowel often overlap substantially with those of others (e.g., /æ/ as in “had” and /ɛ/ as in “head”) (Hillenbrand et al. 1995). In short, a whole host of sources, including a dynamic environmental background, increases variability and precludes a reliable mapping of phonemes onto consistently available acoustic cues. Rather than a one-to-one assignment of specific acoustic cues to a given phoneme, the brain’s task must be one of rapid categorization: placing acoustically variable speech sounds into discrete phonemic categories (see Holt and Lotto 2010 for review).
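The overlap problem described above can be made concrete with a toy simulation. The sketch below draws (F1, F2) tokens for two vowel categories from overlapping Gaussian distributions and classifies them by nearest centroid on the raw acoustic cues. All numeric values (means, spread, sample counts) are hypothetical placeholders for illustration, not the Hillenbrand et al. (1995) measurements; the point is only that cue overlap forces some misclassification when no categorization process intervenes.

```python
import random

random.seed(0)

# Hypothetical F1/F2 centroids (Hz) for two vowel categories.
# Illustrative values only, not measured formant data.
MEANS = {"ae": (660.0, 1720.0), "eh": (580.0, 1800.0)}
SPREAD = 80.0  # assumed across-speaker variability (Hz)

def sample_token(vowel):
    """Draw one (F1, F2) token with Gaussian speaker variability."""
    f1, f2 = MEANS[vowel]
    return (random.gauss(f1, SPREAD), random.gauss(f2, SPREAD))

def classify(token):
    """Nearest-centroid classification on raw acoustic cues."""
    return min(MEANS, key=lambda v: (token[0] - MEANS[v][0]) ** 2
                                    + (token[1] - MEANS[v][1]) ** 2)

# Because the two distributions overlap, a purely acoustic
# one-to-one mapping misclassifies a fraction of tokens.
tokens = [(v, sample_token(v)) for v in MEANS for _ in range(500)]
errors = sum(classify(tok) != v for v, tok in tokens)
print(f"error rate on overlapping vowel tokens: {errors / len(tokens):.1%}")
```

A listener, by contrast, resolves such tokens into discrete categories with near-perfect accuracy, which is the gap the categorization account is meant to explain.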
From the Journal of Neurophysiology