Blog Archives

Auditory motion direction encoding in auditory cortex and high-level visual cortex

The aim of this functional magnetic resonance imaging (fMRI) study was to identify human brain areas that are sensitive to the direction of auditory motion. Such directional sensitivity was assessed in a hypothesis-free manner by analyzing fMRI response patterns across the entire brain volume using a spherical-searchlight approach. In addition, we assessed directional sensitivity in three predefined brain areas that have been associated with auditory motion perception in previous neuroimaging studies. These were the primary auditory cortex, the planum temporale and the visual motion complex (hMT/V5+). Our whole-brain analysis revealed that the direction of sound-source movement could be decoded from fMRI response patterns in the right auditory cortex and in a high-level visual area located in the right lateral occipital cortex. Our region-of-interest-based analysis showed that the decoding of the direction of auditory motion was most reliable with activation patterns of the left and right planum temporale. Auditory motion direction could not be decoded from activation patterns in hMT/V5+. These findings provide further evidence for the planum temporale playing a central role in supporting auditory motion perception. In addition, our findings suggest a cross-modal transfer of directional information to high-level visual cortex in healthy humans.

from Human Brain Mapping
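
The whole-brain analysis above rests on searchlight decoding, in which a classifier is trained and cross-validated on the response pattern within a small sphere centred on each voxel in turn, producing a map of decoding accuracy. The sketch below illustrates the idea on synthetic data; the grid size, sphere radius, linear SVM and five-fold cross-validation are illustrative assumptions, not the authors' actual pipeline.

# A minimal sketch of spherical-searchlight decoding on synthetic data.
# Grid size, sphere radius, the linear SVM and 5-fold cross-validation are
# illustrative assumptions, not the pipeline used in the study above.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic "fMRI" data: one response pattern per trial over a small voxel
# grid, labelled by auditory motion direction (0 = leftward, 1 = rightward).
n_trials, grid = 40, (8, 8, 8)
X = rng.normal(size=(n_trials,) + grid)
y = np.repeat([0, 1], n_trials // 2)
X[y == 1, 3:5, 3:5, 3:5] += 0.8  # weak direction-dependent signal in one region

def searchlight_accuracy(X, y, radius=2):
    """Cross-validated decoding accuracy for a sphere centred on each voxel."""
    coords = np.indices(X.shape[1:]).reshape(3, -1).T
    flat = X.reshape(len(X), -1)
    scores = np.zeros(X.shape[1:])
    cv = StratifiedKFold(n_splits=5)
    for centre in coords:
        inside = np.linalg.norm(coords - centre, axis=1) <= radius
        scores[tuple(centre)] = cross_val_score(
            SVC(kernel="linear"), flat[:, inside], y, cv=cv).mean()
    return scores

acc_map = searchlight_accuracy(X, y)
print("peak decoding accuracy:", acc_map.max())  # above chance near the planted signal

An accuracy map of this kind, thresholded against chance level across participants, is the sort of output on which whole-brain searchlight conclusions rest.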

Functional activation for imitation of seen and heard speech

This study examined fMRI activation when perceivers either passively observed, or observed and imitated, matched or mismatched (“McGurk”) audiovisual speech stimuli. Greater activation was observed in the inferior frontal gyrus (IFG) for imitation than for perception of audiovisual speech overall, and for imitation of the mismatched McGurk-type stimuli than of the matched audiovisual stimuli. This unique IFG activation during imitation of incongruent audiovisual speech may reflect direct matching of incongruent auditory and visual stimuli or conflict between category responses. The study provides novel data on the neurobiology underlying the imitation and integration of audiovisual speech.

from the Journal of Neurolinguistics

The development of multisensory speech perception continues into the late childhood years

Observing a speaker’s articulations substantially improves the intelligibility of spoken speech, especially under noisy listening conditions. This multisensory integration of speech inputs is crucial to effective communication. Appropriate development of this ability has major implications for children in classroom and social settings, and deficits in it have been linked to a number of neurodevelopmental disorders, especially autism. It is clear from structural imaging studies that there is a prolonged maturational course within regions of the perisylvian cortex that persists into late childhood, and these regions have been firmly established as being crucial to speech and language functions. Given this protracted maturational timeframe, we reasoned that multisensory speech processing might well show a similarly protracted developmental course. Previous work in adults has shown that audiovisual enhancement in word recognition is most apparent within a restricted range of signal-to-noise ratios (SNRs). Here, we investigated when these properties emerge during childhood by testing multisensory speech recognition abilities in typically developing children aged between 5 and 14 years, and comparing them with those of adults. By parametrically varying SNRs, we found that children benefited significantly less from observing visual articulations, displaying considerably less audiovisual enhancement. The findings suggest that improvement in the ability to recognize speech-in-noise and in audiovisual integration during speech perception continues quite late into the childhood years. The implication is that a considerable amount of multisensory learning remains to be achieved during the later schooling years, and that explicit efforts to accommodate this learning may well be warranted.

from the European Journal of Neuroscience
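
The audiovisual enhancement discussed above is typically expressed as the gain from adding visual articulations relative to the auditory-only baseline at each SNR. The sketch below uses one commonly used proportional-gain measure, (AV − A) / (1 − A); the formula and every number are placeholders for illustration, not the measure or the data reported in the study.

# A minimal sketch of quantifying audiovisual (AV) enhancement across SNRs.
# The gain formula and all numbers below are illustrative placeholders,
# not the measure or the data from the study summarized above.
import numpy as np

snrs_db = np.array([-16, -12, -8, -4, 0])             # hypothetical SNR levels

# Hypothetical proportion-correct word recognition per condition.
audio_only  = np.array([0.10, 0.30, 0.55, 0.75, 0.90])
audiovisual = np.array([0.35, 0.60, 0.80, 0.88, 0.93])

# Proportional AV gain: improvement relative to the room left for improvement.
av_gain = (audiovisual - audio_only) / (1.0 - audio_only)

for snr, gain in zip(snrs_db, av_gain):
    print(f"SNR {int(snr):+d} dB: AV gain = {gain:.2f}")

On a curve like this, the developmental result above would appear as children's gain peaking lower, and over a narrower band of SNRs, than adults'.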

Behavioral semantics of learning and crossmodal processing in auditory cortex: The semantic processor concept

Two phenomena of auditory cortex activity have recently attracted attention, namely that the primary field can show different types of learning-related changes of sound representation and that, during learning, even this early auditory cortex is under strong multimodal influence. Based on neuronal recordings in animal auditory cortex during instrumental tasks, in this review we put forward the hypothesis that these two phenomena serve to derive the task-specific meaning of sounds by associative learning. To understand the implications of this tenet, it is helpful to consider how a behavioral meaning is usually derived for novel environmental sounds. For this purpose, associations with other sensory, e.g. visual, information are mandatory to develop a connection between a sound and its behaviorally relevant cause and/or the context of its occurrence. This makes it plausible that in instrumental tasks various non-auditory sensory and procedural contingencies of sound generation become co-represented by neuronal firing in auditory cortex. Information related to reward or to avoidance of discomfort during task learning, which is essentially non-auditory, is also co-represented. The reinforcement influence points to the dopaminergic internal reward system, whose local role in memory consolidation in auditory cortex is well established. Thus, during a trial of task performance, the neuronal responses to the sounds are embedded in a sequence of representations of such non-auditory information. These embedded auditory responses show task-related modulations that fall into types corresponding to three basic logical classifications that can be performed on a perceptual item: simple detection, discrimination, and categorization. This hierarchy of classifications determines the semantic “same-different” relationships among sounds. Different cognitive classifications appear to be a consequence of the design of a learning task and lead to the recruitment of different excitatory and inhibitory mechanisms and to distinct spatiotemporal metrics of map activation to represent a sound.

from Hearing Research

Deviant processing of letters and speech sounds as proximate cause of reading failure: a functional magnetic resonance imaging study of dyslexic children

Learning to associate auditory information of speech sounds with visual information of letters is a first and critical step for becoming a skilled reader in alphabetic languages. Nevertheless, it remains largely unknown which brain areas subserve the learning and automation of such associations. Here, we employ functional magnetic resonance imaging to study letter–speech sound integration in children with and without developmental dyslexia. The results demonstrate that dyslexic children show reduced neural integration of letters and speech sounds in the planum temporale/Heschl sulcus and the superior temporal sulcus. While cortical responses to speech sounds in fluent readers were modulated by letter–speech sound congruency, with strong suppression effects for incongruent letters, no such modulation was observed in the dyslexic readers. Whole-brain analyses of unisensory visual and auditory group differences additionally revealed reduced unisensory responses to letters in the fusiform gyrus in dyslexic children, as well as reduced activity for processing speech sounds in the anterior superior temporal gyrus, planum temporale/Heschl sulcus and superior temporal sulcus. Importantly, the neural integration of letters and speech sounds in the planum temporale/Heschl sulcus and the neural response to letters in the fusiform gyrus explained almost 40% of the variance in individual reading performance. These findings indicate that an interrelated network of visual, auditory and heteromodal brain areas contributes to the skilled use of letter–speech sound associations necessary for learning to read. In extending similar findings in adults, the data furthermore argue against the notion that reduced neural integration of letters and speech sounds in dyslexia reflects the consequence of a lifetime of reading struggle. Instead, they support the view that letter–speech sound integration is an emergent property of learning to read that develops inadequately in dyslexic readers, presumably as a result of a deviant interactive specialization of neural systems for processing auditory and visual linguistic inputs.

from Brain

When hearing the bark helps to identify the dog: Semantically-congruent sounds modulate the identification of masked pictures

We report a series of experiments designed to assess the effect of audiovisual semantic congruency on the identification of visually-presented pictures. Participants made unspeeded identification responses concerning a series of briefly-presented, and then rapidly-masked, pictures. A naturalistic sound was sometimes presented together with the picture at a stimulus onset asynchrony (SOA) that varied between 0 and 533 ms (auditory lagging). The sound could be semantically congruent, semantically incongruent, or else neutral (white noise) with respect to the target picture. The results showed that when the onset of the picture and sound occurred simultaneously, a semantically-congruent sound improved, whereas a semantically-incongruent sound impaired, participants’ picture identification performance, as compared to performance in the white-noise control condition. A significant facilitatory effect was also observed at SOAs of around 300 ms, whereas no such semantic congruency effects were observed at the longest interval (533 ms). These results therefore suggest that the neural representations associated with visual and auditory stimuli can interact in a shared semantic system. Furthermore, this crossmodal semantic interaction is not constrained by the need for the strict temporal coincidence of the constituent auditory and visual stimuli. We therefore suggest that audiovisual semantic interactions likely occur in a short-term buffer which rapidly accesses, and temporarily retains, the semantic representations of multisensory stimuli in order to form a coherent multisensory object representation. These results are explained in terms of Potter’s (1993) notion of conceptual short-term memory.

from Cognition

Interaction of speech and script in human auditory cortex: Insights from neuro-imaging and effective connectivity

In addition to visual information from the face of the speaker, a less natural but nowadays extremely important visual component of speech is its representation in script. This review examines neuro-imaging studies that aimed to understand how speech and script are associated in the adult “literate” brain. The reviewed studies focused on the role of different stimulus and task factors and on effective connectivity between brain regions. The studies are summarized in terms of a neural mechanism for the integration of speech and script that can serve as a basis for future studies addressing (the failure of) literacy acquisition. In this proposed mechanism, speech sound processing in auditory cortex is modulated by co-presented visual letters, depending on the congruency of the letter–sound pairs. Other factors of influence are temporal correspondence, input quality and task instruction. We present results showing that the modulation of auditory cortex is most likely mediated by feedback from heteromodal areas in the superior temporal cortex, although direct influences from visual cortex are not excluded. The influence of script on speech sound processing occurs automatically and continues to develop over an extended period during reading acquisition. The review concludes with suggestions for addressing the questions that remain open, so as to move closer to understanding the neural basis of normal and impaired literacy.

from Hearing Research

The influence of visual and auditory information on the perception of speech and non-speech oral movements in patients with left hemisphere lesions

Patients with lesions of the left hemisphere often suffer from oral-facial apraxia, apraxia of speech, and aphasia. In these patients, visual features often play a critical role in speech and language therapy, when pictured lip shapes or the therapist’s visible mouth movements are used to facilitate speech production and articulation. This places demands on audiovisual processing both in speech and language treatment and in the diagnosis of oral-facial apraxia. The purpose of this study was to investigate differences in the audiovisual perception of speech as compared to non-speech oral gestures. Bimodal and unimodal speech and non-speech items were used, and discordant (mismatched audiovisual) stimuli were additionally constructed; all items were presented for imitation. The study examined a group of healthy volunteers and a group of patients with lesions of the left hemisphere. Patients made substantially more errors than controls, but the factors influencing imitation accuracy were largely the same in both groups. Error analyses in both groups suggested different types of representations for the speech as compared to the non-speech domain, with speech relying more strongly on the auditory modality and non-speech processing on the visual modality. Additionally, the study showed that the McGurk effect is not limited to speech.

from Clinical Linguistics and Phonetics

Speech Perception as a Multimodal Phenomenon

Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal speech information could explain the reported automaticity, immediacy, and completeness of audiovisual speech integration. However, recent findings suggest that speech integration can be influenced by higher cognitive properties such as lexical status and semantic context. Proponents of amodal accounts will need to explain these results.

from Current Directions in Psychological Science