Brain Mechanisms Underlying Visual Speech Priming-Induced Unmasking of Speech Recognition

Wu Chao (吴超); Li Liang (李量)

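  1. Department of Psychology and Beijing Key Laboratory of Behavior and Mental Health, Peking University
    Room 1318, Wang Kezhen Building, 52 Haidian Road, Haidian District, Beijing 100080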
  • Online: 2016-12-31 Published: 2016-12-31


PURPOSE: Under noisy “cocktail-party” conditions with multiple people talking, listeners often ask the talker to repeat the attended sentence so that they can recognize it better. This “say-it-again” improvement in speech recognition is partially caused by visual-speech priming (VSP), which relies on working memory of the lipreading signals of previously presented sentences. This study aimed to identify the brain substrates most specific to the unmasking effect of VSP.
METHODS: Functional magnetic resonance imaging (fMRI) and psychoacoustic methods were used to investigate the brain substrates underlying the unmasking effect of VSP in 16 healthy subjects.
RESULTS: Two categories of brain regions were activated by the listening condition with VSP: (1) unmasking-correlated regions, including the left inferior temporal gyrus (ITG), left middle cingulate cortex, and right putamen, whose VSP-induced activation was significantly correlated with the VSP-induced improvement in target-speech recognition; and (2) regions whose activation was not correlated with the improvement in speech recognition, including the right ITG, right fusiform cortex, and right caudate. The brain structures with VSP-related functional connectivity to the unmasking-correlated regions were those involved in semantic priming, speech production, or irrelevant-signal suppression, including the right pars triangularis of the inferior frontal gyrus (IFG), left pars orbitalis of the IFG, left putamen, right caudate, left and right insular cortices, and right anterior cingulate cortex.
CONCLUSIONS: The speech-unmasking effect of VSP is based on the functional integration of two brain networks: (1) an unmasking-correlated network that specifically mediates perceptual integration and motor representation of visual-speech and auditory-speech signals, and (2) an unmasking-nonspecific network that generally suppresses irrelevant masking-speech signals. This two-network strategy for speech unmasking may be generally applicable to scene analysis under noisy “cocktail-party” conditions.

Key words: lipreading, unmasking, inferior temporal gyrus