The role of rhythm in auditory speech understanding

CHEN Liangjie, LIU Lei, GE Zhongshu, YANG Xiaodong, LI Liang

  1. School of Psychological and Cognitive Sciences, Peking University, Beijing 100080, China
  • Received: 2021-07-07 Online: 2022-08-15 Published: 2022-06-23
  • Contact: LI Liang


Research on rhythm has long focused on sensory and perceptual processing and has largely overlooked its role in speech comprehension. Spoken language, an important channel of information exchange in human society, has rich rhythmic characteristics, and understanding it requires listeners to receive external speech input and extract its meaning. In daily communication, auditory speech comprehension is influenced by rhythmic information at multiple scales. Three common types of external rhythm are described here. Prosodic structure rhythm can affect the intelligibility of auditory speech and help listeners parse sentence structure in ambiguous contexts. Context rhythm changes the listener's judgment of the number of words and affects the recognition of vowels and consonants within words. Body language rhythm can alter the perceived position of stress and restore speech intelligibility. The influence of external rhythm on auditory speech comprehension is found across a wide range of auditory and non-auditory stimuli, which help listeners understand the content of the speaker's speech. The process by which the listener's brain uses external rhythms to promote or alter speech comprehension is thought to be related to internal rhythms. Internal rhythms are neural oscillations, which can represent the hierarchical characteristics of external speech input at different time scales and tend to be coupled with one another. The gradual convergence of internal and external rhythms as external rhythmic stimuli are received is called neural entrainment. Neural entrainment between external rhythmic stimuli and internal neural activity can optimize the brain's processing of speech stimuli and help extract discrete language components from continuous sound signals. Notably, neural entrainment does not merely follow external rhythmic information passively; it is also actively modulated by the listener.
During speech comprehension, top-down modulation may derive from cognitive processes such as the listener's selective attention, prior knowledge of grammar, and expectation, all of which can simultaneously affect neural entrainment. When the listener selectively attends to one speech stream, the higher-order neural responses of brain regions to the unattended speech stream are weakened or eliminated. Neural entrainment to the components that the listener expects can enhance speech representation and processing. Neural entrainment also relies on the listener's prior knowledge to integrate linguistic components across different levels or across brain areas. These active modulations make the key information in speech more likely to arrive when neuronal population activity is at its optimal excitability level, so that it receives more processing resources. We propose that neural entrainment may be the key mechanism that links internal and external rhythms and allows them to jointly affect speech comprehension. Finally, the discovery of internal and external rhythms and their related mechanisms provides a research window for understanding speech, a complex sequence with structural rules at multiple time scales.

Key words: rhythm, speech understanding, neural oscillation, neural entrainment, top-down modulation

