ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science ›› 2025, Vol. 33 ›› Issue (12): 2168-2181. doi: 10.3724/SP.J.1042.2025.2168 cstr: 32111.14.2025.2168

• Research Frontiers •

Temporal prediction during turn-taking

SONG Qingyi1,2, JIANG Xiaoming1,2

  1. Institute of Language Sciences, Shanghai International Studies University
  2. Key Laboratory of Language Science and Multilingual Artificial Intelligence, Shanghai International Studies University, Shanghai 201620, China
  • Received: 2025-03-19 Published: 2025-12-15 Online: 2025-10-27
  • Corresponding author: JIANG Xiaoming, E-mail: xiaoming.jiang@shisu.edu.cn
  • Funding:
    General Program of the National Natural Science Foundation of China (32471109) and the Supervisor Academic Guidance Program of Shanghai International Studies University (2023DSYL011)

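Abstract (translated from the Chinese):

Turn-taking is the process by which two interlocutors alternate speaking in a conversation; to communicate efficiently, listeners must predict when the speaker's turn will end. Although previous studies have examined the speech cues that influence turn-end prediction, the specific mechanisms remain unclear. Within the framework of temporal prediction, this paper reviews the various cues available for predicting turn ends and proposes a temporal prediction model of turn-taking. A turn has both a formal structure and a temporal structure. Lexico-syntactic information, as the formal structure, can modulate neural oscillations through cortico-thalamic feedback and direct attention toward the turn end. Prosodic cues provide the temporal structure. The brain extracts rhythm through an interval synchronization mechanism and synchronizes it with neural oscillations to represent time. Salient changes in utterance-final prosodic features, by contrast, are represented in an event-based manner and conveyed rapidly to the cortex via the cerebellum, setting the dual-pathway system to "predictive mode". Visual cues such as gestures can play a similar role. Finally, the paper identifies limitations of current research and proposes future directions: 1) testing the roles of different cues within the dual pathways; 2) examining how individual temporal prediction ability affects turn-taking; and 3) investigating the relative weighting of lexical/syntactic and prosodic cues in turn-end prediction, among others.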

Abstract:

Turn-taking involves the rapid alternation of speakers, with gaps between turns averaging ~200 ms. However, producing even a single word requires at least 600 ms, suggesting that speakers must prepare their responses in advance by predicting turn endings. While numerous studies have identified cues that facilitate turn-end prediction, the underlying neural mechanisms remain unclear. During turn-taking, listeners need to predict when the current speaker will finish their turn. Turn-end prediction therefore depends, in essence, on listeners’ temporal prediction ability. We first review studies of the neural mechanisms of time processing and prediction. Research on neural timing mechanisms distinguishes between millisecond timing, which governs event-based processing, and interval timing, which tracks longer durations (seconds to minutes). Both kinds of temporal processing are related to language processing. In a proposed dual-pathway system, temporal processing is carried out along two pathways: rapid cerebellar transmission supports event-based, discrete temporal processing, while the basal ganglia and the striato-thalamo-cortical circuit support interval-based, continuous temporal processing. The dual-pathway architecture describes how the brain processes temporal information, providing theoretical support for the neural basis of temporal prediction. However, because the model does not specify which signals enter the timing system, it remains unclear which speech cues are used to predict turn endings in conversation, and how.

In the next part, we review the cues that listeners can use to predict turn ends. Lexico-syntactic information plays a well-established role, and despite some debate, prosodic features, particularly those at utterance-final positions, are widely recognized as predictive. Additionally, non-linguistic cues such as gaze and nodding contribute to turn-end anticipation. Although previous studies have confirmed that listeners can use these cues to predict turn ends, they did not specify the exact role of each cue. To address this gap, we propose a temporal prediction model for turn-taking. In this model, lexico-syntactic information is transmitted linearly to the cortex via ascending pathways, where it is mapped onto meaning. Predictions based on lexico-syntactic information then adjust neural oscillations through cortico-thalamic feedback to anticipate turn endings. Meanwhile, prosodic cues are rapidly processed via the cerebellar pathway, directing cortical attention to the incoming speech signal and setting the dual-pathway system to “predictive mode”. This model integrates different types of cues for turn-end prediction and suggests a possible predictive mechanism. Within this framework, we summarize the limitations of existing research and propose further directions. Future studies should 1) examine the role of different cues within the dual-pathway architecture in predicting turn ends; 2) investigate the relationship between individual temporal prediction ability and turn-taking performance; 3) use M/EEG techniques with high temporal resolution to study the relative weighting of lexical/syntactic and prosodic cues in turn-end prediction; and 4) use free production paradigms to explore the multiple cognitive processes involved in turn-taking, including comprehension, content preparation, turn-end prediction, and actual speech production.

Key words: turn-taking, temporal prediction, dual-pathway model, turn-end prediction
