ISSN 1671-3710
CN 11-4766/R
Sponsor: Institute of Psychology, Chinese Academy of Sciences
Publisher: Science Press

Advances in Psychological Science, 2024, Vol. 32, Issue (9): 1379-1392. doi: 10.3724/SP.J.1042.2024.01379

• Conceptual Framework •

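Computational modeling and experimental validation of Chinese lexical and semantic processing

LI Xingshan1,2, ZHANG Qiwei1,2, HUANG Linjieqiong1

  1. CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101;
  2. Department of Psychology, University of Chinese Academy of Sciences, Beijing 100039
  • Received: 2023-12-06 Published: 2024-09-15 Online: 2024-06-26
  • Corresponding authors: LI Xingshan, E-mail: lixs@psych.ac.cn; ZHANG Qiwei, E-mail: zhangqw@psych.ac.cn
  • Funding:
    * National Natural Science Foundation of China (32371156)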

Computational modeling and experimental validation of Chinese lexical and semantic processing

LI Xingshan1,2, ZHANG Qiwei1,2, HUANG Linjieqiong1   

  1. CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100101, China;
  2. Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-12-06 Online: 2024-06-26 Published: 2024-09-15

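Abstract: Chinese is a script widely used by Chinese people around the world, and it has distinctive characteristics. Because of its specificity, theories and models of Western languages cannot be applied directly to Chinese. Existing research on Chinese word processing lacks a systematic computational model that simulates lexical semantic processing. This research aims to address these problems through computational modeling and experimental methods. It will systematically review existing research on Chinese word processing, conduct a meta-analysis, and build a model that simulates how Chinese words are processed both in isolation and in sentence context. The model can process single-character and multi-character words, simulates the processing of and interactions among word form, sound, and meaning, and takes the influence of context into account. Finally, experiments will test the model's assumptions. The resulting model of Chinese lexical semantic processing will help clarify the cognitive mechanisms specific to Chinese reading and the dynamic process of word processing.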

Keywords: cognitive simulation, lexical processing, semantic processing, Chinese reading, computational model

Abstract: The Chinese writing system is used by Chinese people worldwide and has many distinctive characteristics. Because of these characteristics, theories and models developed for alphabetic languages cannot be applied directly to Chinese. Research on Chinese word processing still lacks a systematic computational model of lexical and semantic processing. This research proposal focuses on Chinese lexical and semantic processing and aims to develop new theoretical hypotheses through meta-analysis, computational modeling, and experimental validation. Through model simulations, it addresses three theoretical problems: the mechanism of Chinese compound word processing; the interaction among orthographic, phonological, and semantic processing of words; and the impact of contextual cues on word processing during sentence comprehension.
The first study conducts a meta-analysis of prior research on Chinese compound word processing to estimate the effect sizes of morphemes, phonology, and context. The meta-analysis will yield an overall effect size, and a single experimental study whose effect size closely approximates that overall value will be selected as the representative study for subsequent model fitting.
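The selection step described for the first study can be illustrated with a minimal Python sketch. It assumes a fixed-effect, inverse-variance pooled estimate, which the abstract does not specify; the `Study` class, the function names, and all numbers are hypothetical and do not come from any real study.

```python
from dataclasses import dataclass


@dataclass
class Study:
    label: str       # hypothetical study identifier
    effect: float    # standardized effect size (e.g., Hedges' g)
    variance: float  # sampling variance of the effect size


def pooled_effect(studies):
    """Fixed-effect (inverse-variance weighted) mean effect size."""
    weights = [1.0 / s.variance for s in studies]
    weighted_sum = sum(w * s.effect for w, s in zip(weights, studies))
    return weighted_sum / sum(weights)


def representative_study(studies):
    """Return the study whose effect size lies closest to the pooled estimate."""
    overall = pooled_effect(studies)
    return min(studies, key=lambda s: abs(s.effect - overall))


# Made-up numbers for illustration only; not results from any real study.
STUDIES = [
    Study("study_A", 0.20, 0.04),
    Study("study_B", 0.50, 0.02),
    Study("study_C", 0.35, 0.01),
]

if __name__ == "__main__":
    print("pooled effect:", round(pooled_effect(STUDIES), 3))
    print("representative:", representative_study(STUDIES).label)
```

A random-effects estimate could be substituted in `pooled_effect` without changing the selection logic.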
The second study aims to develop a computational model of isolated Chinese word processing from the perspectives of competition between holistic and local representations and of modular processing. The model explores the roles of morphemic semantics and the phonological pathway in lexical semantic processing. It rests on two primary assumptions. First, feedforward and feedback connections are assumed among the orthographic, phonological, and semantic modules, so that both the direct pathway (orthography-to-semantics) and the mediated pathway (orthography-phonology-semantics) are activated during Chinese word processing. Second, during compound word processing, both the embedded single-character words and the whole compound word are activated at the orthographic, phonological, and semantic levels, and these activations compete with one another. Model simulations will track how the activation levels of different nodes change over time, providing a basis for testing these assumptions.
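The two assumptions of the second study can be illustrated with a toy interactive-activation sketch in Python. This is not the authors' model: the update rule is a generic interactive-activation rule, and the node names, parameter values, and input strengths are all assumptions chosen for illustration. One whole-word node and one embedded single-character-word node exist at each of the three levels; excitation flows within a candidate across levels, and inhibition operates between the two candidates within a level.

```python
LEVELS = ("orth", "phon", "sem")
CANDIDATES = ("whole_word", "embedded_word")


def simulate(inputs, steps=30, excite=0.10, inhibit=0.15, decay=0.05):
    """Return a list of activation snapshots, one per time step.

    inputs maps each candidate to its (hypothetical) visual input strength.
    """
    act = {(lvl, c): 0.0 for lvl in LEVELS for c in CANDIDATES}
    history = []
    for _ in range(steps):
        new = {}
        for lvl in LEVELS:
            for c in CANDIDATES:
                rival = CANDIDATES[1 - CANDIDATES.index(c)]
                if lvl == "orth":
                    # Visual input plus feedback from phonology and semantics.
                    net = inputs[c] + excite * (act[("phon", c)] + act[("sem", c)])
                elif lvl == "phon":
                    # Feedforward from orthography plus feedback from semantics.
                    net = excite * (act[("orth", c)] + act[("sem", c)])
                else:
                    # Direct route (orth -> sem) and mediated route (phon -> sem).
                    net = excite * (act[("orth", c)] + act[("phon", c)])
                # Within-level competition between whole word and embedded word.
                net -= inhibit * act[(lvl, rival)]
                a = act[(lvl, c)]
                # Positive input pushes toward 1, negative input toward 0.
                delta = net * (1.0 - a) if net > 0 else net * a
                delta -= decay * a
                new[(lvl, c)] = min(1.0, max(0.0, a + delta))
        act = new  # synchronous update of all nodes
        history.append(dict(act))
    return history


if __name__ == "__main__":
    trace = simulate({"whole_word": 0.30, "embedded_word": 0.20})
    for t in (0, 9, 29):
        snapshot = {k: round(v, 3) for k, v in trace[t].items() if k[0] == "sem"}
        print(f"step {t + 1}: {snapshot}")
```

Printing the semantic-level activations at several time steps gives the kind of activation time course the simulations are meant to examine. Because the two candidates differ only in input strength here, the candidate with the stronger input stays ahead at every level.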
The third study integrates the isolated word processing model developed in Study 2 with the eye movement control module of the Chinese Reading Model (Li & Pollatsek, 2020). It incorporates the effects of phonological and semantic processing and of sentence context, and it introduces new assumptions to construct a model of word semantic processing during sentence reading. The integrated model aims to accurately simulate word processing during sentence reading and its relationship with eye movement control. The new assumptions are that the activation of semantic units primarily determines when a saccade is initiated, and that the semantics of a recognized word can remain the most highly activated and constrain the processing of subsequent words.
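The saccade-timing assumption of the third study can be sketched as a simple threshold rule in Python. The sketch does not reproduce the Chinese Reading Model or its eye movement control module. The accumulation rule, the function name `steps_until_saccade`, and every parameter value are assumptions for illustration, with `context_support` standing in for the constraint exerted by the semantics of previously recognized words.

```python
def steps_until_saccade(bottom_up, context_support=0.0, threshold=0.6,
                        rate=0.1, max_steps=200):
    """Count time steps until semantic activation reaches the saccade threshold.

    Returns None if the threshold is not reached within max_steps.
    """
    sem = 0.0
    for step in range(1, max_steps + 1):
        # Semantic activation grows toward 1, driven by the word's own input
        # and by support from the preceding context.
        drive = bottom_up + context_support
        sem += rate * drive * (1.0 - sem)
        if sem >= threshold:
            return step  # a saccade would be initiated at this step
    return None


if __name__ == "__main__":
    neutral = steps_until_saccade(bottom_up=0.5)
    supported = steps_until_saccade(bottom_up=0.5, context_support=0.3)
    print("neutral context:", neutral, "steps")
    print("supportive context:", supported, "steps")
```

Under these assumptions, a word preceded by a supportive context reaches the threshold in fewer steps, which corresponds to a shorter fixation before the saccade.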
The fourth study comprises two experiments that test the competition between single-character words and whole compound words proposed in the models above. The experiments examine, respectively, potential competition when Chinese compound words are processed in isolation and when they are embedded in sentences. If the models' assumptions are confirmed, the results will provide empirical support for the models and deepen our understanding of Chinese word processing. If the results deviate from the models' predictions, the models will be revised accordingly.
This research leverages the precision, systematicity, and dynamic process descriptions afforded by computational models to advance the understanding of the cognitive processes involved in Chinese word processing. The resulting model can guide experimental research and has theoretical significance. The research also combines empirical experiments with computational modeling: experiments test the model's assumptions and predictions, and their outcomes in turn drive iterative improvements to the model.
In conclusion, by integrating meta-analysis, computational modeling, and experimental studies, the project aims to reveal the dynamic processes involved in Chinese lexical and semantic processing for words presented both in isolation and in sentences. The findings and the models to be constructed will not only reveal the dynamic process of compound word processing but also clarify the Chinese-specific and general cognitive mechanisms of Chinese reading.

Key words: cognitive simulation, lexical processing, semantic processing, Chinese reading, computational modeling
