ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science ›› 2025, Vol. 33 ›› Issue (12): 2027-2042. doi: 10.3724/SP.J.1042.2025.2027 cstr: 32111.14.2025.2027

• Academic Papers from the 27th Annual Meeting of the China Association for Science and Technology •

Artificial theory of mind in large language models: Evidence, conceptualization, and challenges

DU Chuanchen1, ZHENG Yuanxia1,2, GUO Qianqian1, LIU Guoxiong1

  1. School of Psychology, Nanjing Normal University, Nanjing 210097, China
    2. College of Teacher Education, Ningbo University, Ningbo 315211, China
  • Received: 2025-05-10    Online: 2025-12-15    Published: 2025-10-27
  • Corresponding author: LIU Guoxiong, E-mail: 17219367@qq.com
  • Supported by:
    National Key R&D Program of China (2023YFC3341301)

Abstract:

The traditional view holds that theory of mind is a social-cognitive capacity unique to conscious living beings. However, rapidly advancing large language models can now solve a wide range of theory of mind tasks, sparking intense controversy over whether they possess theory of mind. The artificial theory of mind of large language models resembles human theory of mind in performance but differs in its internal processes. This paper first systematically reviews research on artificial theory of mind in terms of the objects of evaluation and the characteristics of the tasks; by jointly analyzing GPT-4's high pass rates on theory of mind tasks and the internal and external factors that limit its performance, it concludes that current models' theory of mind performance is similar to that of humans. Second, by comparing in depth the neural bases and developmental factors supporting theory of mind and artificial theory of mind, it reveals essential differences in their internal processes and accordingly refines the definition of artificial theory of mind. Future research should focus on developing and adopting standardized evaluation protocols, investigating the mechanisms of artificial theory of mind within a common theory-of-mind framework, and aligning artificial theory of mind with human theory of mind.


In recent years, the rapid development of artificial intelligence (AI) has continually reshaped our understanding of its capability boundaries. Evaluating the theory of mind capabilities of large language models (LLMs) has received significant attention within the research community. Recent studies suggest that LLMs can successfully complete tasks traditionally used to assess theory of mind in humans. However, controversial questions remain: Do LLMs possess theory of mind? What are the essential differences between artificial theory of mind and human theory of mind? To address these questions, this systematic review synthesizes the performance of LLMs on theory of mind tasks, reveals essential differences in the internal processes underlying human and artificial theory of mind, refines the conceptual definition of artificial theory of mind, and outlines key challenges in this field.

Specifically, we systematically synthesize research on artificial theory of mind along two dimensions: the objects of evaluation and the characteristics of the tasks. Following the developmental sequence of core theory of mind sub-abilities in humans, we evaluate GPT-4's task performance. The results indicate that GPT-4 consistently passes false-belief tasks and various higher-order theory of mind tasks, suggesting a simulation of human-like theory of mind performance. Nevertheless, because accuracy on behavioral tasks may be insufficient to reflect true capability, we specifically examine cases in which model performance is limited. Our analysis identifies two classes of limiting factors: external factors (e.g., limitations in test items, prompt design, and human baselines) and intrinsic limitations of LLMs (e.g., hallucinations, hyperconservatism, commonsense errors, heuristics or spurious correlations, and spurious causal inference). This suggests that performance fluctuations do not necessarily stem from a lack of artificial theory of mind. Consequently, by integrating GPT-4's high accuracy with the intrinsic and extrinsic factors limiting its performance, we demonstrate that GPT-4 has developed an artificial theory of mind that is similar in performance to human theory of mind.
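To make this evaluation paradigm concrete, the Python sketch below shows how a classic unexpected-transfer false-belief item might be posed to an LLM and scored over repeated trials. It is a minimal illustration under our own assumptions: the Sally-Anne-style vignette, the keyword-based scoring rule, and the query_model stub are hypothetical stand-ins for exposition, not the task battery, prompts, or scoring procedures used in the studies reviewed here.

from typing import Callable

# Illustrative unexpected-transfer (Sally-Anne-style) false-belief item;
# not an item from the reviewed benchmarks.
FALSE_BELIEF_ITEM = (
    "Sally puts her marble in the basket and leaves the room. "
    "While she is away, Anne moves the marble to the box. "
    "Sally comes back. Where will Sally look for her marble first? "
    "Answer with one word: basket or box."
)

def score_false_belief(answer: str) -> bool:
    """Pass iff the model predicts the agent acts on her (false) belief."""
    answer = answer.strip().lower()
    return "basket" in answer and "box" not in answer

def run_item(query_model: Callable[[str], str], n_trials: int = 10) -> float:
    """Return the pass rate over repeated trials, since LLM output can vary."""
    passes = sum(score_false_belief(query_model(FALSE_BELIEF_ITEM))
                 for _ in range(n_trials))
    return passes / n_trials

if __name__ == "__main__":
    # Stub standing in for a real LLM call (e.g., an API client).
    stub = lambda prompt: "basket"
    print(f"pass rate: {run_item(stub):.0%}")  # 100% for this fixed stub

Reporting a pass rate over repeated trials, rather than a single response, reflects the point made above: prompt wording and response variability can swing results, so single-shot accuracy may misstate a model's capability.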

The advanced capabilities of LLMs underscore the importance of studying the internal processes of theory of mind. Multiple lines of evidence suggest fundamental differences in the internal processes of human theory of mind and artificial theory of mind. To fully distinguish the two, we compare the neural foundations and developmental factors supporting each. We highlight that their neural mechanisms differ in complexity across multiple dimensions, and that the factors influencing the acquisition and development of theory of mind differ between humans and LLMs. This reveals the root cause of the differences in their internal processes. On this basis, we define artificial theory of mind as the simulation of human theory of mind performance that LLMs exhibit during text generation, achieved by recognizing and matching statistical patterns in text after receiving a theory of mind task prompt.

However, as an emerging field, existing research faces three main challenges: a lack of standardized evaluation protocols, unclear mechanisms of artificial theory of mind, and theory of mind alignment issues. Regarding evaluation, differences in experimental design, prompt design, and scoring methods can lead to divergent results. Regarding mechanisms, while internal processes are key to distinguishing theory of mind from artificial theory of mind, current research has neither fully elucidated these processes nor sufficiently addressed how humans attribute mental states to LLMs. Regarding alignment, LLMs simulate human theory of mind performance without achieving genuine theory of mind reasoning. For each challenge, we propose corresponding research directions.

In conclusion, this review shows that LLMs possess an artificial theory of mind that is similar in performance to human theory of mind but different in its internal processes, refines the conceptual definition of artificial theory of mind, and clarifies the key challenges facing the field. By identifying critical issues and delineating future research directions, it offers a foundation for leveraging artificial theory of mind research to advance our understanding of the emergence and internal processes of human theory of mind.

Key words: large language models, theory of mind, artificial theory of mind, artificial intelligence

CLC number: