ISSN 0439-755X
CN 11-1911/B
Sponsors: Chinese Psychological Society
          Institute of Psychology, Chinese Academy of Sciences
Publisher: Science Press

Acta Psychologica Sinica ›› 2025, Vol. 57 ›› Issue (6): 929-946. doi: 10.3724/SP.J.1041.2025.0929 cstr: 32110.14.2025.0929

• Academic Papers of the 27th Annual Meeting of the China Association for Science and Technology •

When AI “possesses” personality: The influence of good and evil personality roles on the moral judgment of large language models

JIAO Liying1, LI Changjin2, CHEN Zhen2, XU Hengbin2, XU Yan2

1. Department of Psychology, School of Humanities and Social Sciences, Beijing Forestry University, Beijing 100083, China
    2. Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2024-10-23; Online: 2025-04-15; Published: 2025-06-25
  • Corresponding authors: JIAO Liying, E-mail: jiaoliying316@163.com;
    XU Yan, E-mail: xuyan@bnu.edu.cn
  • Supported by:
    Youth Fund Project for Humanities and Social Sciences Research of the Ministry of Education (24YJC190012); General Program of the National Natural Science Foundation of China (31671160); Major Program of the National Social Science Fund of China (19ZDA363)

When AI “possesses” personality: How good and evil personality roles influence moral judgment in large language models

JIAO Liying1, LI Changjin2, CHEN Zhen2, XU Hengbin2, XU Yan2

1. Department of Psychology, School of Humanities and Social Sciences, Beijing Forestry University, Beijing 100083, China
    2. Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2024-10-23; Online: 2025-04-15; Published: 2025-06-25

Abstract:

At the intersection of technology and morality, whether large language models (LLMs) are capable of role-playing good and evil personalities, and whether this capability affects their performance on moral judgment tasks, are questions of central importance. This research focused on the moral judgment characteristics of LLMs when simulating different good and evil personalities, and on the similarities and differences between these characteristics and human patterns. Across two studies, analyses of observations from the ERNIE 4.0 and GPT-4 large language models (N = 4832) and data from human participants (N = 370) showed that: (1) LLMs can successfully simulate different levels of good and evil personality; (2) assigned good and evil personality roles significantly affect LLMs' moral judgment outcomes; (3) good and evil personalities exhibit a hierarchical pattern in human-AI consistency: good personality plays the more important role (the hierarchy between good and evil personalities), and within it, conscientiousness and integrity exerts the greatest influence (the hierarchy within good personality). The research constructed a theoretical model of LLMs' good and evil personalities in moral judgment, which helps explain how LLM personality operates in moral judgment and provides a theoretical foundation and support for advancing the moral alignment of artificial intelligence systems.

Key words: large language models, good and evil personalities, moral judgment, human-AI consistency, personality hierarchy

Abstract:

The rapid advancement of artificial intelligence (AI) has raised significant ethical concerns, particularly regarding the moral decision-making capabilities of large language models (LLMs). One intriguing aspect is the potential for LLMs to exhibit characteristics akin to human personalities, which may influence the LLMs’ moral judgment. Understanding how personality traits, especially moral traits, influence these decisions is crucial for developing AI systems that align with human ethical standards. Therefore, this study aims to explore how the roles of good and evil personalities shape the moral decision-making of LLMs, providing insights that are essential for the ethical development of AI.

This study investigated the roles of good and evil personalities in shaping the moral decision-making of ERNIE 4.0 and GPT-4. Good personality was characterized by traits such as conscientiousness and integrity, altruism and dedication, benevolence and amicability, and tolerance and magnanimity. Evil personality encompassed traits such as atrociousness and mercilessness, mendacity and hypocrisy, calumniation and circumvention, and faithlessness and treacherousness. Study 1 analyzed 4000 observations. Specific prompts corresponding to the different personality dimensions were designed. After being assigned a personality type, ERNIE 4.0 completed a self-report scale for good and evil personalities, evaluating whether each description matched the assigned personality traits and providing a numerical rating indicating the degree of agreement. Study 2 recruited 370 human participants and used 832 LLM observations to investigate the roles of good and evil personalities in shaping the moral decision-making of the LLMs, comparing the results with those of humans.
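The persona-prompting procedure described above can be sketched as follows. This is a minimal illustration, not the study's actual materials: the prompt wording, the scale item, and the `query_model` callable (standing in for a real ERNIE 4.0 or GPT-4 API call) are all assumptions.

```python
# Sketch of the persona-prompting procedure: assign a high- or low-level
# good/evil trait, then collect numerical self-report ratings per item.
# Dimension labels follow the eight traits named in the abstract.

GOOD_DIMENSIONS = [
    "conscientiousness and integrity",
    "altruism and dedication",
    "benevolence and amicability",
    "tolerance and magnanimity",
]
EVIL_DIMENSIONS = [
    "atrociousness and mercilessness",
    "mendacity and hypocrisy",
    "calumniation and circumvention",
    "faithlessness and treacherousness",
]

def build_persona_prompt(dimension: str, level: str) -> str:
    """Compose an illustrative role-play prompt for one trait at one level."""
    return (
        f"You are a person with a {level} level of the personality trait "
        f"'{dimension}'. Rate the following self-report item on a 1-7 scale "
        f"(1 = strongly disagree, 7 = strongly agree). Reply with one number."
    )

def collect_ratings(query_model, items, dimensions, level):
    """Query the model once per (dimension, item) pair and parse each rating.

    `query_model(persona_prompt, item)` is a placeholder for the real LLM
    call; it must return a string containing the numeric rating.
    """
    ratings = {}
    for dim in dimensions:
        persona_prompt = build_persona_prompt(dim, level)
        ratings[dim] = [int(query_model(persona_prompt, item)) for item in items]
    return ratings
```

In the actual study each persona condition would be run many times to yield the 4000 observations of Study 1; here a stubbed `query_model` suffices to show the data flow.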

Significant score differences were observed across all eight personality dimensions, with high-level manipulations scoring significantly higher than low-level manipulations. These results demonstrate the LLMs' ability to express different levels of good and evil personality traits. In Study 2, a comparative analysis between human participants and LLMs evaluated the impact of these traits on the CAN model parameters. The patterns of personality's influence on moral judgment showed both similarities and differences between LLMs and humans. GPT-4's good-personality manipulation aligned closely with human results, while ERNIE 4.0 scored higher than humans on sensitivity to consequences (C), sensitivity to moral norms (N), overall action/inaction preferences (A), and utilitarianism (U). GPT-4 demonstrated better moral alignment than ERNIE 4.0. Furthermore, a theoretical model of good and evil personality traits in LLMs was constructed within the domain of moral judgment.
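A manipulation check of the kind reported above (high-level persona prompts scoring above low-level ones on a dimension) can be sketched with a simple two-sample comparison. The rating values and the use of Welch's t-test are illustrative assumptions; the paper's exact analysis may differ.

```python
# Minimal sketch of a manipulation check: compare self-report scores under
# high- vs low-level persona prompts for one personality dimension.
from math import sqrt
from statistics import mean, stdev

def welch_t(high_scores, low_scores):
    """Welch's t statistic for two independent samples with unequal variances."""
    m1, m2 = mean(high_scores), mean(low_scores)
    v1, v2 = stdev(high_scores) ** 2, stdev(low_scores) ** 2
    n1, n2 = len(high_scores), len(low_scores)
    return (m1 - m2) / sqrt(v1 / n1 + v2 / n2)

# Hypothetical 1-7 ratings for "conscientiousness and integrity":
high = [6, 7, 6, 6, 7, 6]   # under a high-level good-persona prompt
low = [2, 1, 2, 3, 2, 2]    # under a low-level prompt
t = welch_t(high, low)
print(f"mean difference = {mean(high) - mean(low):.2f}, t = {t:.2f}")
# → mean difference = 4.33, t = 13.00
```

A large positive t on every dimension is what the reported pattern (high-level manipulations significantly above low-level ones) would look like at this scale.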

This study demonstrated that LLMs effectively simulated varying levels of good and evil personality traits through personality prompts, which significantly influenced their moral judgments. GPT-4’s moral judgments aligned more closely with humans under good personality prompts, while ERNIE 4.0 consistently scored higher than humans across moral judgment indicators. Under evil personality prompts, GPT-4 exhibited lower moral norm sensitivity and higher action tendency and utilitarianism. Additionally, the influence of personality on GPT-4’s moral judgment was stronger than on ERNIE 4.0. The impact of good and evil personalities on moral judgment showed hierarchical differences, with good personality traits, particularly conscientiousness, playing a more critical role in achieving human-AI alignment in moral judgments. This research provided valuable insights into enhancing AI ethical decision-making by integrating nuanced personality traits, guiding the development of more socially responsible AI systems.

Key words: large language models, good and evil personalities, moral judgment, human-AI consistency, personality hierarchy
