ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica ›› 2026, Vol. 58 ›› Issue (7): 1237-1253. doi: 10.3724/SP.J.1041.2026.1237

• Academic Papers of the 28th Annual Meeting of the China Association for Science and Technology •

Personalized alignment of large language models and its impact on moral judgment

LI Chang-Jin1,2,3, JIAO Liying4, CHEN Zhen1,2,3, XU Hengbin1,2,3, WU Michael Shengtao5, XU Yan1,2,3   

  1. Faculty of Psychology, Beijing Normal University;
  2. Beijing Key Laboratory of Applied Experimental Psychology;
  3. National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Beijing 100875, China;
  4. Department of Psychology, School of Humanities and Social Sciences, Beijing Forestry University, Beijing 100083, China;
  5. Department of Philosophy, School of Philosophy and Sociology, Jilin University, Changchun 130012, China
  • Received: 2025-05-22  Online: 2026-05-15  Published: 2026-07-25

Abstract: With the advent of the human-machine symbiosis era, the ethical dilemmas and algorithmic biases of large language models (LLMs) have raised widespread societal concern. Guiding artificial intelligence (AI) toward beneficial development has thus become an urgent and challenging imperative. This research explores the impact of personalized alignment based on the HEXACO personality model on the moral judgment of LLMs. Specifically, the study aims to verify whether LLMs can effectively achieve personalized alignment through prompting and to systematically evaluate how such alignment influences utilitarian tendencies in LLMs, compared with humans, across various moral dilemmas. By leveraging established psychological frameworks, this research seeks to provide a scientific basis for constructing controllable and ethical AI alignment strategies.
Study 1 tested GPT-3.5, GPT-4, and ERNIE 3.5 using HEXACO-based personality prompts across six domains at high, low, and baseline levels, integrated with different gender roles. Manipulation checks were conducted using two distinct methods: a quantitative personality assessment using the HEXACO-60 scale and a qualitative personality story-writing task rated by independent human evaluators. Study 2 utilized a set of standardized moral dilemmas to assess utilitarian versus deontological choices in both LLMs and human participants. Human data were categorized into high and low personality groups for comparison, while the LLMs performed the same moral judgment tasks under various personality settings to identify shifts in decision-making patterns.
The results of Study 1 confirmed the feasibility of personalized alignment, demonstrating that LLMs can dynamically represent HEXACO personality traits through prompts. Among the LLMs tested, GPT-4 exhibited superior instruction-following capabilities and more distinct trait differentiation than GPT-3.5 and ERNIE 3.5. Findings from Study 2 revealed that personalized alignment significantly alters the moral judgment of LLMs, though the impact varies across different models and personality domains. Specifically, traits such as Honesty-Humility, Agreeableness, and Conscientiousness were found to reduce utilitarian tendencies, leading to a preference for deontological responses. While some traits, particularly Honesty-Humility, showed stable and consistent effects between humans and AI, others displayed divergent or even opposite patterns, highlighting fundamental differences in their respective moral reasoning mechanisms.
The study reached three primary conclusions. First, LLMs are capable of exhibiting stable and distinguishable personality tendencies that can be activated through prompt-based alignment. Second, the influence of Honesty-Humility on moral judgment exhibits a consistent effect across humans and different LLMs, whereas other personality domains show inconsistencies. This suggests that while LLMs' moral decision-making shares partial cognitive logic with humans, fundamental differences remain. Third, the personality metatrait of “Stability”—and particularly the Honesty-Humility domain—demonstrates a significant moral salience effect within the personalized alignment process. Based on these insights, this research proposes a personalized alignment framework utilizing the HEXACO model and personality metatrait theory to systematically shape the moral responses of AI, providing a psychological foundation for the development of safe, controllable, and ethical AI systems. This framework emphasizes integrating psychological theories to mitigate ethical risks and ensure that AI behavior remains consistent with human values.

Key words: large language models, personalized alignment, moral judgment, HEXACO personality, metatrait