ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science ›› 2024, Vol. 32 ›› Issue (10): 1736-1756. doi: 10.3724/SP.J.1042.2024.01736

• Research Methods •


Model comparison in cognitive modeling

GUO Mingqian1, PAN Wanke2, HU Chuanpeng2   

  1. Behavioral Science Institute, Radboud University, Nijmegen 6525XZ, the Netherlands;
  2. School of Psychology, Nanjing Normal University, Nanjing 210097, China
  • Received: 2023-06-25  Online: 2024-10-15  Published: 2024-08-13
  • Corresponding authors: GUO Mingqian, E-mail: mqguo30@gmail.com; HU Chuanpeng, E-mail: hcp4715@hotmail.com


Abstract: Cognitive modeling has gained widespread application in psychological research, providing a robust framework for understanding complex cognitive processes. These models are instrumental in elucidating how mental functions such as memory, attention, and decision-making operate. A critical aspect of cognitive modeling is model comparison: researchers must select the most appropriate model for describing the behavioral data before proceeding to hypothesis testing or latent-variable inference. The choice of the best model is crucial, as it directly influences the validity and reliability of the research findings.
Selecting the best-fitting model requires careful consideration. Researchers must balance how well a model fits the data, avoiding both overfitting and underfitting: overfitting occurs when a model captures random error or noise rather than the underlying data structure, while underfitting occurs when a model is too simple to capture the structure that is there. Researchers must also weigh model complexity, which arises from both the number of free parameters and the mathematical form of the model; greater complexity can reduce a model's interpretability and its ability to generalize to new data sets.
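To make this trade-off concrete, the following minimal R sketch (simulated data and illustrative names, not material from the article) contrasts training error with held-out error for polynomial models of increasing flexibility; as flexibility grows, training error keeps shrinking while held-out error typically stops improving or worsens.

    ## Illustrative sketch (simulated data): training vs. held-out error
    set.seed(1)
    x <- runif(100, -2, 2)
    y <- sin(x) + rnorm(100, sd = 0.3)   # true curve plus noise
    train <- sample(100, 70)
    test  <- setdiff(1:100, train)

    for (degree in c(1, 3, 10)) {
      fit <- lm(y ~ poly(x, degree), subset = train)
      mse_train <- mean((y[train] - fitted(fit))^2)
      mse_test  <- mean((y[test] - predict(fit, newdata = data.frame(x = x[test])))^2)
      cat(sprintf("degree %2d: training MSE = %.3f, held-out MSE = %.3f\n",
                  degree, mse_train, mse_test))
    }
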
This article categorizes and introduces three major classes of model comparison metrics commonly used in cognitive modeling: goodness-of-fit metrics, cross-validation-based metrics, and marginal likelihood-based metrics. Each class of metrics offers distinct advantages and is suitable for different types of data and research questions.
Goodness-of-fit metrics are straightforward and intuitive, providing a direct measure of how well a model fits the observed data. Examples include mean squared error (MSE), coefficient of determination (R²), and receiver operating characteristic (ROC) curves.
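As a minimal sketch of how these indices can be computed in base R (function and variable names are illustrative, not the article's code; observed and predicted stand for any model's data and predictions, and the AUC uses the rank-based Mann-Whitney formulation):

    ## Mean squared error and coefficient of determination
    mse <- function(observed, predicted) mean((observed - predicted)^2)

    r_squared <- function(observed, predicted) {
      1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
    }

    ## Area under the ROC curve for binary outcomes (e.g., Go vs. No-Go responses):
    ## the probability that a random positive case gets a higher score than a
    ## random negative case, with ties counted as 0.5
    auc <- function(labels, scores) {
      pos <- scores[labels == 1]
      neg <- scores[labels == 0]
      mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
    }
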
Cross-validation-based metrics provide a robust means of assessing model performance by partitioning the data into training and testing sets. This approach helps mitigate the risk of overfitting, because the model's performance is evaluated on data it has not seen. Common metrics in this class include the Akaike Information Criterion (AIC) and the Deviance Information Criterion (DIC), both of which can be viewed as approximations to a model's out-of-sample predictive accuracy.
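A rough sketch of this class in R, assuming the user supplies a fitting routine, an error function, a maximized log-likelihood, or posterior deviance draws as appropriate (all names below are hypothetical placeholders):

    ## Plain k-fold cross-validation of out-of-sample error for a user-supplied
    ## fitting function fit_fun(data) and error function err_fun(fit, data)
    kfold_cv <- function(data, k = 5, fit_fun, err_fun) {
      folds <- sample(rep(1:k, length.out = nrow(data)))
      mean(sapply(1:k, function(i) {
        fit <- fit_fun(data[folds != i, ])
        err_fun(fit, data[folds == i, ])
      }))
    }

    ## AIC from a maximized log-likelihood with k free parameters
    aic <- function(logLik_hat, k) -2 * logLik_hat + 2 * k

    ## DIC from MCMC output: deviance_draws is -2 * log-likelihood at each
    ## posterior draw; deviance_at_mean is the deviance at the posterior means
    dic <- function(deviance_draws, deviance_at_mean) {
      p_d <- mean(deviance_draws) - deviance_at_mean   # effective number of parameters
      mean(deviance_draws) + p_d
    }
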
Marginal likelihood-based metrics are grounded in Bayesian statistics and offer a probabilistic measure of model fit. These metrics evaluate the probability of the observed data given the model, integrating over all possible parameter values. This integration accounts for model uncertainty and complexity, providing a comprehensive measure of model performance. The marginal likelihood can be challenging to compute directly, but various approximations, such as the Bayesian Information Criterion (BIC) and Laplace approximation, are available.
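The two approximations mentioned above can be sketched compactly in R (hypothetical names only; log_post must be the unnormalized log posterior, i.e. log-likelihood plus log prior, theta_hat its mode, and H the Hessian of the negative log posterior at that mode):

    ## BIC from a maximized log-likelihood, k parameters, n observations
    bic <- function(logLik_hat, k, n) -2 * logLik_hat + k * log(n)

    ## Laplace approximation to the log marginal likelihood:
    ## log p(y) ~ log_post(theta_hat) + (k/2) log(2*pi) - (1/2) log|H|
    laplace_log_ml <- function(log_post, theta_hat, H) {
      k <- length(theta_hat)
      log_post(theta_hat) + 0.5 * k * log(2 * pi) -
        0.5 * as.numeric(determinant(H, logarithm = TRUE)$modulus)
    }
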
The article delves into the computation methods and the pros and cons of each metric, providing practical implementations in R using data from the orthogonal Go/No-Go paradigm. This paradigm is commonly used in cognitive research to study motivation and reinforcement learning, making it an ideal example for illustrating model comparison techniques. By applying these metrics to real-world data, the article offers valuable insights into their practical utility and limitations.
Based on this foundation, the article identifies suitable contexts for each metric, helping researchers choose the most appropriate method for their specific needs. For instance, goodness-of-fit metrics are ideal for initial model evaluation and exploratory analysis, while cross-validation-based metrics are more suitable for model selection in predictive modeling. Marginal likelihood-based metrics, with their Bayesian underpinnings, are particularly useful in confirmatory analysis and complex hierarchical models.
The article also discusses new approaches such as model averaging, which combines multiple models to account for model uncertainty. Model averaging provides a weighted average of the predictions from different models, offering a more robust and reliable estimate than any single model. This approach can be particularly beneficial in complex cognitive modeling scenarios where multiple models may capture different aspects of the data.
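One simple way to operationalize this idea is with information-criterion weights such as Akaike weights; the sketch below (illustrative names only) converts a vector of AIC values into weights and forms a weighted prediction. Bayesian model averaging would instead weight models by their posterior model probabilities.

    ## Akaike weights from a vector of AIC values, and a model-averaged
    ## prediction from a matrix with one column of predictions per model
    akaike_weights <- function(aics) {
      delta <- aics - min(aics)
      w <- exp(-delta / 2)
      w / sum(w)
    }

    model_average <- function(pred_matrix, aics) {
      as.vector(pred_matrix %*% akaike_weights(aics))
    }
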
In summary, this article provides a comprehensive overview of model comparison metrics in cognitive modeling, highlighting their computation methods, advantages, and practical applications. By offering detailed guidance on choosing and implementing these metrics, the article aims to enhance the rigor and robustness of cognitive modeling research.
Model comparison involves considering not only the fit of the models to the data (balancing overfitting and underfitting) but also model complexity, in both the number of parameters and the mathematical form. This article categorizes and introduces three major classes of model comparison metrics commonly used in cognitive modeling: goodness-of-fit metrics (such as mean squared error, the coefficient of determination, and ROC curves), cross-validation-based metrics (such as AIC and DIC), and marginal likelihood-based metrics. The computation methods and the pros and cons of each metric are discussed, along with practical implementations in R using open data from the orthogonal Go/No-Go paradigm. On this basis, the article identifies the suitable contexts for each metric and discusses newer approaches such as model averaging.

Key words: cognitive modeling, computational models, model comparison, model selection