ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science (心理科学进展) ›› 2021, Vol. 29 ›› Issue (9): 1696-1710. doi: 10.3724/SP.J.1042.2021.01696

• Research Methods •

A New Technique for Handling Aberrant Responses in Psychological and Educational Tests: The Mixture Model Method

LIU Yue1, LIU Hongyun2,3

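  1. Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu 610066
  2. Beijing Key Laboratory of Applied Experimental Psychology
  3. Faculty of Psychology, Beijing Normal University, Beijing 100875
  • Received: 2020-10-23 Published: 2021-07-22
  • Corresponding author: LIU Hongyun E-mail: hyliu@bnu.edu.cn
  • Funding:
    National Natural Science Foundation of China (32071091)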

Mixture Model Method: A new method to handle aberrant responses in psychological and educational testing

LIU Yue1, LIU Hongyun2,3

  1. Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu 610066, China
  2. Beijing Key Laboratory of Applied Experimental Psychology, Beijing Normal University, Beijing 100875, China
  3. Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2020-10-23 Published: 2021-07-22
  • Contact: LIU Hongyun E-mail: hyliu@bnu.edu.cn

Abstract (translated from the Chinese):

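The mixture model method is a recently proposed approach for handling aberrant responses in psychological and educational tests. Compared with traditional methods such as the response time (RT) threshold method and the RT residual method, the mixture model method identifies aberrant responses and estimates model parameters simultaneously, and it still performs well when the data are heavily contaminated. Its principle is to specify, according to the characteristics of normal and aberrant responding, different models for the response and/or RT components under each class of a categorical latent variable (i.e., a classification at the response level), thereby estimating the categorical latent variable together with the other item and person parameters in the model. This article describes in detail the mixture model methods proposed to date and compares them with the traditional methods. Future research could examine the robustness and applicability of mixture model methods when model assumptions are violated or when several types of aberrant responses are present, and could improve their efficiency in use by fixing some item parameters or adding a method-selection procedure.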

Keywords (translated from the Chinese): aberrant responses, response time, threshold, residual method, mixture model

Abstract:

Aberrant responses have been repeatedly reported in psychological and educational measurement. If traditional measurement models or methods (e.g., item response theory, IRT) are applied to data sets contaminated by aberrant responses, parameter estimates may be biased. Therefore, it is necessary to identify aberrant responses and to reduce their detrimental effects.

In the literature, there are two traditional response time (RT)-based methods for detecting aberrant responses: the RT threshold method and the RT residual method. Both methods seek a threshold for the RT or the RT residual. If an RT or RT residual falls markedly below the threshold, the response is regarded as an aberrant response with an extremely short RT (e.g., speededness, rapid guessing) that consequently provides no information about the test taker’s latent trait. Afterwards, a down-weighting strategy, which limits the influence of aberrant responses on parameter estimation by reducing their weight in the sample, can be applied.
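The two-step logic of flagging and then down-weighting can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the 5-second cutoff, and the data are hypothetical, and a weighted proportion correct stands in for the model parameter estimation that a real analysis would perform.

```python
from typing import List


def flag_short_rts(rts: List[float], threshold: float) -> List[bool]:
    """Flag a response as aberrant when its RT is below the threshold."""
    return [rt < threshold for rt in rts]


def weighted_proportion_correct(responses: List[int], flags: List[bool],
                                aberrant_weight: float = 0.0) -> float:
    """Proportion correct with flagged responses down-weighted.

    aberrant_weight = 0.0 removes flagged responses entirely;
    values between 0 and 1 only reduce their influence.
    """
    weights = [aberrant_weight if flagged else 1.0 for flagged in flags]
    total = sum(weights)
    if total == 0:
        raise ValueError("all responses were down-weighted to zero")
    return sum(w * x for w, x in zip(weights, responses)) / total


# Hypothetical data for one test taker: RTs in seconds, scored responses (1 = correct).
rts = [2.1, 35.0, 41.5, 1.8, 28.3]
responses = [0, 1, 1, 0, 0]

# Step 1: detect. The 5-second cutoff is an arbitrary illustrative value.
flags = flag_short_rts(rts, threshold=5.0)

# Step 2: down-weight. Compare the naive and the adjusted summary.
naive = sum(responses) / len(responses)
adjusted = weighted_proportion_correct(responses, flags)
```

In this toy case the two very fast, incorrect responses are flagged, so the adjusted proportion correct is based only on the three remaining responses.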

The mixture model method (MMM) is a new method proposed to handle data contaminated by aberrant responses. It applies the accommodating strategy, which extends a model to account for the contamination directly. Compared with the traditional methods, MMM has two advantages: (1) it detects aberrant responses and obtains parameter estimates simultaneously, instead of in two steps (detecting and down-weighting); (2) it precisely recovers the severity of aberrant responding. There are two categories of MMM. The first category assumes that the classification (i.e., whether an item is answered normally or aberrantly) can be predicted by RT. The second category is a natural extension of van der Linden’s (2007) hierarchical model, which models responses and RTs jointly. In this second category, the observed RT and the correct response probability of each item-by-person encounter are each decomposed into a part due to normal responding and a part due to aberrant responding, according to the most important difference between the two distinct behaviors. This method leads to more precisely estimated item and person parameters, as well as excellent classification of aberrant/normal behavior.
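The core idea of a mixture on response times can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the models reviewed in the article: it fits a two-component normal mixture to log RTs by EM, with no item or person parameters and no response model. The function names, starting values, and simulated data are all assumptions made for illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class_mixture(log_rts, n_iter=200, min_sd=1e-3):
    """EM for a two-component normal mixture on log RTs.

    Class "a" starts at the smallest log RT, so it plays the role of the
    short-RT (aberrant) class. Returns the estimated proportion of class a,
    (mean, sd) of each class, and each response's posterior probability
    of belonging to class a.
    """
    n = len(log_rts)
    mean_a, mean_b = min(log_rts), max(log_rts)
    sd_a = sd_b = max(statistics.pstdev(log_rts), min_sd)
    pi_a = 0.5
    post_a = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability that each response is in class a.
        post_a = []
        for x in log_rts:
            dens_a = pi_a * normal_pdf(x, mean_a, sd_a)
            dens_b = (1.0 - pi_a) * normal_pdf(x, mean_b, sd_b)
            denom = dens_a + dens_b
            post_a.append(dens_a / denom if denom > 0 else 0.5)
        # M-step: update class proportion, means, and standard deviations.
        n_a = sum(post_a)
        n_b = n - n_a
        if n_a < 1e-8 or n_b < 1e-8:
            break  # one class has emptied out; stop updating
        mean_a = sum(p * x for p, x in zip(post_a, log_rts)) / n_a
        mean_b = sum((1.0 - p) * x for p, x in zip(post_a, log_rts)) / n_b
        var_a = sum(p * (x - mean_a) ** 2 for p, x in zip(post_a, log_rts)) / n_a
        var_b = sum((1.0 - p) * (x - mean_b) ** 2 for p, x in zip(post_a, log_rts)) / n_b
        sd_a = max(math.sqrt(var_a), min_sd)
        sd_b = max(math.sqrt(var_b), min_sd)
        pi_a = n_a / n
    return pi_a, (mean_a, sd_a), (mean_b, sd_b), post_a


# Simulated data (assumed values): 20% rapid responses around 2 s,
# 80% regular responses around 30 s, both lognormal.
random.seed(0)
true_aberrant = [i < 200 for i in range(1000)]
log_rts = [random.gauss(math.log(2.0), 0.3) if ab else random.gauss(math.log(30.0), 0.4)
           for ab in true_aberrant]

pi_hat, fast, slow, post = fit_two_class_mixture(log_rts)
predicted = [p > 0.5 for p in post]
accuracy = sum(p == t for p, t in zip(predicted, true_aberrant)) / len(true_aberrant)
```

Here the class proportion `pi_hat` and the response-level classification come out of the same fit, which mirrors the "simultaneous" property described above; the estimated proportion also serves as a direct estimate of how severe the contamination is.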

First, this article compares the basic logic of the two traditional RT-based methods and MMM. Both the RT threshold method and the RT residual method regard aberrant responses as outliers, so their performance depends heavily on the severity of aberrance. If the data set is seriously contaminated by aberrant responses, the observed RT (or RT residual) distribution will differ from the expected distribution, which in turn leads to low power and sometimes a high false detection rate. MMM, on the other hand, assumes that both the observed RT and the correct response probability follow a mixture distribution and treats aberrant and normal responses equally, so it relies little on the severity of aberrance. In addition, MMM can, in theory, be applied when all respondents actually respond normally; in that situation, all responses are assumed to fall into one category. Second, this article summarizes the disadvantages of the three methods. MMM has three primary limitations: (1) it usually relies heavily on strong assumptions, so it may not perform well if these assumptions are violated; (2) a low proportion of aberrant responses may lead to convergence and model identification problems; (3) it is quite complex and time-consuming. Overall, practitioners should choose a method according to the characteristics of the test and the type of aberrant responses (e.g., rapid guessing, item preknowledge, cheating). Finally, this article suggests that future research investigate the performance of MMM when its assumptions are violated or when the data contain more types of aberrant response patterns. Fixing item parameter estimates and proposing indices to help choose a suitable method are encouraged to improve the efficiency of MMM.
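The claim that outlier-based detection loses power under heavy contamination can be checked with a small simulation. The sketch below is an assumption-laden simplification: instead of residuals from a fitted RT model, it standardizes log RTs against the pooled sample mean and standard deviation and flags values below a one-sided cutoff. The distributions, the cutoff, and the function name are hypothetical.

```python
import math
import random
import statistics


def detection_power(contamination, n=2000, z_cut=-1.645, seed=1):
    """Share of truly aberrant responses flagged by a z-score outlier rule.

    Aberrant log RTs are drawn around log(2 s), regular ones around
    log(30 s). The rule standardizes against the pooled sample, so the
    aberrant responses themselves shift the mean and inflate the SD.
    """
    rng = random.Random(seed)
    n_aberrant = int(round(contamination * n))
    aberrant = [rng.gauss(math.log(2.0), 0.3) for _ in range(n_aberrant)]
    regular = [rng.gauss(math.log(30.0), 0.4) for _ in range(n - n_aberrant)]
    pooled = aberrant + regular
    mean = statistics.mean(pooled)
    sd = statistics.pstdev(pooled)
    flagged = sum((x - mean) / sd < z_cut for x in aberrant)
    return flagged / n_aberrant


power_light = detection_power(0.05)  # 5% of responses are aberrant
power_heavy = detection_power(0.40)  # 40% of responses are aberrant
```

With light contamination the short RTs sit far below the cutoff and nearly all are flagged; with heavy contamination they pull the mean down and inflate the standard deviation so much that most of them no longer look like outliers.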

Key words: aberrant responses, response time, threshold, residual method, mixture model

Chinese Library Classification (CLC) number: