ISSN 1671-3710
CN 11-4766/R

Advances in Psychological Science ›› 2021, Vol. 29 ›› Issue (9): 1696-1710. doi: 10.3724/SP.J.1042.2021.01696

• Research Method •

Mixture Model Method: A new method to handle aberrant responses in psychological and educational testing

LIU Yue1, LIU Hongyun2,3   

  1 Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu 610066, China;
  2 Beijing Key Laboratory of Applied Experimental Psychology, Beijing Normal University, Beijing 100875, China;
  3 Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2020-10-23 Online: 2021-09-15 Published: 2021-07-22

Abstract: Aberrant responses have been repeatedly reported in psychological and educational measurement. If traditional measurement models or methods (e.g., item response theory, IRT) are applied to data sets contaminated by aberrant responses, parameter estimates may be biased. Therefore, it is necessary to identify aberrant responses and to reduce their detrimental effects.
In the literature, there are two traditional response time (RT)-based methods for detecting aberrant responses: the RT threshold method and the RT residual method. Both methods focus on finding a threshold for the RT or the RT residual. If an RT or RT residual falls markedly below the threshold, the response is regarded as an aberrant response with an extremely short RT (e.g., speededness, rapid guessing) that provides no information about the test taker's latent trait. Afterwards, a down-weighting strategy, which limits the influence of aberrant responses on parameter estimation by reducing their weight in the sample, can be applied.
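The two-step logic of the RT threshold method (flag, then down-weight) can be illustrated with a minimal Python sketch. The 3-second threshold, the function names, and the toy data are all hypothetical, and a weighted proportion correct stands in for a real IRT estimation step; it is not the procedure of any specific study.

```python
def flag_rapid_responses(rts, threshold):
    """Flag responses whose RT (in seconds) falls below the threshold."""
    return [rt < threshold for rt in rts]


def downweight(flags, aberrant_weight=0.0):
    """Give flagged responses a reduced weight; normal responses keep weight 1."""
    return [aberrant_weight if flagged else 1.0 for flagged in flags]


def weighted_proportion_correct(responses, weights):
    """Weighted share of correct answers (a simple stand-in for trait estimation)."""
    total = sum(weights)
    if total == 0:
        return float("nan")
    return sum(w * x for w, x in zip(weights, responses)) / total


# Toy data for one test taker: RTs in seconds and scored responses (1 = correct).
rts = [1.2, 14.8, 22.5, 0.9, 31.0, 18.3]
responses = [0, 1, 1, 0, 1, 0]

flags = flag_rapid_responses(rts, threshold=3.0)  # hypothetical threshold
weights = downweight(flags)                       # weight 0 removes flagged responses
score = weighted_proportion_correct(responses, weights)
```

Setting `aberrant_weight` to a value between 0 and 1 would shrink, rather than remove, the contribution of flagged responses.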
The mixture model method (MMM) is a new method proposed to handle data contaminated by aberrant responses. It applies the accommodating strategy, which extends a model to account for the contamination directly. MMM has two advantages over the traditional methods: (1) it detects aberrant responses and obtains parameter estimates simultaneously, rather than in two steps (detecting and down-weighting); (2) it precisely recovers the severity of aberrant responding. There are two categories of MMM. The first category assumes that the classification (i.e., whether an item is answered normally or aberrantly) can be predicted by RT. The second category is a natural extension of van der Linden's (2007) hierarchical model, which models responses and RTs jointly. In this category, the observed RT and the correct response probability of each item-by-person encounter are decomposed into a component attributable to normal responding and a component attributable to aberrant responding, according to the most important difference between the two distinct behaviors. This method yields more precisely estimated item and person parameters, as well as excellent classification of aberrant and normal behavior.
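The idea that classification and parameter estimation happen in one step can be sketched with a deliberately simplified example: a two-component normal mixture fitted to log RTs by EM. This is only an illustration under strong assumptions. It uses RTs alone, whereas the MMMs described above also model the responses, and the function names, starting values, and toy data are hypothetical.

```python
import math


def log_normal_pdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def fit_rt_mixture(rts, n_iter=100, min_sd=1e-3):
    """EM for a two-component mixture on log RTs.

    Component 0 starts at the fastest log RT and is read as 'rapid';
    component 1 starts at the slowest log RT and is read as 'normal'.
    """
    x = [math.log(rt) for rt in rts]
    n = len(x)
    mean = sum(x) / n
    sd = max(math.sqrt(sum((v - mean) ** 2 for v in x) / n), min_sd)
    mu, sigma, pi = [min(x), max(x)], [sd, sd], [0.5, 0.5]
    post = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability that each response is rapid.
        for i, v in enumerate(x):
            la = math.log(pi[0]) + log_normal_pdf(v, mu[0], sigma[0])
            lb = math.log(pi[1]) + log_normal_pdf(v, mu[1], sigma[1])
            m = max(la, lb)  # subtract the max for numerical stability
            a, b = math.exp(la - m), math.exp(lb - m)
            post[i] = a / (a + b)
        # M-step: update each component's mean, SD, and mixing proportion.
        for k, w in enumerate((post, [1.0 - p for p in post])):
            nk = max(sum(w), 1e-12)
            mu[k] = sum(wi * v for wi, v in zip(w, x)) / nk
            var = sum(wi * (v - mu[k]) ** 2 for wi, v in zip(w, x)) / nk
            sigma[k] = max(math.sqrt(var), min_sd)
            pi[k] = min(max(nk / n, 1e-12), 1.0 - 1e-12)
    return {"pi_rapid": pi[0], "mu": mu, "sigma": sigma, "post_rapid": post}


# Toy RTs in seconds: six rapid responses followed by fourteen normal ones.
rapid = [0.8, 1.0, 1.1, 0.9, 1.3, 1.2]
normal = [15, 22, 18, 30, 25, 12, 20, 27, 16, 35, 19, 24, 28, 14]
fit = fit_rt_mixture(rapid + normal)
is_rapid = [p > 0.5 for p in fit["post_rapid"]]
```

Here `fit["pi_rapid"]` plays the role of the estimated severity of aberrant responding, and `post_rapid` gives the classification, both obtained from the same fit rather than from a separate detection step.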
First, this article compares the basic logic of the two traditional RT-based methods and MMM. Both the RT threshold method and the RT residual method regard aberrant responses as outliers, so their performance depends heavily on the severity of aberrance. If a data set is seriously contaminated by aberrant responses, the observed RT (or RT residual) distribution will differ from the expected distribution, which in turn leads to low power and sometimes a high false detection rate. MMM, in contrast, assumes that both the observed RT and the correct response probability follow a mixture distribution and treats aberrant and normal responses equally, so it depends little on the severity of aberrance. In addition, MMM can in theory be applied when all respondents actually respond normally; in that situation, all responses are expected to be classified into one category. Second, this article summarizes the disadvantages of the three methods. MMM has three primary limitations: (1) it usually relies on strong assumptions and may not perform well if these assumptions are violated; (2) a low proportion of aberrant responses may lead to convergence and model identification problems; (3) it is quite complex and time-consuming. Overall, practitioners should choose a proper method according to the characteristics of the test and the category of aberrant responses (e.g., rapid guessing, item preknowledge, cheating). Finally, this article suggests that future research investigate the performance of MMM when its assumptions are violated or when the data contain more types of aberrant response patterns. Fixing item parameter estimates and proposing indices that help in choosing a suitable method are also encouraged as ways to improve the efficiency of MMM.

Key words: aberrant responses, response time, threshold, residual method, mixture model
