Acta Psychologica Sinica
Warm’s weighted maximum likelihood estimation of latent trait in the four-parameter logistic model
MENG Xiangbin1,2; TAO Jian2,3; CHEN Shali2
(1 Faculty of Education, Northeast Normal University, Changchun 130024, China) (2 KLAS, School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China) (3 Northeast Normal University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Changchun 130024, China)

There are two types of aberrant responses: correct responses resulting from lucky guesses and incorrect responses resulting from carelessness. Because neither type reflects the examinee’s actual knowledge, both may lead to erroneous estimates of the examinee’s latent trait. Compared with guesses, careless errors might cause more serious estimation bias, especially if they occur at the beginning of a test. To account for the effect of careless errors, Barton and Lord (1981) developed the four-parameter logistic (4PL) model by adding an upper asymptote parameter to the three-parameter logistic (3PL) model. Recently, the 4PL model has received more attention, and several studies have highlighted its potential and usefulness, both from a methodological point of view and for practical purposes. It can be expected that the 4PL model will be promoted as a competing item response model in psychological and educational measurement. This paper focuses on one important aspect of the 4PL model: the estimation of latent trait levels. In general, unbiased parameter estimation is desirable, and reducing bias in the latent trait estimator is very important for applications of item response theory (IRT) models. Warm (1989) proposed a weighted maximum likelihood (WML) method for estimating the latent trait parameter in the 3PL model, which was found to be less biased than the maximum likelihood (ML) and expected a posteriori (EAP) estimators. The WML estimator has also been extended to the generalized partial credit model (GPCM). In light of the superior performance of the WML method in previous studies, this study applies a WML latent trait estimator to the 4PL model. The main contributions of this article are the derivation of the WML estimator under the 4PL model and a simulation study comparing the properties of the WML estimator with those of the ML and EAP estimators.
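As a rough illustration of the estimator described above, the following minimal sketch writes the 4PL response function as P(θ) = c + (d − c) / (1 + exp(−a(θ − b))) and solves a Warm-type estimating equation, the ML score plus the correction term J(θ) / (2 I(θ)), by bisection. It is not the authors' implementation: the function names, the item parameters, the response pattern, the search bracket, and the omission of a scaling constant are all assumptions made for illustration.

```python
import math

def p_4pl(theta, a, b, c, d):
    """4PL response probability with lower asymptote c and upper asymptote d."""
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (d - c) * logistic

def wml_score(theta, items, responses):
    """Warm-type estimating function: ML score plus J / (2 * I)."""
    score = info = j_term = 0.0
    for (a, b, c, d), u in zip(items, responses):
        logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        p = c + (d - c) * logistic
        q = 1.0 - p
        dp = (d - c) * a * logistic * (1.0 - logistic)   # first derivative P'
        d2p = dp * a * (1.0 - 2.0 * logistic)            # second derivative P''
        score += (u - p) * dp / (p * q)                  # ML score contribution
        info += dp * dp / (p * q)                        # information I(theta)
        j_term += dp * d2p / (p * q)                     # J(theta)
    return score + j_term / (2.0 * info)

def wml_estimate(items, responses, lo=-6.0, hi=6.0, iterations=60):
    """Bisection on the estimating function over an assumed bracket [lo, hi].

    The 4PL likelihood can be multimodal, so this returns a root of the
    estimating equation, not necessarily a unique or global solution.
    """
    if wml_score(lo, items, responses) <= 0.0 or wml_score(hi, items, responses) >= 0.0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if wml_score(mid, items, responses) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical item parameters (a, b, c, d) and a hypothetical response pattern.
items = [(1.0, -1.0, 0.2, 0.95), (1.2, 0.0, 0.2, 0.95),
         (0.8, 1.0, 0.2, 0.95), (1.5, 0.5, 0.2, 0.95)]
responses = [1, 1, 0, 1]
theta_hat = wml_estimate(items, responses)
print(round(theta_hat, 3))
```

Dropping the `j_term / (2.0 * info)` term turns the same function into the ordinary ML score, which is one simple way to compare the two estimators on the same response pattern.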
The results of the simulation study suggested that the bias of the WML estimator was consistently smaller than that of the ML and EAP estimators, and that the accuracy of the WML estimator was superior to that of the ML estimator and nearly equivalent to that of the EAP estimator. The differences in bias and accuracy among the three estimators were substantial when the latent trait was far from the location of the test, but negligible when the latent trait matched the location of the test. Furthermore, both test length and item discrimination had a greater impact on the performance of the ML and EAP estimators than on that of the WML estimator. In relatively short tests composed of low-discrimination items, the EAP estimator displayed grossly inflated bias and the ML estimator displayed the largest decrease in accuracy, whereas the WML estimator performed more robustly. In general, the WML estimator maintained better properties than both the ML and EAP estimators, especially under conditions in which the test information function was relatively small. Such conditions include, but are not limited to: (a) a mismatch between the latent trait and the location of the test; (b) short tests (e.g., n ≤ 12); and (c) low item discrimination. The findings do not extend to computerized adaptive testing (CAT), as the simulation was conducted under linear testing. Our research may therefore be of great value to test developers concerned with constructing fixed, non-adaptive tests.
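The three low-information conditions listed above can be made concrete with a small sketch of the 4PL test information function, I(θ) = Σ P′(θ)² / (P(θ) Q(θ)). The test designs below (evenly spaced difficulties, common asymptotes, and the particular discrimination values) are hypothetical and are not the conditions of the simulation study.

```python
import math

def item_information(theta, a, b, c, d):
    """Fisher information of one 4PL item at ability theta."""
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    p = c + (d - c) * logistic
    dp = (d - c) * a * logistic * (1.0 - logistic)   # first derivative P'
    return dp * dp / (p * (1.0 - p))

def total_information(theta, items):
    """Test information: sum of the item informations."""
    return sum(item_information(theta, *item) for item in items)

def build_items(n_items, a, c=0.2, d=0.95):
    """Hypothetical test: difficulties evenly spaced on [-1, 1]."""
    step = 2.0 / (n_items - 1)
    return [(a, -1.0 + k * step, c, d) for k in range(n_items)]

long_items = build_items(24, a=1.5)
short_items = build_items(12, a=1.5)
short_weak_items = build_items(12, a=0.7)

for label, items in [("24 items, a=1.5", long_items),
                     ("12 items, a=1.5", short_items),
                     ("12 items, a=0.7", short_weak_items)]:
    matched = total_information(0.0, items)      # trait matches test location
    mismatched = total_information(3.0, items)   # trait far from test location
    print(label, round(matched, 3), round(mismatched, 3))
```

In this toy setup the information shrinks when the test is shorter, when discrimination is lower, and when θ lies far from the item difficulties, which are the situations in which the abstract reports the largest advantage for the WML estimator.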

Keywords: item response theory; four-parameter logistic model; Warm’s weighted maximum likelihood estimation
Corresponding Authors: TAO Jian, E-mail:   
Issue Date: 25 August 2016
Cite this article:   
MENG Xiangbin, TAO Jian, CHEN Shali. Warm’s weighted maximum likelihood estimation of latent trait in the four-parameter logistic model[J]. Acta Psychologica Sinica, 10.3724/SP.J.1041.2016.01047
Copyright © Acta Psychologica Sinica