ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica ›› 2016, Vol. 48 ›› Issue (8): 1047-1056. doi: 10.3724/SP.J.1041.2016.01047

• Article •

Warm's weighted maximum likelihood estimation of the latent trait parameter in the four-parameter logistic model

MENG Xiangbin1,2; TAO Jian2,3; CHEN Shali2

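  1. (1 Faculty of Education, Northeast Normal University; 2 School of Mathematics and Statistics, Northeast Normal University, Key Laboratory of Applied Statistics of the Ministry of Education; 3 Northeast Normal University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Changchun 130024, China)
  • Received: 2015-10-31 Online: 2016-08-25 Published: 2016-08-25
  • Corresponding author: TAO Jian, E-mail:
  • Funding:

    National Natural Science Foundation of China (11501094, 11571069); independent research project of the Collaborative Innovation Center of Assessment toward Basic Education Quality; open project of the Key Laboratory of Applied Statistics of the Ministry of Education (230026510); Northeast Normal University Youth Fund for Philosophy and Social Sciences (supported by the Fundamental Research Funds for the Central Universities, 1409124).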

Warm’s weighted maximum likelihood estimation of latent trait in the four-parameter logistic model

MENG Xiangbin1,2; TAO Jian2,3; CHEN Shali2   

  1. (1 Faculty of Education, Northeast Normal University, Changchun 130024, China) (2 KLAS, School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China) (3 Northeast Normal University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Changchun 130024, China)
  • Received:2015-10-31 Online:2016-08-25 Published:2016-08-25
  • Contact: TAO Jian, E-mail:


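This paper takes the four-parameter logistic (4PL) model as its object of study. Following Warm's weighted maximum likelihood technique, it proposes a weighted maximum likelihood estimator of the latent trait parameter in the 4PL model and verifies the properties of this estimator through a simulation study. The results show that, compared with the usual maximum likelihood and expected a posteriori estimators, the weighted maximum likelihood estimator has markedly smaller bias and recovers the true parameter well. Moreover, when the test is short and item discrimination is low, the weighted maximum likelihood estimator retains good statistical properties and shows an even more pronounced advantage.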

Key words: item response theory, four-parameter logistic model, weighted maximum likelihood estimation


There are two types of aberrant responses: correct responses resulting from lucky guesses, and incorrect responses resulting from carelessness. Because neither type reflects the examinee’s actual knowledge, both may lead to erroneous estimates of the examinee’s latent trait. Compared with guesses, careless errors might cause more serious estimation bias, especially if they occur at the beginning of a test. To account for the effect of careless errors, Barton and Lord (1981) developed a four-parameter logistic (4PL) model by adding an upper asymptote parameter to the three-parameter logistic (3PL) model. Recently, the 4PL model has received more attention, and several studies have highlighted its potential and usefulness both from a methodological point of view and for practical purposes. It can be expected that the 4PL model will be promoted as a competing item response model in psychological and educational measurement. This paper focuses on one important aspect of the 4PL model, namely the estimation of latent trait levels. In general, unbiased parameter estimation is desirable, and reducing bias in the latent trait estimator is very important for applications of IRT models. Warm (1989) proposed a weighted maximum likelihood (WML) method for estimating the latent trait parameter in the 3PL model, which was found to be less biased than the maximum likelihood (ML) and expected a posteriori (EAP) estimates. The WML estimate has also been extended to the generalized partial credit model (GPCM). In light of the superior performance of the WML method in previous studies, this study applies a WML latent trait estimator to the 4PL model. The main contributions of this article are to present the derivation of the WML estimator under the 4PL model and to conduct a simulation study comparing the properties of the WML estimator with those of the ML and EAP estimators.
The results of the simulation study suggested that the bias of the WML estimator was consistently smaller than that of the ML and EAP estimators; in addition, the accuracy of the WML estimator was superior to that of the ML estimator and nearly equivalent to that of the EAP estimator. The differences in bias (and accuracy) among the three estimators were substantial when the latent trait was far from the location of the test, but negligible when the latent trait matched the location of the test. Furthermore, both test length and item discrimination had a greater impact on the performance of the ML and EAP estimators than on that of the WML estimator. In relatively short tests of low-discriminating items, the EAP estimator displayed grossly inflated bias and the ML estimator displayed the largest decrease in accuracy, whereas the WML estimator performed more robustly. In general, the WML estimator maintains better properties than both the ML and EAP estimators, especially under conditions in which the test information function is relatively small. Such conditions include, but are not limited to: (a) a mismatch between the latent trait and the location of the test; (b) short tests (e.g., n ≤ 12); and (c) low item discrimination. The findings are not extended to the framework of computerized adaptive testing (CAT), as the simulation was conducted under linear testing. As a result, our research may be of great value to test developers concerned with constructing fixed, non-adaptive tests.
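As an illustration of the kind of estimator the abstract describes, the following minimal Python sketch applies Warm's general weighting condition, d ln w(θ)/dθ = J(θ) / (2 I(θ)), to the 4PL item response function P(θ) = c + (d − c) / (1 + exp(−a(θ − b))). It is not the authors' code: the function names, the bisection solver, the search bracket, and the item parameters in the usage note are all assumptions made for illustration, and no scaling constant (such as D = 1.7) is applied.

```python
import math

def irf_4pl(theta, a, b, c, d):
    """Probability of a correct response under the 4PL model."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def wml_score(theta, responses, items):
    """Weighted likelihood score: ML score plus Warm's correction J / (2 I)."""
    score = info = j_term = 0.0
    for u, (a, b, c, d) in zip(responses, items):
        logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        p = c + (d - c) * logistic                       # P(theta)
        q = 1.0 - p
        dp = a * (d - c) * logistic * (1.0 - logistic)   # P'(theta)
        d2p = a * dp * (1.0 - 2.0 * logistic)            # P''(theta)
        score += (u - p) * dp / (p * q)                  # d log L / d theta
        info += dp * dp / (p * q)                        # I(theta)
        j_term += dp * d2p / (p * q)                     # J(theta)
    return score + j_term / (2.0 * info)

def wml_estimate(responses, items, lo=-10.0, hi=10.0, tol=1e-10):
    """Find a root of the weighted score by bisection (hypothetical bracket)."""
    f_lo = wml_score(lo, responses, items)
    f_hi = wml_score(hi, responses, items)
    if f_lo * f_hi > 0:
        raise ValueError("weighted score does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = wml_score(mid, responses, items)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with five hypothetical items `items = [(1.2, b, 0.2, 0.95) for b in (-1.0, -0.5, 0.0, 0.5, 1.0)]`, calling `wml_estimate([1, 1, 1, 1, 1], items)` returns a finite estimate even for an all-correct pattern, because the correction term J / (2 I) pulls the weighted score below zero at high θ. Bisection returns one root of the weighted score; for lower asymptotes above 0 or upper asymptotes below 1 the score need not be monotone, so a production implementation would need to check for multiple roots.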

Key words: item response theory, four-parameter logistic model, Warm’s weighted maximum likelihood estimation