Warm’s weighted maximum likelihood estimation of latent trait in the four-parameter logistic model

Acta Psychologica Sinica, 2016, Vol. 48, Issue (8): 1047-1056. doi: 10.3724/SP.J.1041.2016.01047

MENG Xiangbin^{1, 2}; TAO Jian^{2, 3}; CHEN Shali²

(¹ Faculty of Education, Northeast Normal University, Changchun 130024, China) (² Key Laboratory of Applied Statistics of the Ministry of Education (KLAS), School of Mathematics and Statistics, Northeast Normal University, Changchun 130024, China) (³ Northeast Normal University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Changchun 130024, China)

Received: 2015-10-31 Online: 2016-08-25 Published: 2016-08-25
Contact: TAO Jian, E-mail: taoj@nenu.edu.cn
Funding:
National Natural Science Foundation of China (11501094, 11571069); independent research project of the Collaborative Innovation Center of Assessment toward Basic Education Quality; open project of the Key Laboratory of Applied Statistics of the Ministry of Education (230026510); Northeast Normal University Youth Fund for Philosophy and Social Sciences (supported by the Fundamental Research Funds for the Central Universities, 1409124).

Abstract

Summary:

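This paper takes the four-parameter logistic (4PL) model as its object of study. Building on Warm’s weighted maximum likelihood estimation technique, it proposes a weighted maximum likelihood estimator for the latent trait parameter of the 4PL model and uses a simulation study to verify the estimator’s properties. The results show that, compared with the usual maximum likelihood and expected a posteriori estimators, the weighted maximum likelihood estimator has markedly smaller bias and good parameter recovery. Moreover, when the test is short and item discrimination is low, the weighted maximum likelihood estimator retains good statistical properties and shows an even more pronounced advantage.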

Keywords: item response theory, four-parameter logistic model, weighted maximum likelihood estimation
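
For orientation, the 4PL item response function and the general form of Warm’s weighted likelihood equation for dichotomous items can be written as below. This is a sketch in assumed notation, not the paper’s own derivation: a_j, b_j, c_j and d_j denote discrimination, difficulty, lower asymptote and upper asymptote, u_j is the scored response, and some presentations also place a scaling constant D = 1.7 in the exponent.

```latex
P_j(\theta) = c_j + \frac{d_j - c_j}{1 + \exp[-a_j(\theta - b_j)]},
\qquad Q_j(\theta) = 1 - P_j(\theta)

\sum_{j=1}^{n} \frac{[u_j - P_j(\theta)]\,P_j'(\theta)}{P_j(\theta)\,Q_j(\theta)}
  + \frac{J(\theta)}{2\,I(\theta)} = 0,
\qquad
I(\theta) = \sum_{j=1}^{n} \frac{[P_j'(\theta)]^2}{P_j(\theta)\,Q_j(\theta)},
\qquad
J(\theta) = \sum_{j=1}^{n} \frac{P_j'(\theta)\,P_j''(\theta)}{P_j(\theta)\,Q_j(\theta)}
```

Setting d_j = 1 recovers the 3PL model, and dropping the term J/(2I) leaves the ordinary maximum likelihood equation.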

Abstract:

There are two types of aberrant responses: correct responses resulting from lucky guesses, and incorrect responses resulting from carelessness. Because neither type reflects the examinee’s actual knowledge, both may distort the estimate of the examinee’s latent trait. Compared with guesses, careless errors might cause more serious estimation bias, especially if they occur at the beginning of a test. To account for the effect of careless errors, Barton and Lord (1981) developed a four-parameter logistic (4PL) model by adding an upper asymptote parameter to the three-parameter logistic (3PL) model. Recently, the 4PL model has received more attention, and several studies have highlighted its potential and usefulness both from a methodological point of view and for practical purposes. It can be expected that the 4PL model will be promoted as a competing item response model in psychological and educational measurement. This paper focuses on one important aspect of the 4PL model: the estimation of latent trait levels. In general, unbiased parameter estimation is desirable, and reducing bias in the latent trait estimator is very important for applications of IRT models. Warm (1989) proposed a weighted maximum likelihood (WML) method for estimating the latent trait parameter in the 3PL model, which was found to be less biased than the maximum likelihood (ML) and expected a posteriori (EAP) estimators. The WML estimator has also been extended to the generalized partial credit model (GPCM). In light of the superior performance of the WML method in previous studies, this study applies a WML latent trait estimator to the 4PL model. The main contributions of this article are to present the derivation of the WML estimator under the 4PL model and to conduct a simulation study comparing the properties of the WML estimator with those of the ML and EAP estimators.
The results of the simulation study suggested that the bias of the WML estimator was consistently smaller than that of the ML and EAP estimators. In addition, the accuracy of the WML estimator was superior to that of the ML estimator and nearly equivalent to that of the EAP estimator. The differences in bias (and accuracy) among the three estimators were substantial when the latent trait was far from the location of the test, but negligible when the latent trait matched the location of the test. Furthermore, both test length and item discrimination had a greater impact on the performance of the ML and EAP estimators than on that of the WML estimator. In relatively short tests with low-discrimination items, the EAP estimator displayed grossly inflated bias and the ML estimator displayed the largest decrease in accuracy, whereas the WML estimator performed more robustly. In general, the WML estimator maintains better properties than both the ML and EAP estimators, especially under conditions in which the test information function is relatively small. Such conditions include, but are not limited to: (a) a mismatch between the latent trait and the location of the test; (b) short tests (e.g., n ≤ 12); and (c) low item discrimination. The findings of this paper do not extend to computerized adaptive testing (CAT), as the simulation was conducted under linear testing. As a result, our research may be of great value to test developers concerned with constructing fixed, non-adaptive tests.

Key words: item response theory, four-parameter logistic model, Warm’s weighted maximum likelihood estimation
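
The ML versus WML comparison described in the abstract can be illustrated with a minimal numerical sketch. It assumes the standard 4PL response function and the general form of Warm’s estimating equation (the ML score plus J/(2I)). The function names, the item parameters and the bisection solver are illustrative assumptions, not the authors’ implementation.

```python
import math

def p4pl(theta, a, b, c, d):
    """4PL response probability: lower asymptote c, upper asymptote d."""
    g = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (d - c) * g

def derivs(theta, a, b, c, d):
    """Return P, P' and P'' (derivatives with respect to theta) for one item."""
    g = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    p = c + (d - c) * g
    p1 = (d - c) * a * g * (1.0 - g)
    p2 = (d - c) * a * a * g * (1.0 - g) * (1.0 - 2.0 * g)
    return p, p1, p2

def score(theta, items, responses, weighted):
    """ML score function; adds Warm's term J/(2I) when weighted is True."""
    s = info = j_sum = 0.0
    for (a, b, c, d), u in zip(items, responses):
        p, p1, p2 = derivs(theta, a, b, c, d)
        pq = p * (1.0 - p)          # > 0 because 0 < c <= P <= d < 1
        s += (u - p) * p1 / pq      # derivative of the log-likelihood
        info += p1 * p1 / pq        # test information I(theta)
        j_sum += p1 * p2 / pq       # J(theta)
    return s + (j_sum / (2.0 * info) if weighted else 0.0)

def estimate(items, responses, weighted=True, lo=-6.0, hi=6.0, tol=1e-8):
    """Find a root of the score function on [lo, hi] by bisection.

    4PL likelihoods can be multimodal, so this returns one root inside
    the bracket, which is an assumption of the sketch.
    """
    f_lo = score(lo, items, responses, weighted)
    f_hi = score(hi, items, responses, weighted)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score(mid, items, responses, weighted)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Hypothetical 5-item test: (a, b, c, d) per item, with c = 0.2 and d = 0.95.
items = [(1.2, b, 0.2, 0.95) for b in (-1.5, -0.5, 0.0, 0.5, 1.5)]
responses = [1, 1, 0, 1, 0]
print("ML :", estimate(items, responses, weighted=False))
print("WML:", estimate(items, responses, weighted=True))
```

With these example items, an all-correct response pattern leaves the unweighted score function positive everywhere, so estimate(..., weighted=False) raises ValueError, while the weighted version still returns a finite value.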

MENG Xiangbin; TAO Jian; CHEN Shali. (2016). Warm’s weighted maximum likelihood estimation of latent trait in the four-parameter logistic model. Acta Psychologica Sinica, 48(8), 1047-1056.
