ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2006, Vol. 38 ›› Issue (03): 453-460.


Minimum Chi-square/EM Estimation Under IRT

Zhu Wei, Ding Shuliang, Chen Xiaopan

  1. Shenzhen Archive Bureau, Shenzhen 518000, China

    Computer Information Engineering College of Jiangxi Normal University, Nanchang 330027, China

  • Received: 2005-08-04 Revised: 1900-01-01 Published: 2006-05-30 Online: 2006-05-30
  • Contact: Ding Shuliang

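Summary: For the problem of estimating the unknown parameters of the two-parameter logistic model (2PLM) in IRT, a new estimation method, minimum χ²/EM estimation, is presented. Taking full account of the differences between item response theory (IRT) and classical test theory (CTT), the new method improves Berkson’s minimum χ² estimation from the standpoint of statistical computation. It removes the unrealistic premise that the ability parameters must be known when Berkson’s minimum χ² estimation is carried out, and so widens the range of application. Experimental results show that the ability parameter estimates of the new method are more accurate than those of BILOG and that, when the sample size exceeds 2000, the item parameter estimates are also better than those of BILOG. The experiments also show that the new method is robust.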


Abstract: A new parameter-estimation method for the unknown parameters of the two-parameter logistic model (2PLM), the minimum χ²/EM algorithm, is proposed. The new estimation approach is based on careful consideration of the differences between item response theory (IRT) and classical test theory (CTT). Specifically, it is derived from a modified version of the minimum χ² algorithm originally proposed by Berkson (1955).
The starting point of the minimum χ² algorithm is the Pearson χ² statistic. Given ability levels, examinees can be classified into categories, and the agreement between the sample distribution and the expected distribution can be measured by the χ² statistic. The estimation procedure then seeks the item parameters that minimize χ². Because true ability scores are unobservable, examinees are usually classified according to their observed scores. We believe this practice reflects the CTT point of view, which assumes that examinees with the same observed score have the same ability.
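The minimization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`p_2pl`, `pearson_chi2`, `grid_minimize`), the absence of a 1.7 scaling constant, the coarse grid search, and the toy data are all hypothetical choices made for this example.

```python
import math

def p_2pl(theta, a, b):
    """2PLM probability of a correct response (no 1.7 scaling; an assumption)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def pearson_chi2(groups, a, b):
    """Pearson chi-square for one item.

    groups: list of (theta, n, r) = ability level, group size, number correct.
    For a correct/incorrect split the two cells combine into
    n * (p_obs - P)^2 / (P * (1 - P)) per ability group.
    """
    total = 0.0
    for theta, n, r in groups:
        p_model = p_2pl(theta, a, b)
        p_obs = r / n
        total += n * (p_obs - p_model) ** 2 / (p_model * (1.0 - p_model))
    return total

def grid_minimize(groups, a_grid, b_grid):
    """Return the (a, b) pair on the grid with the smallest chi-square."""
    return min(((a, b) for a in a_grid for b in b_grid),
               key=lambda ab: pearson_chi2(groups, ab[0], ab[1]))

# Hypothetical data: five ability groups of 200 examinees each, with counts
# set to the rounded expected values for a = 1.2, b = 0.3 (not real data).
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]
groups = [(t, 200, round(200 * p_2pl(t, 1.2, 0.3))) for t in thetas]
a_grid = [0.5 + 0.1 * i for i in range(21)]   # 0.5 .. 2.5
b_grid = [-1.0 + 0.1 * i for i in range(21)]  # -1.0 .. 1.0
a_hat, b_hat = grid_minimize(groups, a_grid, b_grid)
```

Note that this sketch requires the ability level of each group to be known, which is exactly the restriction the new method is meant to remove.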
The posterior distribution of the ability parameter depends on the item parameters. The new method therefore takes this posterior distribution into account and introduces the artificial data of the EM algorithm to estimate the unknown parameters of IRT models. It redefines the observed proportions of correct and incorrect responses in Berkson’s minimum χ² algorithm, replacing them with the corresponding artificial data. The statistical reasoning behind this method can be explained intuitively as follows:
In the minimum χ² algorithm, the observed proportions of responses are fixed, and only the theoretical distribution changes as the estimates of the unknown parameters are updated. In other words, the algorithm can only draw the theoretical distribution toward the observed distribution, which slows estimation. To accelerate estimation, the new method links the artificial data to the item parameters through the EM algorithm, so that the theoretical distribution and the observed distribution represented by the artificial data change simultaneously. Item parameters are so-called “structural” parameters, whereas examinee ability parameters are “incidental” parameters. To remove the effect of item parameter estimation on the ability parameters, the new method also factors out the ability parameters during estimation.
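One way to picture the artificial data is the E-step commonly used in marginal estimation of IRT models: each examinee is spread over a set of ability nodes in proportion to the posterior weights, which yields an expected number of examinees and an expected number of correct responses at each node. The sketch below follows that general idea. The abstract does not give the authors' exact definitions, so the names `n_bar`, `r_bar` and `chi2_artificial`, the equally spaced nodes, the normal prior weights, and the toy data are all assumptions.

```python
import math

def p_2pl(theta, a, b):
    """2PLM probability of a correct response (no 1.7 scaling; an assumption)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def likelihood(responses, theta, items):
    """Likelihood of one 0/1 response vector at ability theta."""
    value = 1.0
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        value *= p if u == 1 else 1.0 - p
    return value

def artificial_data(data, items, nodes, weights):
    """E-step: expected examinees (n_bar) and expected correct answers (r_bar)
    at each ability node, given the current item parameters."""
    n_bar = [0.0] * len(nodes)
    r_bar = [[0.0] * len(nodes) for _ in items]
    for responses in data:
        joint = [likelihood(responses, x, items) * w
                 for x, w in zip(nodes, weights)]
        norm = sum(joint)
        for k, value in enumerate(joint):
            post = value / norm  # posterior weight of node k for this examinee
            n_bar[k] += post
            for i, u in enumerate(responses):
                r_bar[i][k] += u * post
    return n_bar, r_bar

def chi2_artificial(i, a, b, nodes, n_bar, r_bar):
    """Pearson-type chi-square for item i, with r_bar / n_bar standing in
    for the observed proportion correct at each node."""
    total = 0.0
    for k, x in enumerate(nodes):
        p_model = p_2pl(x, a, b)
        p_art = r_bar[i][k] / n_bar[k]
        total += n_bar[k] * (p_art - p_model) ** 2 / (p_model * (1.0 - p_model))
    return total

# Hypothetical toy input: 3 items, 4 examinees, 5 equally spaced nodes
# with unnormalised standard-normal weights.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5)]
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
nodes = [-2.0, -1.0, 0.0, 1.0, 2.0]
weights = [math.exp(-0.5 * x * x) for x in nodes]
n_bar, r_bar = artificial_data(data, items, nodes, weights)
```

Because `n_bar` and `r_bar` are recomputed whenever the item parameters change, the "observed" side of the χ² moves together with the model side, which is the behaviour the paragraph above describes. No known ability values enter the computation.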
With the new estimation procedure, examinees are classified solely according to the posterior distribution of the ability parameter. Once the final item parameter estimates are obtained, examinee abilities are estimated with the Bayesian expected a posteriori (EAP) method. In this way, the new method removes the requirement that the ability parameters be known before estimation and widens the range of application.
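The EAP step can be illustrated as a posterior mean computed over a grid of ability nodes. This is a minimal sketch assuming a standard-normal prior, equally spaced nodes, and hypothetical item parameters. The function name `eap` is chosen for this example only, and the abstract does not state which prior or quadrature the authors used.

```python
import math

def p_2pl(theta, a, b):
    """2PLM probability of a correct response (no 1.7 scaling; an assumption)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def eap(responses, items, nodes, weights):
    """Posterior mean of ability for one 0/1 response vector."""
    num = 0.0
    den = 0.0
    for x, w in zip(nodes, weights):
        like = 1.0
        for u, (a, b) in zip(responses, items):
            p = p_2pl(x, a, b)
            like *= p if u == 1 else 1.0 - p
        num += x * like * w   # node value weighted by likelihood and prior
        den += like * w
    return num / den

# Hypothetical final item parameters (a, b) and a 17-point grid on [-4, 4]
# with unnormalised standard-normal prior weights.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5)]
nodes = [-4.0 + 0.5 * k for k in range(17)]
weights = [math.exp(-0.5 * x * x) for x in nodes]
theta_high = eap([1, 1, 1], items, nodes, weights)
theta_low = eap([0, 0, 0], items, nodes, weights)
```

An examinee who answers every item correctly receives a positive estimate and one who answers every item incorrectly receives a negative one, both pulled toward the prior mean of zero.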
The results of a Monte Carlo simulation showed that the new method was not restricted by either the number of items or the number of examinees. It was also more accurate and more robust than BILOG in recovering ability parameters. When the number of examinees exceeded 2000, the new method was also much more effective than BILOG in recovering item parameters. The main advantage of the new method is that the ABS (the absolute value of the difference between the true and the estimated parameters) of the item parameters was smaller than 0.08 when the number of examinees was 2000, and it decreased further as the number of examinees increased.
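The ABS criterion can be computed as below. The abstract defines ABS only as the absolute difference between true and estimated parameters, so averaging over items is an assumption of this sketch, and the parameter values are hypothetical, not results from the paper.

```python
def abs_error(true_values, estimates):
    """Mean absolute difference between true and estimated parameters
    (averaging over items is an assumption)."""
    return sum(abs(t - e) for t, e in zip(true_values, estimates)) / len(true_values)

# Hypothetical discrimination parameters for three items.
true_a = [1.0, 1.2, 0.8]
est_a = [1.05, 1.15, 0.86]
abs_a = abs_error(true_a, est_a)
```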

Key words: IRT, parameter estimation, EM algorithm
