ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2011, Vol. 43 ›› Issue (06): 710-724.

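Development of On-line Calibration Methods in Cognitive Diagnostic Computerized Adaptive Testing

CHEN Ping; XIN Tao

  1. Institute of Developmental Psychology, Beijing Normal University, Beijing 100875
  • Received: 2010-11-16 Revised: 1900-01-01 Online: 2011-06-30 Published: 2011-06-30
  • Corresponding author: XIN Tao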

Developing On-line Calibration Methods for Cognitive Diagnostic Computerized Adaptive Testing

CHEN Ping;XIN Tao   

  1. Institute of Developmental Psychology, Beijing Normal University, Beijing 100875, China
  • Received:2010-11-16 Revised:1900-01-01 Online:2011-06-30 Published:2011-06-30
  • Contact: XIN Tao

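Summary: Item replenishment is essential to item bank maintenance in cognitive diagnostic computerized adaptive testing (CD-CAT). In conventional CAT, on-line calibration methods are often used to estimate the item parameters of new items. To date, however, no paper on on-line calibration in the CD-CAT setting has been published. This study establishes an analytical basis for extending three representative on-line calibration methods from conventional CAT (Method A, OEM, and MEM) to CD-CAT (CD-Method A, CD-OEM, and CD-MEM) and compares the three methods by simulation. The results show that CD-Method A recovers item parameters better than the other two methods, and that an adaptive calibration design improves item-parameter recovery relative to a random calibration design.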

Keywords: computerized adaptive testing, cognitive diagnosis, on-line calibration, operational items, new items

Abstract: As in all computerized adaptive testing (CAT) applications, some items in the item bank may be flawed, obsolete, or overexposed and should be replaced by new items (Wainer & Mislevy, 1990). Item replenishment is therefore essential for item bank maintenance and development in cognitive diagnostic CAT (CD-CAT). In regular CAT, on-line calibration methods are commonly used to calibrate the item parameters of new items. Until now, however, no published reference has been available on on-line calibration for CD-CAT. This study therefore investigated whether some current methods used in CAT can be extended to the CD-CAT setting. Three representative on-line calibration methods from regular CAT were investigated: Method A (Stocking, 1988), the marginal maximum likelihood estimation method with one EM cycle (OEM; Wainer & Mislevy, 1990), and the marginal maximum likelihood estimation method with multiple EM cycles (MEM; Ban, Hanson, Wang, Yi, & Harris, 2001). With theoretical justification based on the Deterministic Inputs, Noisy “and” Gate (DINA) model, these methods were generalized to the CD-CAT setting and denoted CD-Method A, CD-OEM, and CD-MEM, respectively.
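To make the DINA-based setup concrete, here is a minimal Python sketch; it is not the authors' implementation. `dina_prob` gives the standard DINA response probability (guessing parameter when a required attribute is missing, one minus the slipping parameter otherwise). `calibrate_given_ks` illustrates the idea behind Method A, treating the estimated knowledge states as if they were true, in which case the conditional maximum likelihood estimates of the two DINA parameters reduce to simple proportions. Both function names are hypothetical, and the abstract does not confirm that CD-Method A takes exactly this form.

```python
import numpy as np

def dina_prob(alpha, q, guess, slip):
    """P(correct) under DINA for attribute patterns alpha (N x K) and one item's q-vector (K,)."""
    alpha = np.atleast_2d(np.asarray(alpha))
    q = np.asarray(q)
    # eta is True only when the examinee masters every attribute the item requires
    eta = np.all(alpha >= q, axis=1)
    return np.where(eta, 1.0 - slip, guess)

def calibrate_given_ks(responses, alpha_hat, q):
    """Estimate (guess, slip) for one new item, treating alpha_hat as the true knowledge states.

    Assumption: this closed form is the conditional MLE under DINA when the
    knowledge states are taken as known; it is an illustration only.
    """
    responses = np.asarray(responses, dtype=float)
    eta = np.all(np.atleast_2d(np.asarray(alpha_hat)) >= np.asarray(q), axis=1)
    # guess: proportion correct among examinees lacking at least one required attribute
    guess = responses[~eta].mean() if (~eta).any() else np.nan
    # slip: proportion incorrect among examinees holding all required attributes
    slip = 1.0 - responses[eta].mean() if eta.any() else np.nan
    return guess, slip
```

OEM- and MEM-style methods would instead weight each examinee by the posterior distribution over knowledge states rather than by a single point estimate.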
Two simulation studies were conducted to compare the three CD-CAT on-line calibration methods in terms of item-parameter recovery. In the first study, the new items were randomly assigned to the examinees and then calibrated. A total of 2000 examinees were generated under the assumption that each examinee has a 50% probability of mastering each attribute. In addition, 360 operational items were simulated, with guessing and slipping parameters randomly drawn from U(0.05, 0.25). Twenty new items were simulated: their Q matrix was constructed by randomly selecting 20 rows from the Q matrix of the operational items, and their item parameters were also randomly drawn from U(0.05, 0.25). The Shannon entropy method was used to select the next item, and the maximum a posteriori (MAP) method was used to update the examinees' knowledge state (KS) estimates. In the second study, the new items were first administered to a subgroup of the examinees and pre-calibrated; for the remaining examinees, the new items were then selected adaptively, based on their initial parameter estimates, to fit each examinee’s current KS estimate; finally, the item parameters of the new items were re-calibrated sequentially. All simulation conditions in the second study were the same as in the first, except that the new items were selected adaptively.
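The Study 1 setup described above can be sketched roughly as follows. This is a hedged illustration, not the authors' code. The number of attributes `K`, the rule for building the operational Q matrix, sampling the new-item rows without replacement, the uniform prior over knowledge states, and the test length are all assumptions not stated in the abstract, and every function name is hypothetical. The item selector picks the item with the smallest expected posterior Shannon entropy, and the KS estimate is the posterior mode (MAP).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K = 5  # number of attributes: an assumption, not given in the abstract
N_EXAMINEES, N_ITEMS, N_NEW = 2000, 360, 20

# Each examinee masters each attribute with probability 0.5
alpha = rng.integers(0, 2, size=(N_EXAMINEES, K))

# All 2^K knowledge states; row 0 is the all-zero pattern
patterns = np.array(list(itertools.product([0, 1], repeat=K)))
weights = 2 ** np.arange(K - 1, -1, -1)  # maps a pattern to its row index

# Operational Q matrix: random nonzero rows (construction rule is an assumption)
Q = patterns[rng.integers(1, 2 ** K, size=N_ITEMS)]
guess = rng.uniform(0.05, 0.25, N_ITEMS)
slip = rng.uniform(0.05, 0.25, N_ITEMS)

# New items: 20 rows drawn from the operational Q matrix, same parameter range
Q_new = Q[rng.choice(N_ITEMS, size=N_NEW, replace=False)]
guess_new = rng.uniform(0.05, 0.25, N_NEW)
slip_new = rng.uniform(0.05, 0.25, N_NEW)

def p_correct(Qm, g, s):
    """DINA P(correct) for every (knowledge state, item) pair: shape (2^K, n_items)."""
    eta = np.all(patterns[:, None, :] >= Qm[None, :, :], axis=2)
    return np.where(eta, 1.0 - s, g)

def entropy(p):
    """Shannon entropy of a discrete distribution."""
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p))

def select_item(posterior, P, available):
    """Return the available item with the smallest expected posterior entropy."""
    best, best_h = -1, np.inf
    for j in available:
        h = 0.0
        for like in (P[:, j], 1.0 - P[:, j]):  # response 1, then response 0
            marg = np.sum(posterior * like)     # predictive probability of the response
            h += marg * entropy(posterior * like / marg)
        if h < best_h:
            best, best_h = j, h
    return best

def run_cat(true_alpha, P, length=20):
    """Simulate one examinee; return the MAP knowledge state and the final posterior."""
    posterior = np.full(len(patterns), 1.0 / len(patterns))  # uniform prior (assumption)
    available = list(range(P.shape[1]))
    true_idx = int(np.asarray(true_alpha) @ weights)
    for _ in range(length):
        j = select_item(posterior, P, available)
        available.remove(j)
        x = int(rng.random() < P[true_idx, j])  # simulated response
        like = P[:, j] if x == 1 else 1.0 - P[:, j]
        posterior = posterior * like
        posterior /= posterior.sum()
    return patterns[np.argmax(posterior)], posterior
```

For example, `run_cat(alpha[0], p_correct(Q, guess, slip))` simulates one examinee's operational test. In a random calibration design, new items would then be seeded into each test at random and calibrated from the collected responses.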
The results of Study 1 indicated that CD-Method A outperformed the other two methods, yielding the smallest estimation errors, and that the simulated CD-CAT provided relatively accurate KS estimates for the examinees. The results of Study 2 showed that, compared with the random calibration design, the adaptive calibration design improved item-parameter recovery for CD-Method A and CD-OEM.
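The abstract does not say which error index was used to measure "estimation errors". Bias and root mean square error (RMSE) between true and estimated item parameters are common choices in recovery studies and are assumed in the sketch below; the function name is hypothetical.

```python
import numpy as np

def recovery_stats(true_params, est_params):
    """Bias and RMSE of estimated item parameters (e.g. guessing or slipping) across items."""
    true_params = np.asarray(true_params, dtype=float)
    est_params = np.asarray(est_params, dtype=float)
    err = est_params - true_params
    return {"bias": float(err.mean()), "rmse": float(np.sqrt((err ** 2).mean()))}
```

Computing these statistics separately for the guessing and slipping parameters, under both the random and the adaptive design, would allow the kind of comparison the two studies report.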
Although the results of the two studies are very encouraging, further studies are needed that examine different sample sizes, different cognitive diagnostic models, and different attribute hierarchical structures.

Key words: computerized adaptive testing, cognitive diagnosis, on-line calibration, operational item, new item