ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2011, Vol. 43 ›› Issue (07): 836-850.


Item Replenishing in Cognitive Diagnostic Computerized Adaptive Testing

CHEN Ping; XIN Tao

  1. Institute of Developmental Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2010-11-18    Online: 2011-07-30    Published: 2011-07-30
  • Corresponding author: XIN Tao

Item Replenishing in Cognitive Diagnostic Computerized Adaptive Testing

CHEN Ping;XIN Tao   

  1. Institute of Developmental Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2010-11-18    Online: 2011-07-30    Published: 2011-07-30
  • Contact: XIN Tao

Abstract: Item replenishing is crucial to the development and maintenance of the item bank in cognitive diagnostic computerized adaptive testing (CD-CAT). Borrowing the idea of the joint maximum likelihood estimation (JMLE) method in unidimensional item response theory (IRT), this study proposes the joint estimation algorithm (JEA), which relies solely on examinees' responses to the operational and new items to jointly and automatically estimate the attribute vectors and item parameters of the new items. The results show that JEA estimates the attribute vectors and item parameters of the new items accurately when the item parameters are relatively small and the sample size is relatively large; moreover, sample size, item parameter size, and the initial values of the item parameters all affect the performance of JEA.

Key words: cognitive diagnostic computerized adaptive testing, item replenishing, on-line calibration, automatic attribute identification, new item

Abstract: Item replenishing is essential for item bank maintenance and development in cognitive diagnostic computerized adaptive testing (CD-CAT). Compared with item replenishing in regular CAT, item replenishing in CD-CAT is more complicated because it requires constructing the Q matrix (Embretson, 1984; Tatsuoka, 1995) corresponding to the new items (denoted as Qnew_item). However, the Qnew_item is often constructed manually by content experts and psychometricians, which raises two issues: first, it takes experts a great deal of time and effort to discuss and complete the attribute identification task, especially when the number of new items is large; second, the Qnew_item identified by experts is not guaranteed to be entirely correct because experts often disagree during the discussion. Therefore, this study borrowed the main idea of the joint maximum likelihood estimation (JMLE) method in unidimensional item response theory (IRT) to propose the joint estimation algorithm (JEA), which relies solely on examinees' responses to the operational and new items to jointly and automatically estimate the Qnew_item and the item parameters of the new items in the context of CD-CAT under the Deterministic Inputs, Noisy "and" Gate (DINA) model.
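The abstract does not give the JEA estimation equations, so the following Python sketch is only a hypothetical illustration of the underlying DINA-based idea, not the authors' algorithm: examinees' knowledge states estimated from the operational items are treated as known, every candidate attribute vector for a new item is enumerated, closed-form guessing and slipping estimates are computed for each candidate, and the candidate with the highest joint likelihood is retained. All function and variable names are assumptions.

import numpy as np
from itertools import product

def calibrate_new_item(responses, alphas):
    # responses: length-N 0/1 array of answers to one new item
    # alphas:    N x K 0/1 matrix of examinees' estimated knowledge states,
    #            treated here as known point estimates (an assumption)
    N, K = alphas.shape
    best = None
    for q in product([0, 1], repeat=K):               # candidate attribute vectors
        if sum(q) == 0:
            continue                                  # an item must measure >= 1 attribute
        q = np.array(q)
        eta = np.all(alphas >= q, axis=1).astype(int) # DINA ideal responses
        # closed-form ML estimates of guessing (g) and slipping (s) given eta
        g = responses[eta == 0].mean() if (eta == 0).any() else 0.25
        s = (1.0 - responses[eta == 1].mean()) if (eta == 1).any() else 0.25
        g, s = np.clip([g, s], 1e-3, 0.5 - 1e-3)
        p = np.where(eta == 1, 1.0 - s, g)            # P(correct) under DINA
        loglik = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
        if best is None or loglik > best[0]:
            best = (loglik, q, float(g), float(s))
    return best[1], best[2], best[3]                  # q-vector, guessing, slipping

The actual JEA presumably handles knowledge-state uncertainty jointly rather than plugging in point estimates; the sketch only conveys the flavor of automatic attribute identification and parameter calibration from response data alone.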
A simulation study was conducted to investigate whether the JEA algorithm could accurately and efficiently estimate the Qnew_item and the item parameters of the new items under different sample sizes and different levels of item parameter range; the new items were seeded at random positions in examinees' CD-CAT tests. Four samples (of 100, 300, 1000, and 3000 examinees, respectively) were simulated, and each examinee had a 50% probability of mastering each attribute. Three item banks of 360 items each were simulated, with item parameters randomly drawn from U(0.05, 0.25), U(0.15, 0.35), and U(0.25, 0.45), respectively; the three banks shared the same Q matrix. Twenty new items were simulated: the Qnew_item was constructed by randomly selecting 20 rows from the Q matrix, and the item parameters of the new items were drawn from U(0.05, 0.25), U(0.15, 0.35), or U(0.25, 0.45), matching the parameter range of the corresponding operational items. The Shannon entropy method was employed to select the next item from the item bank, the maximum a posteriori (MAP) method was used to update examinees' knowledge-state estimates, and a fixed-length stopping rule with a test length of 20 was adopted.
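For concreteness, here is a rough sketch of how the two routines named above (Shannon entropy item selection and MAP knowledge-state estimation) are commonly implemented under the DINA model; it is not taken from the paper, and all names, the uniform prior, and the data layout are assumptions.

import numpy as np

def dina_p_correct(states, q, g, s):
    # P(correct) on one item for every candidate knowledge state (M x K 0/1 matrix)
    eta = np.all(states >= q, axis=1).astype(int)
    return np.where(eta == 1, 1.0 - s, g)

def map_estimate(posterior, states):
    # maximum a posteriori knowledge-state estimate
    return states[np.argmax(posterior)]

def shannon_entropy_select(posterior, states, bank_q, bank_g, bank_s, administered):
    # pick the unused item that minimizes the expected Shannon entropy of the
    # posterior over knowledge states after the response is observed
    best_item, best_h = None, np.inf
    for j in range(len(bank_q)):
        if j in administered:
            continue
        p1 = dina_p_correct(states, bank_q[j], bank_g[j], bank_s[j])
        m1 = float(np.sum(posterior * p1))       # marginal P(correct answer)
        m0 = 1.0 - m1
        post1 = posterior * p1 / m1              # posterior given a correct answer
        post0 = posterior * (1.0 - p1) / m0      # posterior given an incorrect answer
        h = -m1 * np.sum(post1 * np.log(post1 + 1e-12)) \
            - m0 * np.sum(post0 * np.log(post0 + 1e-12))
        if h < best_h:
            best_item, best_h = j, h
    return best_item

Starting from a uniform prior over all 2^K knowledge states, the posterior would be multiplied by the DINA likelihood of each observed response and renormalized after every administered item, and the fixed-length rule simply stops the test after 20 items.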
The results indicated that JEA worked well in terms of the estimation accuracy of the Qnew_item and the item parameters of the new items, especially when the item parameters were relatively small and the sample sizes were relatively large. As the sample size increased, the estimation accuracy of the attribute vectors increased monotonically under all conditions, and the calibration error of the guessing and slipping parameters decreased monotonically under most conditions. Sample size, item parameter size, and the initial item parameter values all affected the performance of JEA.
Although the results of the simulation study are encouraging, future investigations should extend the JEA approach to other cognitive diagnostic models and to different attribute hierarchical structures.

Key words: cognitive diagnostic computerized adaptive testing, item replenishing, on-line calibration, automatic attribute identification, new item