ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2013, Vol. 45 ›› Issue (6): 694-703. doi: 10.3724/SP.J.1041.2013.00694

• Articles •

A Comparison of Item Selection Methods for Controlling Exposure Rate in Cognitive Diagnostic Computerized Adaptive Testing

MAO Xiuzhen (毛秀珍); XIN Tao (辛涛)

  1. (1 College of Education, Sichuan Normal University, Chengdu, 610066, China) (2 Institute of Developmental Psychology, Beijing Normal University, Beijing, 100875, China)
  • Received: 2012-08-13 Online: 2013-06-25 Published: 2013-06-25
  • Contact: XIN Tao

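Summary: Item exposure rate bears on item pool construction and test security, and it is an important issue that computerized adaptive testing (CAT) must address. For cognitive diagnostic CAT, this study first draws on the idea of the a-stratified method in traditional CAT to propose a stratified multistage (SM) item selection method that stratifies the item pool by item information. It then combines the SM method with the item eligibility (IE) method to obtain the SMIE method. On this basis, a simulation study compares the item selection performance of the SM, IE, SMIE, maximum modified priority index (MMPI), restrictive threshold (RT), and restrictive progressive (RPG) methods. Overall, their measurement accuracy, from highest to lowest, is IE, SM, SMIE, RT, RPG, and MMPI; the uniformity of their item exposure distributions, from best to worst, is MMPI, RPG, SMIE, RT, SM, and IE. The SMIE and RT methods balance the requirements of measurement accuracy and exposure uniformity fairly well.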

Keywords: cognitive diagnostic computerized adaptive testing, item selection method, measurement accuracy, item exposure rate

Abstract: An item's exposure rate is the frequency with which it is administered. When the exposure rate is high, examinees are likely to share item content. If too many items are over-exposed, test security, and hence the validity of the assessment, will be compromised. Conversely, when many items are under-exposed, with low or zero exposure rates, the manpower and financial resources spent on writing them are wasted and item pool construction becomes more challenging. Item exposure control is therefore an important issue in computerized adaptive testing (CAT). Cognitive diagnostic CAT (CD-CAT) combines the strengths of cognitive diagnosis theory and CAT: it can provide information on examinees' knowledge mastery while administering fewer items than a traditional assessment. Building on the a-stratified method and the item eligibility method from regular CAT, the present study proposed the stratified multistage (SM) approach and the stratified multistage-item eligibility (SMIE) method, and compared the performance of six techniques: (a) the item eligibility (IE) method, (b) the SM approach, (c) the SMIE method, (d) the restrictive threshold (RT) method, (e) the maximum modified priority index (MMPI) method, and (f) the restrictive progressive (RPG) method. The SM approach is similar to the a-stratified method in its item selection steps but differs from it in two respects. First, it stratifies the remaining item pool by the value of item information at the estimated attribute mastery pattern, whereas the a-stratified method stratifies by the item discrimination parameter a. Second, in the SM method the remaining item bank is re-stratified into a number of levels before each item is selected, whereas in the a-stratified method the item pool is stratified only once, before the test, so all examinees share the same item strata.
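The per-item re-stratification described above can be sketched roughly as follows. This is a minimal illustration, not the authors' MATLAB code. The abstract does not say which information index, stratum order, or within-stratum rule the SM method uses, so the Kullback-Leibler index, the low-to-high stratum order (borrowed from the a-stratified idea), the random draw within a stratum, and all function names (`dina_prob`, `kl_information`, `sm_select`) are assumptions made for this example.

```python
import numpy as np
from itertools import product

def dina_prob(alpha, q, slip, guess):
    """P(correct) under the DINA model for one attribute pattern and one item."""
    eta = bool(np.all(alpha >= q))  # True if every required attribute is mastered
    return (1.0 - slip) if eta else guess

def kl_information(alpha_hat, q, slip, guess, patterns):
    """ASSUMED index: KL divergence from alpha_hat summed over all patterns."""
    p0 = dina_prob(alpha_hat, q, slip, guess)
    total = 0.0
    for alpha in patterns:
        p1 = dina_prob(alpha, q, slip, guess)
        total += p0 * np.log(p0 / p1) + (1 - p0) * np.log((1 - p0) / (1 - p1))
    return total

def sm_select(alpha_hat, remaining, Q, slip, guess, stage, n_strata, rng):
    """Re-stratify the remaining pool by information, then draw from one stratum.

    Stage 0 uses the least informative stratum and later stages use more
    informative ones (ASSUMED order); the draw within a stratum is random
    (also an assumption).
    """
    patterns = [np.array(p) for p in product([0, 1], repeat=Q.shape[1])]
    info = np.array([kl_information(alpha_hat, Q[j], slip[j], guess[j], patterns)
                     for j in remaining])
    order = np.argsort(info, kind="stable")        # ascending information
    n_strata = min(n_strata, len(remaining))       # avoid empty strata
    strata = np.array_split(np.asarray(remaining)[order], n_strata)
    stratum = strata[min(stage, n_strata - 1)]
    return int(rng.choice(stratum))

# Hypothetical toy pool: 6 items measuring 3 attributes.
Q = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
              [1, 1, 0], [0, 1, 1], [1, 1, 1]])
slip = np.array([0.05, 0.10, 0.15, 0.20, 0.10, 0.25])
guess = np.array([0.25, 0.20, 0.15, 0.10, 0.20, 0.05])
rng = np.random.default_rng(0)
item = sm_select(np.array([1, 1, 0]), [0, 1, 2, 3, 4, 5], Q, slip, guess,
                 stage=0, n_strata=3, rng=rng)
print("selected item:", item)
```

Because the strata are rebuilt from the current estimate before every selection, two examinees with different estimated patterns can see different strata, which is the contrast with the a-stratified method drawn in the text.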
The SMIE method combines the SM and IE methods. The CD-CAT simulation code was written in MATLAB (R2010a), and the deterministic inputs, noisy “and” gate (DINA) model was used. The results showed that: (a) the SM method produced a widely spread item exposure distribution, raising the exposure rates of most items and making full use of the item pool, but without greatly reducing either the maximum exposure rate or measurement accuracy; (b) under the IE method, all but a few items had exposure rates below the preset maximum, yet most items still had extremely low exposure rates, which resulted in a narrow exposure distribution and the highest measurement precision; (c) the SMIE and RT methods behaved similarly: both increased the use of under-exposed items and lowered the maximum exposure rate to some extent; (d) the MMPI and RPG methods performed similarly, yielding almost evenly distributed item exposure but at a great cost in measurement precision. Overall, ordered by measurement accuracy from highest to lowest, the methods rank IE, SM, SMIE, RT, RPG, and MMPI; ordered by exposure control from best to worst, they rank MMPI, RPG, SMIE, RT, SM, and IE. The SMIE and RT methods strike a good balance between measurement accuracy and item exposure.
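For reference, the DINA model named above is usually written as follows. The notation is the conventional one and is not taken from this abstract: $\alpha_{ik}$ indicates whether examinee $i$ has mastered attribute $k$, $q_{jk}$ indicates whether item $j$ requires attribute $k$, and $s_j$ and $g_j$ are the slip and guessing parameters of item $j$.

```latex
\eta_{ij} = \prod_{k=1}^{K} \alpha_{ik}^{\,q_{jk}},
\qquad
P\left(X_{ij} = 1 \mid \boldsymbol{\alpha}_i\right)
  = (1 - s_j)^{\eta_{ij}} \, g_j^{\,1 - \eta_{ij}}
```

Here $\eta_{ij} = 1$ only when examinee $i$ has mastered every attribute that item $j$ requires, in which case the probability of a correct response is $1 - s_j$; otherwise it is $g_j$.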

Key words: cognitive diagnostic computerized adaptive testing, measurement accuracy, item exposure control, item selection method