Acta Psychologica Sinica (心理学报)
New item selection methods in cognitive diagnostic computerized adaptive testing: Combining item discrimination indices
GUO Lei1,2,3; ZHENG Chanjin4; BIAN Yufang5; SONG Naiqing3,6; XIA Lingxiang1
(1 Faculty of Psychology, Southwest University, Chongqing 400715, China) (2 Postdoctoral Research Center for Statistics, Southwest University, Chongqing 400715, China) (3 Southwest University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Chongqing 400715, China) (4 School of Psychology, Jiangxi Normal University, Nanchang 330022, China) (5 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China) (6 Center for Basic Education Research, Southwest University, Chongqing 400715, China)
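Abstract (translated from Chinese)

Most current research on cognitive diagnostic computerized adaptive testing (CD-CAT), in China and abroad, uses the posterior-weighted Kullback-Leibler (PWKL) index as the item selection strategy. PWKL weights the KL index with posterior distribution information, which improves the correct classification rate. However, it weights with examinee-level information only and ignores the information that the items themselves can provide, so it is a single-source index. This study modifies PWKL by incorporating item discrimination information from cognitive diagnosis and proposes four new multiple-source item selection strategies: GIDPWKL, AIDPWKL, CIDPWKL, and KLEDPWKL. These are compared with PWKL and the mutual information method (MIM) under exposure control. The simulation results show that: (1) In the vast majority of fixed-length conditions, the shorter the test, the higher the correct classification rates of the new methods. GIDPWKL has the highest average attribute and pattern correct classification rates, followed by AIDPWKL, while the advantages of CIDPWKL, KLEDPWKL, and MIM vary with the experimental conditions. (2) In the vast majority of fixed-length conditions, the higher the item quality, the more pronounced the advantage of the new methods. (3) The complexity of the Q-matrix structure affects the performance of the different item selection strategies. (4) In the variable-length test scenario, the average test lengths of the four new methods and MIM are all lower than that of PWKL, with GIDPWKL performing best. Therefore, if the actual testing scenario resembles the scenarios simulated in this study, the GIDPWKL method is recommended.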
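
As a rough illustration of the baseline index discussed above, the following sketch computes the standard PWKL value of a single item under the DINA model: the KL divergence between the response distribution at the current profile estimate and at each candidate profile, weighted by the posterior probability of that candidate. It does not implement the four new indices, whose formulas are not given in this abstract. The function names, the two-attribute setup, and the slip and guess values are hypothetical and chosen only for illustration.

```python
import itertools
import math


def dina_prob(alpha, q, slip, guess):
    """P(X = 1 | alpha) under DINA: 1 - slip if every attribute the item
    requires (q_k = 1) is mastered (alpha_k = 1), otherwise guess."""
    masters_all = all(a >= r for a, r in zip(alpha, q))
    return 1.0 - slip if masters_all else guess


def pwkl(alpha_hat, q, slip, guess, profiles, posterior):
    """Posterior-weighted KL index of one item.

    Sums, over all candidate profiles alpha_c, the KL divergence between
    the item response distribution at alpha_hat and at alpha_c, weighted
    by the current posterior probability of alpha_c.
    """
    p_hat = dina_prob(alpha_hat, q, slip, guess)
    total = 0.0
    for alpha_c, weight in zip(profiles, posterior):
        p_c = dina_prob(alpha_c, q, slip, guess)
        kl = (p_hat * math.log(p_hat / p_c)
              + (1.0 - p_hat) * math.log((1.0 - p_hat) / (1.0 - p_c)))
        total += weight * kl
    return total


# Hypothetical setup: two attributes, uniform posterior, estimate (1, 0).
profiles = list(itertools.product([0, 1], repeat=2))
posterior = [0.25] * len(profiles)
alpha_hat = (1, 0)

# Two illustrative items measuring attribute 1 only; values are assumptions.
high_quality = pwkl(alpha_hat, (1, 0), 0.1, 0.1, profiles, posterior)
low_quality = pwkl(alpha_hat, (1, 0), 0.3, 0.3, profiles, posterior)
print(high_quality, low_quality)
```

In an adaptive test, the item with the largest index among the remaining items would typically be administered next. In this toy setup the item with smaller slip and guess parameters receives the larger value, which is consistent with the abstract's observation that item quality matters for selection.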

Abstract

Interest in developing computerized adaptive testing (CAT) under cognitive diagnostic models has increased recently. Cognitive diagnostic computerized adaptive testing (CD-CAT) attempts to classify examinees into the correct latent class profile so as to pinpoint each examinee's strengths and weaknesses, while the CAT algorithm chooses items from the item bank to achieve that goal as efficiently as possible. Most CD-CAT research uses the posterior-weighted Kullback-Leibler (PWKL) index because of its high efficiency. The PWKL index integrates the posterior probabilities of examinees' latent class profiles into the KL information and thus improves item selection efficiency considerably. However, it uses only examinee-based information to assess the relative importance of each latent class profile, so in this sense it should be regarded as a single-source index. The current study takes advantage of not only the examinee-based information but also the item-based information that can be readily obtained from the items. This paper introduces four new multiple-source item selection methods, GIDPWKL, AIDPWKL, CIDPWKL, and KLEDPWKL, each of which modifies the PWKL index by combining it with item discrimination information. Two simulation studies were conducted to evaluate the new methods' efficiency against the PWKL index and the mutual information (MI) index under the DINA model with exposure control. The effects of three factors were investigated: Q-matrix structure (simple vs. complex), item quality (high vs. low), and test length (moderate vs. short). The simulation results indicated that: (1) In most cases, the shorter the fixed-length test was, the higher the average attribute correct classification rate (AACCR) and pattern correct classification rate (PCCR) of the four new methods were. The GIDPWKL index had the highest AACCR and PCCR among the six methods, followed by the AIDPWKL index. The relative performance of the CIDPWKL, KLEDPWKL, and MI indices depended on the experimental conditions. (2) In most cases, the higher the item quality was, the greater the advantage of the four new methods in the fixed-length test. (3) The structure of the Q matrix affected the performance of the different item selection methods. (4) In the variable-length test, the mean test length across all examinees was smaller for the four new methods and the MI method than for the PWKL method. Overall, the GIDPWKL index performed best and is recommended in practice when the testing scenario is similar to those simulated in this study.
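
The two accuracy measures named in the abstract can be sketched as follows, assuming their usual definitions: PCCR is the proportion of examinees whose entire estimated attribute pattern matches the true pattern, and AACCR is the proportion of correctly classified attributes averaged over attributes and examinees. The function name and the toy profiles are hypothetical and are not taken from the study.

```python
def classification_rates(true_profiles, est_profiles):
    """Return (AACCR, PCCR) for paired true and estimated attribute profiles.

    AACCR: share of individual attributes classified correctly.
    PCCR:  share of examinees whose whole pattern is classified correctly.
    """
    n_examinees = len(true_profiles)
    n_attributes = len(true_profiles[0])
    attribute_hits = 0
    pattern_hits = 0
    for true, est in zip(true_profiles, est_profiles):
        matches = sum(t == e for t, e in zip(true, est))
        attribute_hits += matches
        if matches == n_attributes:
            pattern_hits += 1
    aaccr = attribute_hits / (n_examinees * n_attributes)
    pccr = pattern_hits / n_examinees
    return aaccr, pccr


# Toy data (assumed): four examinees, three attributes.
true_profiles = [(1, 0, 1), (0, 0, 1), (1, 1, 1), (0, 1, 0)]
est_profiles = [(1, 0, 1), (0, 1, 1), (1, 1, 1), (0, 1, 1)]
aaccr, pccr = classification_rates(true_profiles, est_profiles)
print(aaccr, pccr)
```

Because a pattern counts as correct only when every attribute is correct, PCCR can never exceed AACCR, which is why the two measures are usually reported together.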

Key words: cognitive diagnostic computerized adaptive testing; item selection strategy; item discrimination; exposure control
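Received: 2014-10-08      Published: 2016-07-25
Funding:

Supported by the Fundamental Research Funds for the Central Universities (grant no. SWU1409433), the Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (grant no. 15YJC190003), and the research fund of the Self-Supporting Personality and Community Psychology (PI) Laboratory.

Corresponding author: GUO Lei, E-mail: happygl1229@swu.edu.cn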
Cite this article:
GUO Lei; ZHENG Chanjin; BIAN Yufang; SONG Naiqing; XIA Lingxiang. New item selection methods in cognitive diagnostic computerized adaptive testing: Combining item discrimination indices. Acta Psychologica Sinica, 2016, 48(7): 903-914. doi: 10.3724/SP.J.1041.2016.00903
Link to this article:
http://journal.psych.ac.cn/xlxb/CN/10.3724/SP.J.1041.2016.00903      or      http://journal.psych.ac.cn/xlxb/CN/Y2016/V48/I7/903