The Block Item Pocket Method to Allow Item Review in CAT

Acta Psychologica Sinica, 2015, Vol. 47, Issue (9): 1188-1198. doi: 10.3724/SP.J.1041.2015.01188


LIN Zhe¹; CHEN Ping²; XIN Tao¹,²

(¹ Institute of Developmental Psychology, Beijing Normal University, Beijing 100875, China)
(² National Innovation Center for Assessment of Basic Education Quality, Beijing 100875, China)

Received: 2014-11-06 Online: 2015-09-25 Published: 2015-09-25
Contact: XIN Tao, E-mail: xintao@bnu.edu.cn
Funding: General Program of the National Natural Science Foundation of China (31371047); Young Scientists Fund of the National Natural Science Foundation of China (31300862); Specialized Research Fund for the Doctoral Program of Higher Education (20130003120002); Fundamental Research Funds for the Central Universities (2013YB26).

Abstract

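Summary:

Allowing item review can promote the practical use of computerized adaptive testing (CAT). Provided that ability estimation precision and test fairness are not compromised, allowing item review in CAT can relieve examinees' test anxiety and reduce measurement error caused by irrelevant factors. The block item pocket (BIP) method combines the successive block method with the item pocket (IP) method; it not only allows item review in CAT but also compensates for the shortcomings of the IP method. The results show that: (1) under a reasonable response strategy, the BIP method estimates ability more precisely than the IP method at low ability levels; (2) against Wainer-like response strategies, the BIP method estimates ability more precisely than the IP method at all ability levels; (3) as the number of blocks increases, the ability estimation precision of the BIP method moves closer to the no-revision baseline.

Keywords: computerized adaptive testing, item review, item pocket, answer change, block item pocket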

Abstract:

Most computerized adaptive tests (CATs) do not allow examinees to review items, because review can drastically decrease measurement precision and invite cheating strategies (Wainer, 1993; Wise, 1996). Yet allowing item review is essential to making CAT comparable with traditional tests, and it also matters in practice. Item review enables examinees to correct careless mistakes, which can further improve the precision of ability estimation. Without such an option, examinees may experience tension or anxiety that harms their overall performance, especially in high-stakes examinations (Vispoel, Hendrickson, & Bleiler, 2000). It is therefore worth investigating whether item review can be allowed while alleviating the problems mentioned at the outset (Wise, 1996; Vispoel, 2000, 2005).

Several methods have been proposed, including the successive block method (Stocking, 1997) and the item pocket (IP) method (Han, 2013). However, both are limited in some ways. Stocking’s method does not allow examinees to skip items, and it requires a large number of blocks, which may cause additional adverse effects because examinees must frequently decide whether to move on to the next block. Han’s method avoids these limitations, but it requires an appropriate IP size and may produce high bias when the IP size is large. The present study proposes the block item pocket (BIP) method, which sets fewer but larger blocks with a proper total IP size. This method keeps the advantages of Stocking’s and Han’s methods and overcomes their disadvantages.

Two simulation studies, covering two response strategies, were conducted to evaluate the validity of the BIP method. Item parameters were randomly drawn from uniform distributions: b ~ U(-3, 3) and α ~ U(0, 2). Each examinee was administered a fixed-length CAT of 30 items. The initial item for each examinee was randomly drawn from θ ~ U(-0.5, 0.5). Items were selected with the maximum Fisher information method. Interim and final scores were estimated with the MLE method in most conditions; when fewer than 5 responses were available, or when all answers were correct or all were wrong, the EAP method was used instead. Each study contained five conditions: non-review, the 1-block IP method, and the BIP method with 2, 3, and 6 blocks. BIAS, MAE, and RMSE were used as evaluation criteria.
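The item-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a two-parameter logistic (2PL) model, since the abstract names only a difficulty parameter b and a discrimination parameter α (written `a` in the code). The bank size of 300 and the reading of θ ~ U(-0.5, 0.5) as a starting ability value are hypothetical choices made for illustration.

```python
import math
import random


def prob_correct(theta, a, b):
    """2PL probability of a correct response (assumed model, not stated in the abstract)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def make_item_bank(n_items, rng):
    """Draw (a, b) pairs with a ~ U(0, 2) and b ~ U(-3, 3), as in the abstract."""
    return [(rng.uniform(0.0, 2.0), rng.uniform(-3.0, 3.0)) for _ in range(n_items)]


def select_next_item(theta_hat, bank, administered):
    """Return the index of the unused item with maximum Fisher information at theta_hat."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta_hat, *bank[i]))


rng = random.Random(0)
bank = make_item_bank(300, rng)        # bank size is a hypothetical choice
theta_start = rng.uniform(-0.5, 0.5)   # assumed reading of the U(-0.5, 0.5) start
administered = set()
first_item = select_next_item(theta_start, bank, administered)
administered.add(first_item)
```

In a full simulation this selection would alternate with ability re-estimation (MLE or EAP) until 30 items have been administered; that loop and the review conditions are omitted here.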

Results indicated that: (1) under the normal strategy, the BIP method estimated ability more precisely than the IP method at low ability levels; (2) against a Wainer-like strategy, the BIP method was far more precise than the IP method at all ability levels; (3) as the number of blocks increased, estimation precision moved closer to that of the non-review condition. The advantages of this new method and future directions are discussed.
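Precision comparisons of this kind rest on the BIAS, MAE, and RMSE criteria. The abstract does not give their formulas, so the sketch below uses the conventional definitions (mean signed error, mean absolute error, and root mean squared error between estimated and true abilities) as an assumption. The function names and the toy ability values are hypothetical.

```python
import math


def bias(theta_hat, theta_true):
    """Mean signed error: average of (estimate - true value)."""
    n = len(theta_true)
    return sum(e - t for e, t in zip(theta_hat, theta_true)) / n


def mae(theta_hat, theta_true):
    """Mean absolute error: average of |estimate - true value|."""
    n = len(theta_true)
    return sum(abs(e - t) for e, t in zip(theta_hat, theta_true)) / n


def rmse(theta_hat, theta_true):
    """Root mean squared error: square root of the average squared error."""
    n = len(theta_true)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(theta_hat, theta_true)) / n)


# Toy values for illustration only, not results from the study.
theta_true = [-1.0, 0.0, 1.0]
theta_hat = [-0.5, 0.0, 0.5]
summary = {
    "BIAS": bias(theta_hat, theta_true),
    "MAE": mae(theta_hat, theta_true),
    "RMSE": rmse(theta_hat, theta_true),
}
```

In this toy case the estimates are pulled toward zero symmetrically, so the signed errors cancel in BIAS while MAE and RMSE remain positive. This is why the three criteria are usually reported together.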

Keywords: computerized adaptive testing, item review, item pocket method, answer change, block item pocket method

LIN Zhe; CHEN Ping; XIN Tao. (2015). The Block Item Pocket Method to Allow Item Review in CAT. Acta Psychologica Sinica, 47(9), 1188-1198.

