ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2006, Vol. 38 ›› Issue (05): 778-783.


The Comparison Among Item Selection Strategies of CAT with Multiple-choice Items

Dai Haiqi, Chen Dezhi, Ding Shuliang, Deng Taiping

  1. College of Education, Jiangxi Normal University, Nanchang 330027, China
  • Received: 2004-10-25 Revised: 1900-01-01 Online: 2006-09-30 Published: 2006-09-30
  • Contact: Dai Haiqi

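Summary: This study compared five item selection strategies for computerized adaptive testing with polytomously scored items. The IRT model used was Samejima's graded response model. The strategies compared were matching the mean item difficulty to ability, matching the median item difficulty to ability, maximum information, and two a-stratified methods. Four evaluation indices were used: the deviation of the ability estimates from the true values, the standard deviation of the ability estimates, the average number of items administered per examinee, and the standard deviation of the number of times each item was selected. The study used Monte Carlo simulation. The results showed that each method has its strengths and weaknesses; when the item bank is stratified appropriately, the a-stratified method (median) gives the best overall performance.

Keywords: computerized adaptive testing, item selection strategy, graded response model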
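
The three non-stratified strategies can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the logistic form of Samejima's graded response model without a scaling constant, items stored as hypothetical `(a, bs)` pairs (discrimination and ascending boundary difficulties), and the rule labels ADMA, MMDA and MI used in the English abstract. The function names and the toy item bank are illustrative only.

```python
import math
from statistics import mean, median

def boundary_probs(theta, a, bs):
    """P*_k: probability of scoring in category k or higher, padded with 1 and 0."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs]
    return [1.0] + inner + [0.0]

def category_probs(theta, a, bs):
    """Probability of each score category as a difference of adjacent boundaries."""
    p = boundary_probs(theta, a, bs)
    return [p[k] - p[k + 1] for k in range(len(p) - 1)]

def item_information(theta, a, bs):
    """Item information: sum over categories of (dP_k/dtheta)^2 / P_k."""
    p = boundary_probs(theta, a, bs)
    info = 0.0
    for k in range(len(p) - 1):
        pk = p[k] - p[k + 1]
        dpk = a * (p[k] * (1 - p[k]) - p[k + 1] * (1 - p[k + 1]))
        if pk > 0:
            info += dpk * dpk / pk
    return info

def select_item(theta, bank, used, rule):
    """Pick the next unused item index under rule 'ADMA', 'MMDA' or 'MI'."""
    candidates = [i for i in range(len(bank)) if i not in used]
    if rule == "MI":
        # Maximum information at the current ability estimate.
        return max(candidates, key=lambda i: item_information(theta, *bank[i]))
    # Difficulty matching: mean (ADMA) or median (MMDA) of the boundaries.
    center = mean if rule == "ADMA" else median
    return min(candidates, key=lambda i: abs(center(bank[i][1]) - theta))

# Hypothetical toy bank: (discrimination a, ascending boundary difficulties bs).
demo_bank = [
    (1.0, [-1.5, -0.5, 0.5]),
    (2.0, [-0.2, 0.0, 0.8]),
    (0.8, [0.0, 0.1, 0.2]),
]

for rule in ("ADMA", "MMDA", "MI"):
    print(rule, select_item(0.0, demo_bank, set(), rule))
```

On this toy bank the mean-matching and median-matching rules choose different items because the second item's boundaries are asymmetric, which is the kind of difference the comparison is designed to expose.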

Abstract: The initial purpose of comparing item selection strategies was to increase test efficiency. As research continued, however, it became clear that using the item bank efficiently was an equally important goal. These two goals often conflict, so the key is to find a strategy that serves both.
The item selection strategies compared in this study, all under the graded response model, were: matching the average of an item's difficulty parameters to ability (ADMA); matching the median of an item's difficulty parameters to ability (MMDA); maximum information (MI); a-stratified (average) (ASA); and a-stratified (median) (ASM). The evaluation indices were: the bias of the ability estimates relative to the true values (Bias); the standard error of the ability estimates (Se); the average number of items administered per examinee (Aiea); the standard deviation of the item selection frequencies (Sdf); and the weighted sum of the indices (Siw). Using Monte Carlo simulation, we generated data and repeated the simulation 20 times under each of two conditions: item difficulty parameters following a normal distribution and following a uniform (even) distribution. The results were as follows:

Table 1.
Results with item difficulty parameters following a normal distribution (ranks in parentheses)
Strategy   Bias         Se           Aiea          Sdf            Siw
ADMA       0.2989 (5)   0.4218 (4)   26.7391 (4)   72.8872 (4)    3.2800
MMDA       0.2668 (2)   0.3541 (3)   27.5874 (5)   58.0541 (2)    3.7076
MI         0.2638 (1)   0.4361 (5)   22.6463 (1)   127.3360 (5)   3.1851
ASA        0.2764 (4)   0.3313 (2)   22.9780 (2)   67.7299 (3)    3.7570
ASM        0.2711 (3)   0.3181 (1)   23.0762 (3)   58.0313 (1)    3.9545

Table 2.
Results with item difficulty parameters following a uniform distribution (ranks in parentheses)
Strategy   Bias         Se           Aiea          Sdf            Siw
ADMA       0.2423 (3)   0.2857 (3)   29.6080 (5)   38.2600 (3)    3.0083
MMDA       0.2372 (2)   0.2850 (2)   28.7977 (4)   43.2547 (4)    2.9446
MI         0.1646 (1)   0.2022 (1)   24.1035 (3)   141.7506 (5)   3.1521
ASA        0.2709 (4)   0.3220 (4)   22.0724 (1)   38.0857 (2)    3.1154
ASM        0.2745 (5)   0.3260 (5)   22.2843 (2)   33.5079 (1)    3.2103
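
As a rough illustration of how indices such as Bias, Aiea and Sdf might be computed from simulated test sessions, the sketch below makes several assumptions that the abstract does not confirm: Bias is taken as the mean absolute difference between estimated and true ability, Sdf as the population standard deviation of per-item selection counts, and each session is a hypothetical list of administered item indices. The Se and Siw indices are omitted because their exact definitions and weights are not given here.

```python
from statistics import mean, pstdev

def mean_abs_bias(true_thetas, est_thetas):
    """Assumed reading of Bias: mean absolute deviation of estimates from true values."""
    return mean(abs(e - t) for t, e in zip(true_thetas, est_thetas))

def average_test_length(sessions):
    """Aiea: average number of items administered per examinee."""
    return mean(len(items) for items in sessions)

def selection_frequency_sd(n_items, sessions):
    """Assumed reading of Sdf: population SD of how often each bank item was selected."""
    counts = [0] * n_items
    for items in sessions:
        for i in items:
            counts[i] += 1
    return pstdev(counts)

# Hypothetical data: three examinees, a four-item bank.
true_thetas = [0.0, 1.0, -1.0]
est_thetas = [0.5, 0.5, -1.0]
sessions = [[0, 1], [1, 2], [1]]

print(mean_abs_bias(true_thetas, est_thetas))
print(average_test_length(sessions))
print(selection_frequency_sd(4, sessions))
```

A low Sdf means that selections are spread evenly across the bank, which is why MI, with the largest Sdf in both tables, ranks last on that index.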

The results indicated that, whether the difficulty parameters followed a normal or a uniform distribution, every item selection strategy examined in this research had its strengths and weaknesses. In the overall evaluation, provided that the items were stratified appropriately, the a-stratified (median) strategy (ASM) performed best.

Key words: CAT, item selection strategy, Graded Response Model
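
The two stratified strategies (ASA and ASM) can be sketched under the usual a-stratified design, which is assumed here rather than taken from the paper: the bank is split into strata by ascending discrimination, the test proceeds stage by stage through the strata, and within the current stratum the item whose average or median difficulty is closest to the ability estimate is chosen. The `(a, bs)` item layout, the equal-size strata and all names are hypothetical.

```python
import math
from statistics import mean, median

def stratify_by_a(bank, n_strata):
    """Split item indices into strata of (nearly) equal size by ascending discrimination."""
    order = sorted(range(len(bank)), key=lambda i: bank[i][0])
    size = math.ceil(len(order) / n_strata)
    return [order[s * size:(s + 1) * size] for s in range(n_strata)]

def select_a_stratified(theta, bank, strata, stage, used, center=median):
    """Within the current stage's stratum, match item difficulty to theta.

    center=mean corresponds to ASA, center=median to ASM.
    """
    candidates = [i for i in strata[stage] if i not in used]
    return min(candidates, key=lambda i: abs(center(bank[i][1]) - theta))

# Hypothetical six-item bank: (discrimination a, ascending boundary difficulties bs).
demo_bank = [
    (1.2, [-0.5, 0.0, 0.5]),
    (0.5, [-1.0, 0.0, 1.6]),
    (2.0, [-0.3, 0.2, 0.9]),
    (0.8, [-1.2, -0.4, 0.1]),
    (1.5, [0.1, 0.6, 1.4]),
    (0.6, [-0.4, 0.2, 0.5]),
]
demo_strata = stratify_by_a(demo_bank, 3)

print(demo_strata)
print(select_a_stratified(0.0, demo_bank, demo_strata, 0, set(), center=mean))
print(select_a_stratified(0.0, demo_bank, demo_strata, 0, set(), center=median))
```

Starting with low-discrimination strata keeps the most informative items in reserve for later stages, when the ability estimate is more stable; this is consistent with the more even item usage (lower Sdf) reported for ASA and ASM.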
