ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2008, Vol. 40 ›› Issue (06): 737-747.

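Computerized Adaptive Testing that Allows Reviewing and Changing Answers

CHEN Ping; DING Shu-Liang

  1. Developmental Psychology Institute, Beijing Normal University, Beijing 100875, China

  2. Computer Information Engineering College, Jiangxi Normal University, Nanchang 330027, China

  • Received: 2007-08-27 Published: 2008-06-30 Online: 2008-06-30
  • Corresponding author: DING Shu-Liang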

Research on Computerized Adaptive Testing that Allows Reviewing and Changing Answers

CHEN Ping; DING Shu-Liang

  1. Developmental Psychology Institute, Beijing Normal University, Beijing 100875, China

  2. Computer Information Engineering College, Jiangxi Normal University, Nanchang 330027, China

  • Received: 2007-08-27 Published: 2008-06-30 Online: 2008-06-30
  • Corresponding author: DING Shu-Liang

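Chinese abstract: A computer simulation program was used to study computerized adaptive testing (CAT) that allows examinees to review and change their answers, and a new scoring method was adopted to counter the Wainer strategy. The results showed that jointly considering an examinee's two sets of responses yields more accurate ability estimates. Most examinees made changes, but only a small share of answers were changed, and most of the changed answers went from incorrect to correct. Combining the expected a posteriori (EAP) and maximum likelihood (MLE) ability estimates from a CAT in which the Wainer strategy was used can "roughly" counter the Wainer strategy.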

Chinese key words: computerized adaptive testing, item review, Wainer strategy, Monte Carlo simulation

Abstract: In the past decade, computerized adaptive testing (CAT) has replaced some paper-and-pencil (P&P) tests in many large-scale standardized testing programs. However, much of the research on CAT and many of its applications have been limited because most CATs did not allow examinees to review and change their answers. Among operational CAT programs, only one incorporated item review. Not allowing examinees to review and change their answers adds test pressure and affects their performance. Moreover, a majority of examinees showed a clear preference for item review: they believed that including it made the test fairer and considered it a disadvantage when review was disallowed. CAT organizers disallowed review mainly because they feared that examinees would use the deceptive Wainer strategy in the review stage to obtain positively biased ability estimates, thereby harming the fairness and precision of the test. A solution that both allowed examinees to review and change answers and could deal with the Wainer strategy would be of great significance for the development of CAT. Until now, there have been few relevant studies on this topic worldwide. Moreover, the previous studies had a non-negligible disadvantage: the researchers recorded only the answers of the review stage and used them as the basis for scoring, without considering the answers of the adaptive stage. We assumed that jointly considering the answers before and after review could produce more accurate ability estimates. Therefore, this paper employed a new scoring method and attempted to deal with the Wainer strategy.

This study involved two experiments. Experiment 1 used the Monte Carlo method to simulate the entire process of CAT that allows the reviewing and changing of answers, with the aim of investigating the influence of different beta values on ability estimation. Experiment 2 used simulation data generated by the Monte Carlo method to evaluate the effectiveness of the Wainer strategy and attempted to deal with the strategy by using a new scoring method.
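The kind of simulation described for Experiment 1 can be sketched as follows. This is only a minimal illustration and not the authors' program: the two-parameter logistic (2PL) model, the maximum-information item selection, the item bank, and the review rule (each incorrect answer is changed to correct with a hypothetical probability `p_fix`) are all assumptions made for the example, and the beta weighting used in the paper is not reproduced because the abstract does not define it.

```python
import math
import random

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def eap(scored_items, grid):
    """EAP ability estimate on a grid, with a standard normal prior."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for (a, b), u in scored_items:
            p = p_correct(t, a, b)
            w *= p if u else 1.0 - p
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

def simulate_cat(theta_true, bank, test_length, rng, p_fix=0.3):
    """Simulate one examinee: adaptive stage, then a hypothetical review stage."""
    grid = [-4.0 + 0.1 * i for i in range(81)]
    theta_hat = 0.0
    remaining = list(bank)
    first = []
    for _ in range(test_length):
        # Adaptive stage: administer the most informative unused item.
        item = max(remaining, key=lambda ab: item_information(theta_hat, *ab))
        remaining.remove(item)
        u = rng.random() < p_correct(theta_true, *item)
        first.append((item, u))
        theta_hat = eap(first, grid)
    # Review stage (assumed rule): a wrong answer becomes correct with prob. p_fix.
    second = [(item, u or rng.random() < p_fix) for item, u in first]
    return eap(first, grid), eap(second, grid), first, second

rng = random.Random(2008)
bank = [(rng.uniform(0.8, 2.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
before, after, first, second = simulate_cat(0.5, bank, 20, rng)
changed = sum(u1 != u2 for (_, u1), (_, u2) in zip(first, second))
print(f"EAP before review: {before:.3f}, after review: {after:.3f}, changed: {changed}")
```

Repeating `simulate_cat` over many simulated examinees and comparing both estimates with `theta_true` gives the bias and error summaries that a Monte Carlo study of this kind would report.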
The simulation results of Experiment 1 indicated the following. First, jointly considering the answers before and after review did produce more accurate ability estimates, and the most accurate estimates occurred when beta = 0.50. Second, 66.80% of examinees changed answers, 6.40% of all answers were changed, and 75% of the changed answers went from incorrect to correct. Experiment 2 indicated that, under the new scoring method, the ability estimates from a CAT in which the Wainer strategy was used diverged clearly from the true ability values, and the bias increased as the true ability value increased.
The new scoring method employed in this study could not deal effectively with the Wainer strategy because of abnormal ability estimates and abnormal estimated standard errors. However, a simulation experiment showed that, when beta = 0, jointly considering the expected a posteriori (EAP) and maximum likelihood estimation (MLE) ability estimates from a CAT in which the Wainer strategy was used succeeded in roughly dealing with the strategy. Our future work involves developing a more accurate method to deal with the Wainer strategy.
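To make the two estimators named above concrete, the sketch below computes a grid-based MLE and EAP for one response pattern. It is an illustration under stated assumptions, not the paper's procedure: the 2PL model, the standard normal prior, the hypothetical item parameters, and the all-correct pattern on very easy items (the pattern the Wainer strategy tends to produce, since an examinee who deliberately answers wrongly is routed to easy items and then corrects the answers during review) are all chosen for the example. How the paper combines the two estimates is not stated in the abstract and is not shown here.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a scored response pattern at ability theta."""
    ll = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        ll += math.log(p) if u else math.log(1.0 - p)
    return ll

def grid_mle(items, responses, grid):
    """Grid point that maximizes the likelihood (bounded stand-in for the MLE)."""
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

def grid_eap(items, responses, grid):
    """Posterior mean and posterior SD with a standard normal prior."""
    post = [math.exp(log_likelihood(t, items, responses) - 0.5 * t * t) for t in grid]
    total = sum(post)
    mean = sum(t * w for t, w in zip(grid, post)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, post)) / total
    return mean, math.sqrt(var)

grid = [-4.0 + 0.05 * i for i in range(161)]
# Hypothetical easy items: discrimination 1.0, difficulties from -3.0 to -1.1.
easy_items = [(1.0, -3.0 + 0.1 * i) for i in range(20)]
all_correct = [1] * 20

mle = grid_mle(easy_items, all_correct, grid)
eap_mean, eap_sd = grid_eap(easy_items, all_correct, grid)
print(f"grid MLE: {mle:.2f}  EAP: {eap_mean:.2f}  posterior SD: {eap_sd:.2f}")
```

For an all-correct pattern the likelihood keeps increasing with ability, so the grid MLE sits at the upper bound of the grid, while the EAP stays finite because the prior pulls it back. A gap of this kind between the two estimates is one plausible signal that a response pattern is unusual.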

Key words: computerized adaptive testing, item review, Wainer strategy, Monte Carlo simulation
