ISSN 1671-3710
CN 11-4766/R
Sponsor: Institute of Psychology, Chinese Academy of Sciences
Publisher: Science Press

Advances in Psychological Science ›› 2023, Vol. 31 ›› Issue (10): 1966-1980. doi: 10.3724/SP.J.1042.2023.01966

• Research Methods •

Test mode effect: Sources, detection, and applications

CHEN Ping (陈平), DAI Yi (代艺), HUANG Yingshi (黄颖诗)

  1. Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University, Beijing 100875, China
  • Received: 2023-01-10 Published: 2023-10-15 Online: 2023-07-25
  • Corresponding author: CHEN Ping, E-mail: pchen@bnu.edu.cn
  • Funding:
    General Program of the National Natural Science Foundation of China (32071092); Independent Project of the Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University (2022-01-082-BZK01)

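Summary:

The test mode effect (TME) refers to the difference in test functioning that arises when the same test is administered in different test modes. The existence of TME affects test fairness, selection criteria, and test equating, so detecting TME accurately and interpreting it reasonably is of great importance. This article systematically reviews the sources of TME, its detection (including experimental designs and detection methods), and the research findings to date, and thereby presents the methodology of TME research in full. Further interpretation of TME models, extension of the test modes considered in TME research, and application of TME findings to large-scale educational assessment programs in China are all important future directions for the field.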


Abstract:

International large-scale assessment programs (e.g., PISA, TIMSS, and NAEP), as well as small classroom tests, increasingly use computers to administer tests. The test mode is shifting from traditional “paper-based testing (PBT)” to “computer-based testing (CBT)”. Before changing the test mode, researchers and practitioners face a key issue: when the same test is administered in different test modes (such as PBT and CBT), the results are not necessarily equivalent, so they cannot be compared directly without first checking for mode differences. Such a difference in test functioning caused by administering the same test in different test modes is referred to as the test mode effect (TME). The existence of TME affects test fairness, selection criteria, and test equating, so it is important to detect and interpret TME accurately.

This review systematically traces the whole process of TME from generation to detection, and captures the research ideas and development trends of the field by summarizing, evaluating, and comparing the sources, experimental designs, and detection methods of TME. First, the sources of TME are organized at four levels (test, item, test-taker, and rater), and the ways in which differences at each level give rise to TME are analyzed. Second, three experimental designs used to control for test-taker characteristics in TME research (between-group design, within-group design, and balanced incomplete block design) are outlined, along with the scenarios to which each applies. Third, four TME detection methods are introduced: analysis of variance (ANOVA), multi-group confirmatory factor analysis (MCFA), differential item functioning (DIF), and the mode effect model (MEM); their advantages and disadvantages, scope of application, and implementation are summarized and discussed. Finally, research findings in the field of TME over the past 40 years are summarized and analyzed.
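As a rough illustration of the simplest of the four detection approaches, a between-group comparison of total scores across test modes can be sketched as a one-way ANOVA. The scores, the variable names, and the helper `one_way_anova_f` below are hypothetical and are not taken from the article; the sketch assumes independent groups of test-takers and says nothing about item-level or latent-variable differences, which MCFA, DIF, and MEM are meant to address.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Assumes each group has at least one score and that the
    within-group variance is not zero.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical total scores for two independent groups of test-takers
pbt_scores = [12, 15, 14, 10, 13]  # paper-based testing
cbt_scores = [11, 12, 10, 9, 13]   # computer-based testing

f_stat, df_b, df_w = one_way_anova_f([pbt_scores, cbt_scores])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F relative to the F distribution with the reported degrees of freedom would suggest a mean difference between modes. In practice a library routine such as SciPy's `scipy.stats.f_oneway` would also supply the p-value.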

Several future directions for research on TME can be identified. First, the interpretability and applicability of the MEM method can be further enhanced by incorporating factors related to the sources of TME into the existing MEM. Second, the range of test modes covered by TME research should be expanded: TME may also occur between PBT and other test modes, including mobile-based assessment, telephone or face-to-face interviews, game-based assessment, and virtual and augmented reality-based assessments. Third, the rich body of TME research findings can be applied to large-scale educational assessment programs in China to promote the use and development of CBT.

In summary, this article (1) starts from practical issues, emphasizing the impact of TME on test fairness, selection criteria, and test equating, which should help draw the attention of test users; (2) systematically introduces and compares four methods for detecting TME, giving researchers a reference for choosing and applying a method in practice; and (3) reviews the sources, detection (including experimental designs and detection methods), and future research directions of TME, providing a complete research framework for follow-up work.

Key words: test mode effect, test fairness, measurement invariance, computer-based testing
