ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science ›› 2023, Vol. 31 ›› Issue (10): 1966-1980. doi: 10.3724/SP.J.1042.2023.01966

• Research Method •

Test mode effect: Sources, detection, and applications

CHEN Ping, DAI Yi, HUANG Yingshi

  1. Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University, Beijing 100875, China
  • Received: 2023-01-10 Online: 2023-10-15 Published: 2023-07-25

Abstract:

International large-scale assessment programs (e.g., PISA, TIMSS, and NAEP), as well as small-scale classroom tests, increasingly administer tests by computer. The test mode is shifting from traditional paper-based testing (PBT) to computer-based testing (CBT). Before changing the test mode, researchers and practitioners face a key issue: when the same test is administered in different modes (such as PBT and CBT), the results are not necessarily equivalent and therefore cannot be compared directly without verification. Such a difference in test functioning caused by administering the same test in different modes is referred to as the test mode effect (TME). TME affects test fairness, selection criteria, and test equating, so detecting and interpreting it accurately is important.

This review traces TME from its generation to its detection and outlines the research approaches and development trends in the field by summarizing and comparing the sources, experimental designs, and detection methods of TME. First, the sources of TME are organized at four levels (test, item, examinee, and rater), and the ways in which differences at each level give rise to TME are analyzed. Second, three experimental designs used to control for examinee characteristics in TME research (between-group design, within-group design, and balanced incomplete block design) are outlined, along with the scenarios to which each applies. Third, four TME detection methods are introduced: analysis of variance (ANOVA), multi-group confirmatory factor analysis (MCFA), differential item functioning (DIF), and the mode effect model (MEM); their advantages and disadvantages, scope of application, and implementation are summarized and discussed. Finally, research findings on TME from the past 40 years are summarized and analyzed.
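
As an illustration of the DIF approach to mode effects, the sketch below computes a Mantel-Haenszel common odds ratio for a single item, treating PBT as the reference group and CBT as the focal group and matching examinees on total score. The Mantel-Haenszel procedure is one widely used DIF technique; the abstract does not say which DIF procedure the article covers, so the choice here is an assumption. The function names, the record layout, and the use of total score as the matching variable are likewise hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(records, reference_mode="PBT"):
    """Common odds ratio for one item across matched score strata.

    records: iterable of (mode, matching_score, correct) tuples, where
    correct is 1 or 0. Any mode other than reference_mode is treated as
    the focal group (an assumption of this sketch).
    """
    # Per stratum: [ref_right, ref_wrong, focal_right, focal_wrong]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for mode, matching_score, correct in records:
        offset = 0 if mode == reference_mode else 2
        strata[matching_score][offset + (0 if correct else 1)] += 1

    numerator = denominator = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata.values():
        n = ref_right + ref_wrong + foc_right + foc_wrong
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n

    # The ratio is undefined when no stratum contributes to the denominator.
    return numerator / denominator if denominator > 0 else None


def mh_d_dif(alpha):
    """Express the odds ratio on the ETS delta scale: -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = (
        [("PBT", 1, 1)] * 3 + [("PBT", 1, 0)]
        + [("CBT", 1, 1)] + [("CBT", 1, 0)] * 3
        + [("PBT", 2, 1)] * 2 + [("PBT", 2, 0)] * 2
        + [("CBT", 2, 1)] * 2 + [("CBT", 2, 0)] * 2
    )
    alpha = mantel_haenszel_odds_ratio(toy)
    print(alpha, mh_d_dif(alpha))
```

An odds ratio above 1 means that, among examinees with the same matching score, the item was easier in the reference mode (PBT) than in the focal mode (CBT); a value near 1 suggests no mode-related DIF for that item. A real analysis would also need a significance test and an effect-size classification, which this sketch omits.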

Several future directions for research on TME can be identified. First, the interpretability and applicability of the MEM method can be further enhanced by incorporating factors related to the sources of TME into the existing MEM. Second, the range of test modes covered by TME research should be expanded, because TME may also occur between PBT and other test modes, including mobile-based assessment, telephone or face-to-face interviews, game-based assessment, and virtual and augmented reality-based assessment. Third, the rich body of TME research findings can be applied to large-scale educational assessment programs in China to promote the use and development of CBT.

In summary, this article (1) starts from practical issues, emphasizing the impact of TME on test fairness, selection criteria, and test equating, which should help draw test users' attention to the problem; (2) systematically introduces and compares four methods for detecting TME, giving researchers a reference for choosing and applying them in practice; and (3) organizes the sources, detection (including experimental designs and detection methods), and future research directions of TME, providing a complete research framework for follow-up studies.

Key words: test mode effect, test fairness, measurement invariance, computer-based testing
