ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2021, Vol. 53 ›› Issue (9): 1044-1058. doi: 10.3724/SP.J.1041.2021.01044

• Research Article •

两种新的多维计算机化分类测验终止规则

任赫, 陈平

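  1. Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University, Beijing 100875, China
  • Received: 2020-06-04 Issue date: 2021-09-25 Online release: 2021-07-22
  • Corresponding author: CHEN Ping E-mail: pchen@bnu.edu.cn
  • Funding:
    National Natural Science Foundation of China, General Program (32071092); Research Fund for Basic Education Quality Monitoring, Collaborative Innovation Center of Assessment for Basic Education Quality (2019-01-082-BZK01); Research Fund for Basic Education Quality Monitoring, Collaborative Innovation Center of Assessment for Basic Education Quality (2019-01-082-BZK02); Independent Project of the Collaborative Innovation Center of Assessment for Basic Education Quality (BJZK-2019A2-19003)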

Two new termination rules for multidimensional computerized classification testing

REN He, CHEN Ping

  1. Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University, Beijing 100875, China
  • Received: 2020-06-04 Online: 2021-09-25 Published: 2021-07-22
  • Contact: CHEN Ping E-mail: pchen@bnu.edu.cn

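Summary:

Because it is built for classification, computerized classification testing (CCT) is now widely used in tests whose purpose is to classify, such as professional qualification examinations and health and nursing questionnaires. As an important component of CCT, the termination rule not only determines the conditions under which a test stops but also directly affects classification accuracy and test efficiency. However, few studies to date have explored termination rules for multidimensional CCT (MCCT). To address the shortcomings of existing MCCT termination rules, this study proposes two new MCCT termination rules (the Mahalanobis distance-based multidimensional sequential probability ratio test, Mahalanobis-SPRT, and the multidimensional generalized likelihood ratio rule with stochastic curtailment, M-SCGLR) and conducts simulation studies to examine their performance under different experimental conditions (e.g., different item bank structures, correlations between ability dimensions, and classification boundary functions). The results show that: (1) when a compensatory boundary function is used, the Mahalanobis-SPRT rule achieves relatively high classification accuracy with a test length similar to that of comparable methods; (2) under almost all experimental conditions, the M-SCGLR rule not only substantially outperforms the existing multidimensional stochastic curtailment rule in accuracy but also yields shorter tests.

Keywords: computerized classification testing, termination rule, multidimensional item response theory, Mahalanobis distance, stochastic curtailment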

Abstract:

Computerized classification testing (CCT) is a subset of computerized adaptive testing (CAT) that aims to classify examinees into one of at least two categories, such as pass/fail or non-mastery/partial mastery/mastery. CCTs therefore focus on maximizing classification accuracy, whereas CATs are designed for precise measurement. The termination rule is one of the key components of CCT. However, as pointed out by Nydick (2013), most CCTs were designed under unidimensional item response theory (IRT) (i.e., UCCTs), whose unidimensionality assumption is easily violated in practice. Researchers therefore began to construct termination rules for multidimensional CCT (MCCT) based on multidimensional IRT. To date, however, these rules still have deficiencies in classification accuracy or test efficiency.

Most current studies on MCCT termination rules build on UCCT termination rules. In UCCT, a termination rule requires a cut point, $\theta_0$, on the latent trait in order to calculate its statistic; when the rule is extended from UCCT to MCCT, the cut point becomes a classification boundary curve or even a surface (i.e., $g(\theta)=0$). The question then is how to convert the curve or surface into a cut point $\theta_0$. To this end, the projected sequential probability ratio test (P-SPRT), the constrained SPRT (C-SPRT; Nydick, 2013), and the multidimensional generalized likelihood ratio (M-GLR) were proposed, each solving the problem in a different way. P-SPRT and C-SPRT choose a specific point on $g(\theta)=0$ as the approximate cut point, $\hat{\theta}_0$, by projecting in Euclidean space or by constraining on $g(\theta)=0$, respectively; M-GLR can be employed in MCCT directly, because the generalized likelihood ratio statistic can be calculated without a cut point. To overcome the limitation that P-SPRT may produce unstable results at the beginning of the test, this study proposed the Mahalanobis distance-based SPRT (Mahalanobis-SPRT).
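The abstract gives no formulas, so the following is only a minimal sketch of the idea of picking an approximate cut point on the boundary and then running an SPRT around it. It assumes a linear compensatory boundary $a^\top\theta = c$, a compensatory two-parameter logistic item model, and hypothesis points placed a distance `delta` on either side of the cut point along the boundary normal. The function names, the choice of `cov`, and all default values are illustrative assumptions, not the authors' specification.

```python
import numpy as np

def project_to_boundary(theta_hat, a, c, cov=None):
    """Closest point to theta_hat on the linear boundary a @ theta = c.

    cov=None gives the Euclidean projection; otherwise the distance is the
    Mahalanobis distance induced by `cov` (which matrix to use is an assumption).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    a = np.asarray(a, dtype=float)
    cov = np.eye(theta_hat.size) if cov is None else np.asarray(cov, dtype=float)
    direction = cov @ a
    step = (a @ theta_hat - c) / (a @ direction)
    return theta_hat - step * direction

def m2pl_loglik(theta, A, d, x):
    """Log-likelihood of binary responses x under a compensatory M2PL model."""
    logits = np.asarray(A, dtype=float) @ theta + np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(x * logits - np.logaddexp(0.0, logits)))

def sprt_decision(theta_hat, A, d, x, a, c, delta=0.2, alpha=0.05, beta=0.05, cov=None):
    """SPRT around the projected cut point, using Wald's thresholds."""
    a = np.asarray(a, dtype=float)
    theta0 = project_to_boundary(theta_hat, a, c, cov)
    unit = a / np.linalg.norm(a)  # assumed direction for the two hypothesis points
    llr = m2pl_loglik(theta0 + delta * unit, A, d, x) - m2pl_loglik(theta0 - delta * unit, A, d, x)
    upper = np.log((1.0 - beta) / alpha)
    lower = np.log(beta / (1.0 - alpha))
    if llr >= upper:
        return "above"
    if llr <= lower:
        return "below"
    return "continue"
```

With `cov=None` the projection is purely Euclidean. Passing a covariance matrix changes which boundary point counts as closest, which is the only difference between the two variants in this sketch.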

In addition, stochastic curtailment is a technique for shortening a test by predicting whether an examinee's classification would change if the test continued. This article also combined M-GLR with stochastic curtailment and proposed the M-GLR with stochastic curtailment (M-SCGLR).
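The general idea of stochastic curtailment can be illustrated with a small Monte Carlo sketch: simulate the remaining responses at the current ability estimate and stop early when the final classification would very likely match the current one. This is a generic illustration, not the M-SCGLR statistic itself. The function name, the `classify` callback, the compensatory logistic item model, and the threshold `gamma` are all assumptions made for the example.

```python
import math
import random

def stochastic_curtailment(theta_hat, seen, rest_items, classify,
                           gamma=0.95, n_sim=500, seed=0):
    """Return (stop, share): stop is True when the simulated share of full-length
    tests that keep the current classification is at least gamma.

    seen       -- binary responses observed so far
    rest_items -- list of (a, d) pairs for items not yet administered,
                  with a a sequence of discriminations and d an intercept
    classify   -- hypothetical callback mapping a response list to a category
    """
    rng = random.Random(seed)
    current = classify(list(seen))
    # Success probability of each remaining item at the current estimate.
    probs = []
    for a, d in rest_items:
        logit = sum(ak * tk for ak, tk in zip(a, theta_hat)) + d
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    agree = 0
    for _ in range(n_sim):
        future = [1 if rng.random() < p else 0 for p in probs]
        if classify(list(seen) + future) == current:
            agree += 1
    share = agree / n_sim
    return share >= gamma, share
```

In this sketch any classifier can be plugged in through `classify`. A real rule would replace it with the likelihood ratio decision at full test length.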

A full-scale simulation study was conducted to (1) compare the Mahalanobis-SPRT and M-SCGLR with the P-SPRT, C-SPRT, M-GLR, and multidimensional stochastically curtailed SPRT (M-SCSPRT) under varying conditions; and (2) compare the classification performance of these six termination rules for participants with specific abilities, to explore whether the rules differ significantly in their sensitivity when classifying specific participants. For the first objective, three levels of correlation between dimensions (ρ = 0, 0.5, and 0.8), two item bank structures (within-item and between-item multidimensionality), and two kinds of classification boundary (compensatory and non-compensatory) were considered; for the second objective, 36 specific ability points $(\theta_1, \theta_2)$ were generated, where $\theta_1, \theta_2 \in \{-0.5, -0.3, -0.1, 0.1, 0.3, 0.5\}$. The results showed that: (1) when the compensatory classification function was used, the Mahalanobis-SPRT led to higher classification accuracy and a test length similar to that of the rules without stochastic curtailment; (2) under almost all conditions, the M-SCGLR not only achieved higher precision but also maintained a short test length, compared with the M-SCSPRT, which also uses stochastic curtailment; (3) the precision and test length of the six termination rules changed in a consistent pattern across the specific ability points.
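The simulation design can be enumerated in a few lines. The sketch below assumes the three factors are fully crossed, which the abstract implies but does not state, and uses illustrative variable names.

```python
from itertools import product

# Factors for the first research objective (assumed fully crossed).
correlations = [0.0, 0.5, 0.8]
bank_structures = ["within-item", "between-item"]
boundaries = ["compensatory", "non-compensatory"]
conditions = list(product(correlations, bank_structures, boundaries))

# Ability grid for the second research objective.
levels = [-0.5, -0.3, -0.1, 0.1, 0.3, 0.5]
ability_points = list(product(levels, repeat=2))

print(len(conditions), len(ability_points))
```

Under the full-crossing assumption this yields 12 conditions, and the 6 × 6 grid yields the 36 ability points mentioned above.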

In sum, this article puts forward two new MCCT termination rules (Mahalanobis-SPRT and M-SCGLR). Although the simulation results are promising, several directions merit further investigation, such as developing MCCT termination rules for more than two categories and constructing MCCT termination rules that incorporate process data such as response time.

Key words: computerized classification testing, termination rule, multidimensional item response theory, Mahalanobis distance, stochastic curtailment
