New item selection methods in cognitive diagnostic computerized adaptive testing: Combining item discrimination indices
GUO Lei1,2,3; ZHENG Chanjin4; BIAN Yufang5; SONG Naiqing3,6; XIA Lingxiang1
(1 Faculty of Psychology, Southwest University, Chongqing 400715, China) (2 Postdoctoral Research Center for Statistics, Southwest University, Chongqing 400715, China) (3 Southwest University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Chongqing 400715, China) (4 School of Psychology, Jiangxi Normal University, Nanchang 330022, China) (5 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China) (6 Center for Basic Education Research, Southwest University, Chongqing 400715, China)
Interest in developing computerized adaptive testing (CAT) under cognitive diagnostic models has increased recently. Cognitive diagnostic computerized adaptive testing (CD-CAT) attempts to classify examinees into the correct latent class profile so as to pinpoint each examinee’s strengths and weaknesses, while the CAT algorithm chooses items from the item bank to achieve that goal as efficiently as possible. Most CD-CAT research uses the posterior-weighted Kullback-Leibler (PWKL) index because of its high efficiency. The PWKL index integrates the posterior probabilities of an examinee’s latent class profiles into the KL information, and thus improves item selection efficiency considerably. However, the PWKL index uses only examinee-based information to assess the relative importance of each latent class profile; in this sense, it should be regarded as a single-source index. The current study attempted to take advantage of not only the examinee-based information but also the item-based information that can be readily obtained from the items. This paper introduces four new multiple-source item selection methods, GIDPWKL, AIDPWKL, CIDPWKL, and KLEDPWKL, each of which modifies the PWKL index by incorporating item discrimination information. Two simulation studies were conducted to evaluate the efficiency of the new methods against the PWKL index and the mutual information (MI) index under the DINA model with exposure control. The effects of three factors were investigated: Q-matrix structure (simple vs. complex), item quality (high vs. low), and test length (moderate vs. short). Simulation results indicated that: (1) In most cases, the shorter the test was, the higher the average attribute correct classification rate (AACCR) and pattern correct classification rate (PCCR) of the four new methods were in the fixed-length test. The GIDPWKL index had the highest AACCR and PCCR among the six methods, followed by the AIDPWKL index. The relative performance of the CIDPWKL, KLEDPWKL, and MI indices depended on the experimental conditions. (2) In most cases, the higher the item quality was, the greater the advantage of the four new methods was in the fixed-length test. (3) The structure of the Q matrix affected the performance of the different item selection methods. (4) In the variable-length test, the mean test length across all examinees was smaller for the four new methods and the MI method than for the PWKL method. Overall, the GIDPWKL index performed best and is recommended for practical use in similar testing scenarios.
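To make the baseline concrete, the following is a minimal sketch of PWKL item selection under the DINA model. It assumes the standard textbook forms of both: a DINA item is answered correctly with probability 1 − s when the examinee masters every attribute the item requires and with probability g otherwise, and the PWKL index is the posterior-weighted sum, over all latent class profiles, of the KL divergence between the item response distribution at the current profile estimate and at each candidate profile. The function names, the two-attribute toy bank, and the parameter values are hypothetical and are not taken from the study; the four new indices are not shown because the abstract does not give their formulas.

```python
import itertools
import numpy as np


def dina_prob(alpha, q, slip, guess):
    """P(correct | alpha) for one DINA item; slip and guess must lie in (0, 1)."""
    # eta is True when the profile masters every attribute the item requires.
    eta = np.all(np.asarray(alpha) >= np.asarray(q), axis=-1)
    return np.where(eta, 1.0 - slip, guess)


def pwkl(q, slip, guess, patterns, posterior, alpha_hat):
    """Posterior-weighted KL index of one item (standard form, assumed here)."""
    p_hat = dina_prob(alpha_hat, q, slip, guess)  # at the current estimate
    p_c = dina_prob(patterns, q, slip, guess)     # at every candidate profile
    # KL divergence between two Bernoulli response distributions.
    kl = (p_hat * np.log(p_hat / p_c)
          + (1 - p_hat) * np.log((1 - p_hat) / (1 - p_c)))
    return float(np.sum(posterior * kl))


def select_item(bank, patterns, posterior, alpha_hat):
    """Return the index of the item with the largest PWKL, plus all scores."""
    scores = [pwkl(q, s, g, patterns, posterior, alpha_hat) for q, s, g in bank]
    return int(np.argmax(scores)), scores


# Hypothetical setup: 2 attributes, hence 4 latent class profiles.
patterns = np.array(list(itertools.product([0, 1], repeat=2)))
posterior = np.full(len(patterns), 0.25)  # uniform posterior before any item
alpha_hat = np.array([1, 1])              # current profile estimate

# Each item is (q-vector, slip, guess); values are illustrative only.
bank = [
    (np.array([1, 0]), 0.3, 0.3),  # lower-quality item
    (np.array([1, 0]), 0.1, 0.1),  # higher-quality item
]

best, scores = select_item(bank, patterns, posterior, alpha_hat)
print(best, scores)
```

In this toy bank both items measure the same attribute, so the smaller slip and guessing parameters of the second item give it the larger index. Note that the weights come only from the examinee's posterior, which is what the abstract means by calling PWKL a single-source index.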