ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica, 2016, Vol. 48, Issue (12): 1600-1611. doi: 10.3724/SP.J.1041.2016.01600


The optimization of test design in Cognitive Diagnostic Assessment

PENG Yafeng1; LUO Zhaosheng1; YU Xiaofeng1; GAO Chunlei1; LI Yujun2   

  (1 School of Psychology, Jiangxi Normal University, Nanchang 330022, China) (2 Center for Studies of Psychological Application/School of Psychology, South China Normal University, Guangzhou 510631, China)
  • Received:2016-03-29 Online:2016-12-24 Published:2016-12-24
  • Contact: LUO Zhaosheng, E-mail: luozs@126.com

Abstract:

Cognitive diagnostic assessment (CDA) is designed to measure students' specific knowledge structures and processing skills so as to provide information about their cognitive strengths and weaknesses. The Q matrix is the foundational and core component of CDA: it characterizes the design of the test construct and content, and it directly affects the classification efficiency of CDA. In this article, we examine how the characteristics of Q matrix design affect the performance of CDA. In a Monte Carlo simulation study, the mean and standard deviation of the pattern match ratio are used to evaluate the classification accuracy and stability of CDA, respectively. Six attribute hierarchical structures (Linear, Convergent, Divergent, Unstructured, Independent and Mixture) are simulated. The results show that: (1) classification accuracy increases with test length; however, a "ceiling effect" emerges once the test length reaches a certain value; (2) the number of R* (the matrix that has the same elements as the reachability matrix) in the Q matrix affects the test's classification accuracy and stability. Stability increases as more R* are included in the Q matrix, and the Q matrix with the maximum odd number of R* has the highest classification accuracy; (3) the average number of attributes measured per item affects classification accuracy and stability, and this effect varies across attribute hierarchy structures.
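The two evaluation criteria named above can be illustrated with a minimal Python sketch. It assumes that the pattern match ratio is the proportion of examinees whose entire attribute pattern is classified correctly, and that the standard deviation is taken across simulation replications using the sample formula; the abstract states neither detail, and the function names `pattern_match_ratio` and `summarize_replications` are hypothetical.

```python
from statistics import mean, stdev


def pattern_match_ratio(true_patterns, estimated_patterns):
    """Proportion of examinees whose full attribute pattern is recovered exactly.

    Assumption: a pattern counts as a match only if every attribute agrees.
    """
    if len(true_patterns) != len(estimated_patterns):
        raise ValueError("both lists must describe the same examinees")
    if not true_patterns:
        raise ValueError("at least one examinee is required")
    matches = sum(
        tuple(t) == tuple(e) for t, e in zip(true_patterns, estimated_patterns)
    )
    return matches / len(true_patterns)


def summarize_replications(pmr_values):
    """Return (accuracy, stability) as the mean and SD of PMR over replications.

    Assumption: the sample standard deviation (n - 1 denominator) is used,
    so at least two replications are needed.
    """
    return mean(pmr_values), stdev(pmr_values)


# Illustrative data only: two examinees, two attributes, one replication.
true_patterns = [(1, 0), (1, 1)]
estimated_patterns = [(1, 0), (0, 1)]
pmr = pattern_match_ratio(true_patterns, estimated_patterns)
```

Under this reading, a higher mean indicates better classification accuracy and a smaller standard deviation indicates a more stable design.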
Based on these results, we offer the following recommendations for test design under different attribute hierarchy structures in CDA: (1) the optimal test length and number of R* (NR*) are, respectively, four times the number of attributes and one R* for Linear and Convergent, five times and one R* for Divergent, six times and three R* for Unstructured, six times and five R* for Independent, and six times and two R* for Mixture; (2) the design of the attributes measured by items other than R* varies across attribute hierarchy structures. (a) For Linear, every pattern in the set of potential items should be measured equally (the set of potential items is the pool of items that probes all combinations of attributes under the corresponding attribute hierarchy structure); (b) for Convergent, the items should mainly measure the attributes on each path of the convergent branch, those attributes with their prerequisite attributes, and the whole hierarchy structure, in that sequence; (c) for Divergent, the items other than R* should mainly measure combinations of the attributes on each path of the divergent branch; (d) for Unstructured, combinations of an attribute and its prerequisite attribute are preferred; (e) for Independent, combinations of any two attributes are recommended; (f) for Mixture, the suggestions above for each hierarchy structure can serve as a reference when designing the part of the test that corresponds to each specific hierarchy structure among the attributes.
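Recommendation (1) can be read as a lookup table, sketched below in Python. The multipliers and R* counts are taken from the text; everything else is an assumption made for illustration: the names `RECOMMENDED`, `reachability`, `rstar_items` and `recommended_design` are hypothetical, the reachability matrix is computed as the transitive closure of the adjacency matrix plus the identity, and each R* is represented as one item per attribute that measures the attribute together with all of its prerequisites.

```python
# (test length / number of attributes, number of R*), as stated in the abstract.
RECOMMENDED = {
    "Linear": (4, 1),
    "Convergent": (4, 1),
    "Divergent": (5, 1),
    "Unstructured": (6, 3),
    "Independent": (6, 5),
    "Mixture": (6, 2),
}


def reachability(adjacency):
    """Transitive closure of (adjacency + identity), via Warshall's algorithm.

    adjacency[i][j] == 1 means attribute i is a direct prerequisite of j.
    """
    k = len(adjacency)
    r = [[1 if i == j or adjacency[i][j] else 0 for j in range(k)]
         for i in range(k)]
    for m in range(k):
        for i in range(k):
            for j in range(k):
                if r[i][m] and r[m][j]:
                    r[i][j] = 1
    return r


def rstar_items(adjacency):
    """Rows are items: item j measures attribute j and all its prerequisites.

    Assumption: this is the transpose of the reachability matrix, so that
    rows index items and columns index attributes.
    """
    r = reachability(adjacency)
    k = len(r)
    return [[r[i][j] for i in range(k)] for j in range(k)]


def recommended_design(structure, adjacency):
    """Return (test length, stacked R* rows, number of remaining items)."""
    multiplier, n_rstar = RECOMMENDED[structure]
    test_length = multiplier * len(adjacency)
    block = rstar_items(adjacency)
    rstar_rows = [row[:] for _ in range(n_rstar) for row in block]
    return test_length, rstar_rows, test_length - len(rstar_rows)


# Illustrative hierarchy only: three attributes in a chain A1 -> A2 -> A3.
linear3 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
length, rstar_rows, n_other = recommended_design("Linear", linear3)
```

The remaining `n_other` items are the ones covered by recommendation (2); the sketch does not generate them because their composition depends on the hierarchy structure.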

Key words: cognitive diagnostic assessment, Q matrix, the design of test construct, classification accuracy, classification stability