ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica ›› 2016, Vol. 48 ›› Issue (10): 1347-1356.

### Using confirmatory compensatory multidimensional IRT models to do cognitive diagnosis

ZHAN Peida; CHEN Ping; BIAN Yufang

1. (Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China)
• Received: 2015-12-21 Published: 2016-10-25 Online: 2016-10-25
• Contact: CHEN Ping, E-mail: pchen@bnu.edu.cn; BIAN Yufang, E-mail: bianyufang66@126.com

Abstract:

Traditional testing methods, such as classical test theory and unidimensional item response theory models (UIRMs), typically provide a single sum score or an overall ability estimate. Advances in psychometrics have focused on measuring multiple dimensions of ability in order to give students more detailed and refined feedback. In recent years, cognitive diagnostic models (CDMs) have received great attention, particularly in educational and psychological measurement. The outcome of a CDM analysis is, for each person, a profile over a set of attributes, α, also called a latent class; this profile provides cognitive diagnostic information about which of the distinct skills underlying a test a student has or has not mastered. During the same period, another family of models, multidimensional IRT models (MIRTMs), which can also provide fine-grained information about students’ strengths and weaknesses in the learning process, has been neglected. MIRTMs differ from CDMs in that their latent variables are continuous (namely, latent traits, θ) rather than categorical (typically binary). Categorical variables in CDMs may, however, be too coarse to describe students’ skills when compared with the continuous latent traits in MIRTMs. Diagnostic measurement is the process of analyzing data from a diagnostic assessment for the purpose of making classification-based decisions. Currently, all testing methods that have a cognitive diagnostic function require substantive information about the attributes involved in specific items. For CDMs in particular, a confirmatory matrix indicating which latent variables are required for each item, often referred to as the Q matrix, is essential for analyzing response data. Such confirmatory matrices also exist in some MIRTMs, for example the scoring matrix in the multidimensional random coefficients multinomial logit model.
It can therefore be inferred that MIRTMs, when formulated as confirmatory models defined by a Q matrix, may also have diagnostic potential. Although some articles have noted this possibility (e.g., Embretson & Yang, 2013; Stout, 2007; Wang & Nydick, 2015), none has actually explored the diagnostic potential of confirmatory MIRTMs (C-MIRTMs). The main reason is presumably that latent traits in MIRTMs are continuous and so cannot be used directly to make classification-based diagnostic decisions. Whether MIRTMs or CDMs, multidimensional models can normally be divided into compensatory and non-compensatory models according to the relationship among dimensions. In compensatory models, a high level on one dimension can compensate for lower levels on the other dimensions. Conversely, non-compensatory models assume that the dimensions are independent or partially independent of one another. Comparatively speaking, compensatory models are more general than non-compensatory models. Thus, owing to space limitations, only two compensatory models were considered in this study: the multidimensional 2-parameter logistic model (M2PLM) and the linear logistic model (LLM). To explore the cognitive diagnostic function of MIRTMs, a confirmatory compensatory M2PLM (CC-M2PLM) was first presented by introducing the Q matrix into the item response function of the M2PLM. A cutoff point (CP) was then used to transform the latent traits estimated by the CC-M2PLM into categorical variables (namely, trans-border attributes). Because this transformation can be carried out after the data analysis, two kinds of results can be reported simultaneously: continuous latent traits and categorical trans-border attributes. Choosing a suitable CP is therefore very important, because different CPs lead to different classification results.
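
The two steps described above, masking the M2PLM item response function with the Q matrix and then dichotomizing the estimated traits at a cutoff point, can be illustrated with a minimal Python sketch. It assumes a slope-intercept parametrization of the M2PLM, which the abstract does not spell out; the function names `cc_m2pl_prob` and `to_attributes` and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def cc_m2pl_prob(theta, a, d, q):
    """Probability of a correct response for one item under a compensatory
    M2PL whose slopes are masked by one row of the Q matrix.
    Assumed form: logit P = d + sum_k a_k * q_k * theta_k."""
    # Only dimensions with q_k = 1 contribute to the linear predictor.
    z = d + sum(a_k * q_k * t_k for a_k, q_k, t_k in zip(a, q, theta))
    return 1.0 / (1.0 + math.exp(-z))

def to_attributes(theta, cutoff=0.0):
    """Dichotomize continuous traits: 1 if theta_k > cutoff, otherwise 0."""
    return [1 if t_k > cutoff else 0 for t_k in theta]

theta = [1.2, -0.5, 0.0]   # hypothetical trait estimates for one student
q_row = [1, 1, 0]          # hypothetical item requiring dimensions 1 and 2 only
p = cc_m2pl_prob(theta, a=[1.0, 0.8, 1.5], d=-0.2, q=q_row)
alpha = to_attributes(theta)
```

Because the sum is taken over all required dimensions, a high value on one dimension can offset a low value on another, which is what makes the sketch compensatory. The strict inequality in `to_attributes` means a trait exactly at the cutoff is classified as non-mastery.
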
A simple pilot study was conducted to find a suitable CP: a test created with the CC-M2PLM but estimated with the LLM revealed that the LLM divided the latent trait distribution approximately in half, with zero on the IRT scale as the boundary between masters (α = 1 if θ > 0) and non-masters (α = 0 if θ ≤ 0). Based on this result, the CP was set to 0 for all dimensions (i.e., CPk = 0). Parameters of the CC-M2PLM and the LLM can be estimated with the mirt and CDM packages in R, respectively. A series of simulations was then conducted to evaluate the cognitive diagnostic function of the CC-M2PLM. The response data were generated by the LLM and can therefore be treated as a diagnostic measurement dataset. Both the CC-M2PLM and the LLM were fitted to this dataset. The results showed that the pattern (profile) correct classification ratio (PCCR) and the attribute correct classification ratio (ACCR) of the trans-border attributes (from the CC-M2PLM) and of the estimated attributes (from the LLM) were almost the same, with most differences smaller than 1%. The simulation results indicated that the CC-M2PLM can be used for diagnostic measurement and that its cognitive diagnostic function is as good as that of the LLM. Finally, two empirical examples of diagnostic measurement were given to demonstrate applications and implications of the CC-M2PLM.
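
The two classification indices used in the simulation can be sketched as follows. This is an illustration under the usual definitions (ACCR as the per-attribute proportion of correctly classified students, PCCR as the proportion of students whose whole profile is recovered), which the abstract does not state explicitly; the function names and the toy profiles are hypothetical and are not data from the study.

```python
def accr(true_profiles, est_profiles):
    """Attribute correct classification ratio, one value per attribute:
    the share of students whose status on that attribute is recovered."""
    n = len(true_profiles)
    n_attr = len(true_profiles[0])
    return [
        sum(t[k] == e[k] for t, e in zip(true_profiles, est_profiles)) / n
        for k in range(n_attr)
    ]

def pccr(true_profiles, est_profiles):
    """Pattern correct classification ratio: the share of students whose
    entire attribute profile is recovered."""
    n = len(true_profiles)
    return sum(list(t) == list(e) for t, e in zip(true_profiles, est_profiles)) / n

# Hypothetical true and estimated profiles for four students, two attributes.
true_alpha = [[1, 0], [1, 1], [0, 0], [0, 1]]
est_alpha = [[1, 0], [1, 0], [0, 0], [1, 1]]

accr_values = accr(true_alpha, est_alpha)
pccr_value = pccr(true_alpha, est_alpha)
```

PCCR can never exceed the smallest ACCR, since a profile counts as correct only when every attribute in it is correct.
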