Multidimensional Rasch Testlet Model: An Extension and Generalization of the MRCMLM
WEI Dan1; LIU Hongyun2; ZHANG Danhui1
(1 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China) (2 School of Psychology, Beijing Normal University, Beijing 100875, China)
Abstract: Testlets have been widely used in educational assessment. Ignoring testlet effects when analyzing response data has been shown to yield inaccurate estimates of reliability coefficients and latent trait standard errors, increased bias in item parameter estimates, inaccurate test equating, and failure to detect differential item functioning (DIF). Consequently, researchers are increasingly interested in using testlet models instead of standard item response models. Different types of testlet models have been proposed to partial out the influence of testlet factors from the estimation of latent proficiency. However, most previous models address testlet effects only for tests in which 1) a single latent trait is measured, and 2) each item belongs to only one testlet (between-item multidimensionality). As an alternative, the two-tier model can be used to deal with multidimensional latent traits. However, the two-tier model is usually used within the framework of confirmatory factor analysis. This research extends the multidimensional random coefficients multinomial logit model (MRCMLM) to the multidimensional testlet response model (MTRM), with the aim of accommodating within-item multidimensional testlets and multiple abilities within the IRT framework. With different model constraints, the MTRM can be used to model a variety of multidimensional test structures. Two studies based on simulated data and one empirical study based on large-scale math assessment data are reported. In simulation study 1, we varied the correlations among trait dimensions and compared the MRCMLM, which ignores testlet effects, with the MTRM in terms of estimation accuracy. In simulation study 2, the MTRM was compared with a two-tier model for polytomous data in terms of item and person parameter estimation accuracy. In the third study, which analyzed results from a real large-scale math test, proficiencies on three math dimensions were modeled and estimated.
In total, seven testlets were identified. Some items loaded on more than one testlet, indicating within-item multidimensional testlet effects. Model fit and parameter estimates of three models (the MRCMLM, MTRM-1 with only the uncrossed testlets considered, and MTRM-2 with all seven testlets considered) were compared. All analyses were conducted in ConQuest using Monte Carlo estimation. Estimation accuracy in the simulation studies was evaluated using bias, RMSE, and correlation coefficients between the true and estimated values. Results of simulation 1 indicated that the MTRM produced more accurate item difficulty estimates than the MRCMLM for items within testlets, while both models yielded accurate estimates for independent items. The recovery of item difficulties in the MTRM was also less influenced by the correlations among the latent traits. In addition, as the correlations between abilities decreased, the ability and item difficulty estimates became more biased if testlet effects were not modeled. In simulation 2, both the MTRM and the two-tier model accurately estimated item and person parameters. When testlet effects were present, estimates of both item and person parameters were more stable in the MTRM than in the two-tier model, indicating that the MTRM is not influenced by complex test structures or extreme response patterns. Results of the empirical data analysis showed that the MTRM with all seven testlets considered fit the data best. Applying the MTRM reduces misestimation of the reliability and standard error for each primary trait, even with moderate testlet effects and high correlations between ability dimensions. The present study proposes the multidimensional testlet model, supplementing previous testlet models by taking both within-item multidimensional testlets and multiple abilities into account. A new integrated model, the MTRM, was developed based on the MRCMLM.
This model can be applied to a variety of educational tests in which complex testlets are embedded and multidimensional proficiencies are estimated, by specifying an appropriate ability-judge (score) matrix and testlet-judge (design) matrix. A promising attribute of this model is that parameter estimation is readily carried out in the software ConQuest. We suggest that in many assessment contexts, ignoring testlet effects can add ambiguity to the interpretation of test scores; data should therefore be fitted with appropriate testlet models.
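The abstract reports that estimation accuracy in the simulation studies was judged by bias, RMSE, and the correlation between true and estimated values, but it does not give the formulas. The following is a minimal sketch under common assumptions: bias is the mean signed error, RMSE is the root of the mean squared error, and the correlation is Pearson's r, all computed over one replication. The function name `recovery_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def recovery_metrics(true_values, estimates):
    """Return (bias, rmse, correlation) for paired true and estimated parameters.

    Assumed definitions (not stated in the abstract):
      bias = mean(estimate - true)
      rmse = sqrt(mean((estimate - true)^2))
      correlation = Pearson's r between true values and estimates
    """
    if len(true_values) != len(estimates) or len(true_values) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(true_values)

    # Signed estimation errors, one per parameter.
    errors = [e - t for t, e in zip(true_values, estimates)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)

    # Pearson correlation computed from centered sums of products.
    mean_t = sum(true_values) / n
    mean_e = sum(estimates) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_values, estimates))
    var_t = sum((t - mean_t) ** 2 for t in true_values)
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    if var_t == 0 or var_e == 0:
        raise ValueError("correlation is undefined when either sequence is constant")
    corr = cov / math.sqrt(var_t * var_e)
    return bias, rmse, corr


# Hypothetical item difficulties: every estimate is 0.1 logits too high.
true_b = [-1.0, 0.0, 1.0, 2.0]
est_b = [-0.9, 0.1, 1.1, 2.1]
bias, rmse, corr = recovery_metrics(true_b, est_b)
print(f"bias={bias:.3f}, rmse={rmse:.3f}, r={corr:.3f}")
```

In a simulation with many replications, these quantities would typically be averaged across replications. The example also shows why the three indices are usually reported together: a constant shift leaves the correlation at 1 while bias and RMSE are both nonzero.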