ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2017, Vol. 49 ›› Issue (12): 1604-1614. doi: 10.3724/SP.J.1041.2017.01604


 Multidimensional Testlet Response Model: An Applied Extension of the Multidimensional Random Coefficients Multinomial Logistic Model

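WEI Dan1; LIU Hongyun2; ZHANG Danhui1

  1.  (1 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China) (2 School of Psychology, Beijing Normal University, Beijing 100875, China)
  • Received: 2016-11-23 Online: 2017-10-25 Published: 2017-12-25
  • Corresponding author: ZHANG Danhui, E-mail: 09022@bnu.edu.cn
  • Funding:
     Supported by the 2013 Ministry of Education Youth Project of the National Education Sciences "Twelfth Five-Year" Plan (EBA130370).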

 Multidimensional Rasch Testlet Model: An Extension and Generalization of MRCMLM

 WEI Dan1; LIU Hongyun2; ZHANG Danhui1   

  1.  (1 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China) (2 School of Psychology, Beijing Normal University, Beijing 100875, China)
  • Received:2016-11-23 Online:2017-10-25 Published:2017-12-25
  • Contact: ZHANG Danhui, E-mail: 09022@bnu.edu.cn

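Summary:  This paper extends the multidimensional random coefficients multinomial logistic model (MRCMLM) to the multidimensional testlet setting, yielding a multidimensional testlet response model (MTRM) that accommodates both multidimensional target abilities and multidimensional testlet effects; the model is highly flexible and widely applicable. Two simulation studies and one empirical study examine the parameter estimation accuracy and applicability of the MTRM, as well as how it differs from the two-tier model. The results show that: (1) the correlation between ability dimensions and the number of item scoring categories are important factors affecting parameter estimation; (2) the MTRM estimates item parameters more accurately and more stably than the two-tier model, and estimates the size of testlet effects more accurately; (3) the MTRM fits better when within-item multidimensional testlets are taken into account, offers a wider choice of model structures for test analysis, and has considerable practical value.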

Keywords: multidimensional target ability, multidimensional testlet, two-tier model, MRCMLM, estimation accuracy

Abstract:  Testlets have been widely used in educational assessment. Ignoring testlet effects when analyzing response data has been shown to produce inaccurate estimates of reliability coefficients and latent trait standard errors, increased bias in item parameter estimates, inaccurate test equating, and failure to detect differential item functioning (DIF). As a result, researchers are increasingly interested in using testlet models instead of standard item response models. Different types of testlet models have been proposed to partial out the influence of testlet factors from the estimation of latent proficiency. However, most previous models target testlet effects in settings where (1) only one latent trait is measured and (2) each item belongs to only one testlet (between-item multidimensionality). As an alternative, the two-tier model can handle multidimensional latent traits, but it is usually used within the framework of confirmatory factor analysis.

This research extends the multidimensional random coefficients multinomial logistic model (MRCMLM) to the multidimensional testlet response model (MTRM), with the aim of accounting for within-item multidimensional testlets and multiple abilities within an IRT framework. Under different model constraints, the MTRM can model a variety of multidimensional test structures. Two studies based on simulated data and one empirical study based on large-scale math assessment data are discussed. Simulation study 1 considered different correlations among trait dimensions and compared the MRCMLM, which ignores testlet effects, with the MTRM in terms of estimation accuracy. Simulation study 2 compared the MTRM with a two-tier model for polytomous data in terms of item and person parameter estimation accuracy. The third study analyzed results from a real large-scale math test, in which three-dimensional math proficiencies were modeled and estimated. In total, seven testlets were identified. Some items loaded on more than one testlet, indicating within-item multidimensional testlet effects. Model fit and estimates were compared across three models: the MRCMLM, MTRM-1 (only uncrossed testlets considered), and MTRM-2 (all seven testlets considered). All analyses were conducted in ConQuest using Monte Carlo estimation. Estimation accuracy in the simulation studies was evaluated using bias, RMSE, and correlation coefficients between the true and estimated values.

Results of simulation 1 indicated that the MTRM produced more accurate item difficulty estimates for items within testlets than the MRCMLM, while both models produced accurate results for independent items. The recovery of item difficulties in the MTRM was also less influenced by the correlations among the latent traits. In addition, as the correlations between abilities decreased, the ability and item difficulty estimates became more biased if testlet effects were not modeled. In simulation 2, both the MTRM and the two-tier model accurately estimated item and person parameters. When testlet effects were present, estimates of both item and person parameters were more stable in the MTRM than in the two-tier model, indicating that the MTRM is not influenced by complex test structures or extreme response patterns. Results of the empirical data analysis showed that the MTRM with all seven testlets considered fit the data best. Applying the MTRM reduced misestimation of the reliability and standard error of each primary trait, even with moderate testlet effects and high correlations between ability dimensions.

The present study proposes a multidimensional testlet model that supplements previous testlet models by taking both within-item multidimensional testlets and multiple abilities into account. This new integrated model, the MTRM, was developed from the MRCMLM. It can be applied to a variety of educational tests in which complex testlets are embedded and multidimensional proficiencies are estimated, by specifying an appropriate ability-judge (score) matrix and testlet-judge (design) matrix. A promising attribute of the model is that its parameters can be readily estimated with the software ConQuest. We suggest that in many assessment contexts, ignoring testlet effects can add ambiguity to the interpretation of test scores, so such data should be fitted with appropriate testlet models.
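
The abstract evaluates estimation accuracy with bias, RMSE, and the correlation between true and estimated values. The following is a minimal Python sketch of those three criteria, not the authors' procedure: the function name `recovery_metrics` and the numbers are hypothetical, and the sketch assumes the metrics are computed across parameters within a single replication, whereas the study may have averaged over replications.

```python
import math

def recovery_metrics(true_values, estimates):
    """Return (bias, rmse, correlation) for paired true and estimated parameters.

    Assumption: metrics are taken across parameters in one replication;
    the abstract does not state the exact formulas used in the study.
    """
    if len(true_values) != len(estimates) or len(true_values) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(true_values)
    errors = [e - t for t, e in zip(true_values, estimates)]
    bias = sum(errors) / n                                  # mean signed error
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)   # root mean squared error
    mean_t = sum(true_values) / n
    mean_e = sum(estimates) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_values, estimates))
    var_t = sum((t - mean_t) ** 2 for t in true_values)
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    # Pearson correlation; undefined (division by zero) if either input is constant
    corr = cov / math.sqrt(var_t * var_e)
    return bias, rmse, corr

# Hypothetical item difficulties for illustration only (not data from the study)
true_b = [-1.0, -0.5, 0.0, 0.5, 1.0]
est_b = [-0.9, -0.6, 0.1, 0.4, 1.1]
bias, rmse, corr = recovery_metrics(true_b, est_b)
```

Bias near zero with a non-trivial RMSE, as in this toy input, means the errors cancel on average but individual estimates are still off, which is why the two criteria are reported together.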

Key words: multidimensional ability, multidimensional testlet, two-tier model, MRCMLM, estimation accuracy
