Acta Psychologica Sinica  2017, Vol. 49 Issue (12): 1604-1614    DOI: 10.3724/SP.J.1041.2017.01604
 Multidimensional Testlet Response Model: An Applied Extension of the Multidimensional Random Coefficients Multinomial Logistic Model
 Multidimensional Rasch Testlet Model: An Extension and Generalization of MRCMLM
 WEI Dan1; LIU Hongyun2; ZHANG Danhui1
 (1 Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, Beijing 100875, China) (2 School of Psychology, Beijing Normal University, Beijing 100875, China)
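Summary  This paper extends the multidimensional random coefficients multinomial logistic model (MRCMLM) to the multidimensional testlet setting, yielding a multidimensional testlet response model (MTRM) that accommodates both multidimensional target abilities and multidimensional testlet effects. The model is highly flexible and widely applicable. Two simulation studies and one applied study examine the parameter estimation accuracy and applicability of the MTRM, as well as its differences from the two-tier model. The results show that: (1) the correlation between ability dimensions and the number of item score categories are important factors affecting parameter estimation; (2) the MTRM estimates item parameters more accurately and more stably than the two-tier model, and estimates the size of testlet effects more accurately; (3) the MTRM fits better when within-item multidimensional testlets are taken into account, offers a wider choice of model structures for test analysis, and has notable practical value.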
Keywords  multidimensional target abilities    multidimensional testlets    two-tier model    MRCMLM    estimation accuracy
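For orientation, the MRCMLM that the MTRM builds on is commonly written in the form below. This is a sketch of the general model as it usually appears in the literature, not a formula taken from this page; all symbols are assumptions introduced here for illustration: \(\boldsymbol{\theta}\) is the vector of latent traits, \(\boldsymbol{\xi}\) the vector of item parameters, \(\mathbf{b}_{ik}\) the row of the score matrix and \(\mathbf{a}_{ik}\) the row of the design matrix for category \(k\) of item \(i\), and \(K_i\) the number of categories of item \(i\).

```latex
% General MRCMLM category probability (symbols are illustrative assumptions)
\[
P(X_{ik} = 1 \mid \boldsymbol{\theta})
  = \frac{\exp\!\left(\mathbf{b}_{ik}^{\top}\boldsymbol{\theta}
        + \mathbf{a}_{ik}^{\top}\boldsymbol{\xi}\right)}
         {\sum_{h=1}^{K_i} \exp\!\left(\mathbf{b}_{ih}^{\top}\boldsymbol{\theta}
        + \mathbf{a}_{ih}^{\top}\boldsymbol{\xi}\right)}
\]
```

One common way to express testlet effects in this framework is to treat each testlet effect as an additional latent dimension, so that \(\boldsymbol{\theta}\) stacks the target abilities and the testlet effects and the score matrix gains one column per testlet. Whether the MTRM uses exactly this parameterization is an assumption here.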
Abstract: Testlets are widely used in educational assessment. Ignoring testlet effects when analyzing response data has been shown to produce inaccurate estimates of reliability coefficients and latent trait standard errors, increased bias in item parameter estimates, inaccurate test equating, and failure to detect differential item functioning (DIF). Researchers are therefore increasingly interested in using testlet models instead of standard item response models. Several types of testlet models have been proposed to partial out the influence of testlet factors from the estimation of latent proficiency. However, most previous models target testlet effects in settings where (1) only one latent trait is measured and (2) each item belongs to only one testlet (between-item multidimensionality). As an alternative, the two-tier model can handle multidimensional latent traits, but it is usually applied within the framework of confirmatory factor analysis. This research extends the multidimensional random coefficients multinomial logistic model (MRCMLM) to the multidimensional testlet response model (MTRM), with the aim of taking within-item multidimensional testlets and multiple abilities into account under the IRT framework. Under different model constraints, the MTRM can represent a variety of multidimensional test structures.
Two studies based on simulated data and one empirical study based on data from a large-scale math assessment are reported. Simulation study 1 varied the correlations among trait dimensions and compared the MRCMLM, which ignores testlet effects, with the MTRM in terms of estimation accuracy. Simulation study 2 compared the MTRM with a two-tier model for polytomous data in terms of item and person parameter estimation accuracy. The third study analyzed real large-scale math test results, modeling and estimating three proficiency dimensions in math. In total, seven testlets were identified. Some items loaded on more than one testlet, indicating within-item multidimensional testlet effects. Model fit and parameter estimates were compared across three models: the MRCMLM, MTRM-1 (only uncrossed testlets considered), and MTRM-2 (all seven testlets considered). All analyses were conducted in ConQuest using Monte Carlo estimation. Estimation accuracy in the simulation studies was evaluated using bias, RMSE, and correlation coefficients between the true and estimated values.
Results of simulation 1 indicated that the MTRM estimated item difficulties for items within testlets more accurately than the MRCMLM, while both models produced accurate results for independent items. The recovery of item difficulties in the MTRM was also less influenced by the correlations among the latent traits. In addition, as the correlations between abilities decreased, the ability and item difficulty estimates became more biased if testlet effects were not modeled. In simulation 2, both the MTRM and the two-tier model estimated item and person parameters accurately. When testlet effects were present, estimates of both item and person parameters were more stable in the MTRM than in the two-tier model, indicating that the MTRM is robust to complex test structures and extreme response patterns. Results of the empirical data analysis showed that the MTRM with all seven testlets considered fit the data best. Applying the MTRM reduces incorrect estimation of the reliability and standard error for each primary trait, even for moderate testlet effects and high correlations between ability dimensions.
The present study proposes a multidimensional testlet model that supplements previous testlet models by taking both within-item multidimensional testlets and multiple abilities into account. The new integrated model, the MTRM, was developed from the MRCMLM. It can be applied to a variety of educational tests in which complex testlets are embedded and multidimensional proficiencies are estimated, by specifying an appropriate ability-judge (score) matrix and testlet-judge (design) matrix. A promising attribute of the model is that its parameters are easily estimated with the software ConQuest. We suggest that in many assessment contexts, ignoring testlet effects can add ambiguity to the interpretation of test scores, so such data should be fitted with appropriate testlet models.
Key words  multidimensional ability    multidimensional testlet    two-tier model    MRCMLM    estimation accuracy
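The abstract evaluates estimation accuracy with bias, RMSE, and the correlation between true and estimated values. A minimal sketch of these three recovery indices for a single replication is given below. The function name `recovery_metrics` and the per-replication definitions are assumptions made for illustration; the study itself may average over replications or define the indices differently.

```python
import math


def recovery_metrics(true_values, estimates):
    """Return (bias, rmse, correlation) for one set of parameter estimates.

    Hypothetical helper, not the authors' code: bias is the mean signed
    error, RMSE the root mean squared error, and correlation the Pearson
    correlation between true and estimated values.
    """
    n = len(true_values)
    if n != len(estimates) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    # Signed estimation errors (estimate minus true value).
    errors = [est - true for true, est in zip(true_values, estimates)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)

    # Pearson correlation between true and estimated values.
    mean_true = sum(true_values) / n
    mean_est = sum(estimates) / n
    cov = sum((t - mean_true) * (e - mean_est)
              for t, e in zip(true_values, estimates))
    var_true = sum((t - mean_true) ** 2 for t in true_values)
    var_est = sum((e - mean_est) ** 2 for e in estimates)
    if var_true == 0 or var_est == 0:
        raise ValueError("correlation is undefined for constant input")
    corr = cov / math.sqrt(var_true * var_est)

    return bias, rmse, corr


if __name__ == "__main__":
    # Made-up item difficulties, used only to exercise the function.
    true_b = [-1.0, 0.0, 1.0]
    est_b = [-0.9, 0.1, 1.1]
    print(recovery_metrics(true_b, est_b))
```

In a simulation study these indices would typically be computed per replication and then averaged, separately for item parameters and person parameters.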
Received: 2016-11-23      Published: 2017-10-25
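Funding: Supported by the 2013 Ministry of Education Youth Project of the National Education Sciences Planning Program for the 12th Five-Year Plan (EBA130370).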
Corresponding author: ZHANG Danhui, E-mail:   
WEI Dan, LIU Hongyun, ZHANG Danhui.  Multidimensional Rasch Testlet Model: An Extension and Generalization of MRCMLM. Acta Psychologica Sinica, 2017, 49(12): 1604-1614.