Acta Psychologica Sinica    2012, Vol. 44 Issue (8) : 1124-1136     DOI:
Testing Measurement Equivalence of Categorical Items’ Threshold/Difficulty Parameters: A Comparison of CCFA and (M)IRT Approaches
LIU Hong-Yun; LI Chong; ZHANG Ping-Ping; LUO Fang
(1 School of Psychology, Beijing Normal University; Beijing Key Lab of Applied Experimental Psychology, Beijing 100875, China)
(2 Beijing New Oriental School, Learning & Development Center, Beijing 100080, China)
(3 National Key Laboratory of Cognitive Neuroscience and Learning, Beijing 100875, China)
Abstract  Multiple-group confirmatory factor analysis and differential item functioning (DIF) analysis based on unidimensional or multidimensional item response theory are the two most commonly used methods for assessing the measurement equivalence of categorical items. Unlike traditional linear factor analysis, multiple-group categorical confirmatory factor analysis (CCFA) appropriately models categorical measures through a threshold structure, and these thresholds are comparable to the difficulty parameters in (multidimensional) item response theory [(M)IRT]. In this study, we used the Monte Carlo method to compare multiple-group CCFA and (M)IRT in terms of their power to detect violations of measurement invariance (i.e., DIF). Moreover, given the restrictive assumptions of the traditional unidimensional IRT model, this study extended the DIF test method to the (M)IRT model. Simulation studies under both unidimensional and multidimensional conditions compared the DIFFTEST method, the IRT-LR method (for unidimensional scales), and the MIRT-MG method (for multidimensional scales) with respect to their power to detect a lack of invariance across groups. Results indicated that all three methods, namely DIFFTEST, IRT-LR, and MIRT-MG, showed reasonable power to identify measurement non-equivalence when the threshold difference was large. For unidimensional scales, the IRT-LR test showed greater power than DIFFTEST. For multidimensional scales, however, the results were not completely consistent across conditions. The power of MIRT-MG was higher than that of DIFFTEST when the test was long and the correlation between dimensions was high. In contrast, the power of DIFFTEST was higher than that of MIRT-MG when the test was short and the correlation between dimensions was low.
For a fixed number of noninvariant items, the power of the DIFFTEST method decreased as test length increased, whereas the power of the IRT-LR and MIRT-MG methods increased as test length increased. The number of respondents per group (sample size) was one of the most important factors affecting the performance of the three approaches: the power of DIFFTEST, IRT-LR, and MIRT-MG increased as the sample size increased. For a given number of observations, the power of all three methods was higher under the balanced design, in which the two groups were equal in size, than under the unbalanced design, in which the two groups were unequal in size. For the DIFFTEST method, the Type I error rate reached the nominal level of 5%, while the IRT-LR and MIRT-MG methods produced much lower Type I error rates.
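The notions of empirical power and Type I error rate used above can be illustrated with a minimal Monte Carlo sketch. It is not the study's procedure: as a deliberately simplified stand-in for DIFFTEST, IRT-LR, or MIRT-MG, it applies a likelihood-ratio (G^2) test to the endorsement rates of a single dichotomous item in two groups, without any latent-trait model. The function names, sample sizes, endorsement probabilities, and replication count are all assumptions chosen for illustration.

```python
import math
import random


def lr_test_p_value(successes_a, n_a, successes_b, n_b):
    """Likelihood-ratio (G^2) test that two groups share one endorsement rate.

    Simplified stand-in for a DIF test; no latent trait is modeled.
    """
    pooled = (successes_a + successes_b) / (n_a + n_b)
    g2 = 0.0
    for successes, n in ((successes_a, n_a), (successes_b, n_b)):
        cells = (
            (successes, n * pooled),
            (n - successes, n * (1.0 - pooled)),
        )
        for observed, expected in cells:
            if observed > 0:  # a cell with zero count contributes nothing
                g2 += 2.0 * observed * math.log(observed / expected)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(max(g2, 0.0) / 2.0))


def rejection_rate(p_ref, p_focal, n_ref, n_focal,
                   replications=500, alpha=0.05, seed=1):
    """Share of simulated data sets in which the test rejects at level alpha.

    With p_ref == p_focal this estimates the Type I error rate;
    with p_ref != p_focal it estimates power.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        s_ref = sum(rng.random() < p_ref for _ in range(n_ref))
        s_focal = sum(rng.random() < p_focal for _ in range(n_focal))
        if lr_test_p_value(s_ref, n_ref, s_focal, n_focal) < alpha:
            rejections += 1
    return rejections / replications


if __name__ == "__main__":
    # Hypothetical conditions: no group difference vs. a group difference.
    print("Type I error:", rejection_rate(0.5, 0.5, 200, 200))
    print("Power, n = 50 per group:", rejection_rate(0.5, 0.6, 50, 50))
    print("Power, n = 400 per group:", rejection_rate(0.5, 0.6, 400, 400))
```

In this toy setting the rejection rate under equal endorsement probabilities should stay near the nominal level, and the rejection rate under a true group difference should rise with the number of respondents per group, which mirrors the qualitative sample-size pattern reported in the abstract.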
Keywords  categorical data; confirmatory factor analysis; differential item functioning; (multidimensional) item response theory; measurement equivalence
Corresponding Authors: LUO Fang   
Issue Date: 28 August 2012
Cite this article:   
LIU Hong-Yun, LI Chong, ZHANG Ping-Ping, et al. Testing Measurement Equivalence of Categorical Items’ Threshold/Difficulty Parameters: A Comparison of CCFA and (M)IRT Approaches[J]. Acta Psychologica Sinica, 2012, 44(8): 1124-1136.
URL:  
http://journal.psych.ac.cn/xlxb/EN/     OR     http://journal.psych.ac.cn/xlxb/EN/Y2012/V44/I8/1124
Copyright © Acta Psychologica Sinica