On the reliability of point estimation of model parameters: Taking cognitive diagnostic models as an example

doi:10.3724/SP.J.1041.2023.01712

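Abstract

Summary:

In psychological research, an inappropriate model parameter estimation framework or convergence criterion seriously undermines the reliability of model parameter point estimates, and in turn the reliability of research conclusions. This study proposes a new MLE-EM-based framework for estimating CDM parameters, together with a new method for assessing convergence. Through a simulation study and an empirical data analysis, the performance of the new estimation framework and the new convergence assessment method was examined and compared with existing estimation frameworks and convergence assessment methods. The results show that the new framework and convergence criterion outperform the existing ones and can effectively improve the reliability of model parameter point estimates.

Keywords: parameter estimation, point estimation, convergence criterion, cognitive diagnostic model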

Abstract:

Cognitive diagnostic models (CDMs) are psychometric models that have received increasing attention in fields such as psychology, education, sociology, and biology. It has been argued that an inappropriate convergence criterion for maximum likelihood estimation using the expectation maximization (MLE-EM) algorithm could result in unpredictable and inaccurate model parameter estimates. Thus, inappropriate convergence criteria may yield unstable and misleading conclusions from the fitted CDMs. Although several convergence criteria have been developed, how to specify an appropriate convergence criterion for fitted CDMs remains an open question.

A comprehensive method for assessing convergence is proposed in this study. To minimize the influence of the model parameter estimation framework, a new framework adopting the multiple starting values strategy (mCDM) is introduced. To examine the performance of the convergence criterion for MLE-EM in CDMs, a simulation study under various conditions was conducted. Five convergence assessment methods were examined: the maximum absolute change in model parameters, the maximum absolute change in item endorsement probabilities and structural parameters, the absolute change in log-likelihood, the relative log-likelihood, and the comprehensive method. The data generating models were the saturated CDM and the hierarchical CDM. The number of items was set to J = 16 and 32. Three levels of sample sizes were considered: 500, 1000, and 4000. The three convergence tolerance value conditions were 10^-4, 10^-6, and 10^-8. The simulated response data were fitted by the saturated CDM using the mCDM and the R package GDINA. The maximum number of iterations was set to 50000.
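The convergence checks listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `convergence_checks`, its arguments, and the dictionary keys are hypothetical, the relative log-likelihood is assumed to be the absolute change divided by the absolute value of the previous log-likelihood, and, because the abstract does not define the comprehensive method, it is assumed here purely for illustration to require that all other checks pass at once.

```python
import numpy as np

def convergence_checks(params_old, params_new, probs_old, probs_new,
                       ll_old, ll_new, tol=1e-8):
    """Evaluate several EM stopping rules between two successive iterations.

    params_*: all model parameters as a 1-D array.
    probs_*:  item endorsement probabilities and structural parameters,
              concatenated into a 1-D array.
    ll_*:     log-likelihood values (assumed nonzero).
    """
    # Maximum absolute change in model parameters
    max_param = bool(np.max(np.abs(params_new - params_old)) < tol)
    # Maximum absolute change in endorsement probabilities / structural parameters
    max_prob = bool(np.max(np.abs(probs_new - probs_old)) < tol)
    # Absolute change in log-likelihood
    abs_ll = bool(abs(ll_new - ll_old) < tol)
    # Relative change in log-likelihood (assumed definition)
    rel_ll = bool(abs(ll_new - ll_old) / abs(ll_old) < tol)
    return {
        "max_param_change": max_param,
        "max_prob_change": max_prob,
        "abs_loglik": abs_ll,
        "rel_loglik": rel_ll,
        # Assumption: the comprehensive rule requires every check above to pass
        "comprehensive": max_param and max_prob and abs_ll and rel_ll,
    }
```

Under these assumptions, a log-likelihood that has flattened out while a parameter is still moving by 10^-4 per iteration would satisfy the log-likelihood checks at tolerance 10^-8 but not the parameter-based or combined checks.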

The simulation results suggest the following.

(1) The saturated CDM converged under all conditions. However, the actual number of iterations exceeded 30000 under some conditions, implying that when the predefined maximum number of iterations is below 30000, the MLE-EM algorithm may stop before it has converged.

(2) The model parameter estimation framework affected the performance of the convergence criteria. The performance of the convergence criteria under the mCDM framework was comparable or superior to that of the GDINA framework.

(3) Among the convergence tolerance values considered in this study, 10^-8 consistently performed best in attaining the maximum value of the log-likelihood, and 10^-4 performed worst. Compared with all other convergence assessment methods, the comprehensive method generally performed best, especially under the mCDM framework. The performance of the maximum absolute change in model parameters was similar to that of the comprehensive method, but this good performance was not consistent. By contrast, the relative log-likelihood performed worst under both the mCDM and GDINA frameworks.

The simulation results showed that the most appropriate convergence criterion for MLE-EM in CDMs was the comprehensive method with tolerance 10^-8 under the mCDM framework. The results from the real data analysis also demonstrated that the proposed comprehensive method and mCDM framework had good performance.
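The multiple starting values strategy mentioned in the abstract can be sketched as follows. This is a hedged illustration, not the mCDM code: a plain latent class model with binary items stands in for a CDM, the absolute change in log-likelihood is used as the only stopping rule, and the names `em_latent_class` and `multi_start` and all default values are hypothetical.

```python
import numpy as np

def em_latent_class(X, n_classes, rng, tol=1e-8, max_iter=50000):
    """EM for a latent class model with binary items (a stand-in for a CDM).

    Returns (loglik, pi, theta, n_iter); loglik is evaluated at the
    parameters that entered the final iteration.
    """
    n, J = X.shape
    pi = rng.dirichlet(np.ones(n_classes))          # random class proportions
    theta = rng.uniform(0.1, 0.9, (n_classes, J))   # random success probabilities
    ll_old = -np.inf
    for it in range(1, max_iter + 1):
        # E-step: log joint probability of each response pattern and class
        log_joint = (np.log(pi)[None, :]
                     + X @ np.log(theta).T
                     + (1 - X) @ np.log(1 - theta).T)
        m = log_joint.max(axis=1, keepdims=True)
        log_marg = m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))
        ll = float(log_marg.sum())
        post = np.exp(log_joint - log_marg)         # posterior class memberships
        # M-step: update proportions and item probabilities, kept off the boundary
        pi = np.clip(post.mean(axis=0), 1e-12, None)
        pi = pi / pi.sum()
        theta = (post.T @ X + 1e-12) / (post.sum(axis=0)[:, None] + 2e-12)
        theta = np.clip(theta, 1e-6, 1 - 1e-6)
        if abs(ll - ll_old) < tol:                  # absolute log-likelihood change
            break
        ll_old = ll
    return ll, pi, theta, it

def multi_start(X, n_classes, n_starts=10, seed=0, **em_kwargs):
    """Run EM from several random starts and keep the highest log-likelihood."""
    rng = np.random.default_rng(seed)
    fits = [em_latent_class(X, n_classes, rng, **em_kwargs) for _ in range(n_starts)]
    best = max(fits, key=lambda fit: fit[0])
    return best, [fit[0] for fit in fits]
```

Comparing the returned list of log-likelihoods across starts gives a rough indication of whether individual runs ended at different local optima; the best run is kept as the point estimate.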

Key words: model parameter estimation, point estimation, convergence criterion, cognitive diagnostic model

CLC number: B841

LIU Yanlou, CHEN Qishan, WANG Yiming, JIANG Xiaotong. (2023). On the reliability of point estimation of model parameters: Taking cognitive diagnostic models as an example. Acta Psychologica Sinica, 55(10), 1712-1728.

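Figures and tables (11)

Figure 1. A simple example of a local or global optimum for a single parameter

Figure 2. A simple example of a potential flaw of the log-likelihood difference convergence assessment method

Figure 3. The Q-matrix for J = 16 in the simulation study

Table 1. Simulation results for data generated from the saturated CDM, J = 16, N = 500

Convergence criterion	$LL_{Best}$	$LL_{mean}$	$LL_{max}$	$LL_{min}$	$LL_{sd}$	$t_{mean}$	$Itr_{mean}$	$Itr_{max}$	$\lambda_{out}$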
Gdp4	0	−4948.024	−4847.235	−5054.561	34.436	0.540	180	848	62
Gdp6	240	−4948.011	−4847.226	−5054.557	34.437	1.181	474	5752	61
Gdp8	280	−4948.011	−4847.226	−5054.557	34.437	2.068	901	32057	61
Gip4	0	−4948.027	−4847.234	−5054.561	34.438	0.507	164	730	59
Gip6	232	−4948.011	−4847.226	−5054.557	34.437	1.131	452	5680	61
Gip8	279	−4948.011	−4847.226	−5054.557	34.437	1.847	863	28030	61
Gll4	0	−4948.024	−4847.229	−5054.558	34.438	0.520	169	844	60
Gll6	48	−4948.017	−4847.226	−5054.557	34.431	0.858	329	1819	61
Gll8	273	−4948.011	−4847.226	−5054.557	34.437	1.217	531	6760	61
Gcomp4	0	−4948.022	−4847.229	−5054.558	34.436	0.566	190	848	62
Gcomp6	240	−4948.011	−4847.226	−5054.557	34.437	1.189	478	5752	61
Gcomp8	281	−4948.011	−4847.226	−5054.557	34.437	2.062	905	32057	61
mdp4	0	−4948.021	−4847.234	−5054.560	34.436	0.254	179	877	59
mdp6	360	−4948.008	−4847.226	−5054.556	34.437	0.461	479	5803	59
mdp8	498	−4948.008	−4847.226	−5054.556	34.437	0.735	953	32053	59
mip4	0	−4948.022	−4847.234	−5054.560	34.436	0.241	165	774	58
mip6	346	−4948.012	−4847.226	−5054.556	34.441	0.432	453	5730	59
mip8	496	−4948.008	−4847.226	−5054.556	34.437	0.690	912	28026	59
mll4	0	−4948.021	−4847.228	−5054.557	34.437	0.240	168	923	57
mll6	69	−4948.018	−4847.226	−5054.556	34.435	0.349	335	1978	59
mll8	485	−4948.008	−4847.226	−5054.556	34.437	0.495	585	6756	59
mcomp4	0	−4948.019	−4847.228	−5054.557	34.435	0.258	189	923	59
mcomp6	363	−4948.008	−4847.226	−5054.556	34.437	0.462	485	5803	59
mcomp8	500	−4948.008	−4847.226	−5054.556	34.437	0.734	958	32053	59

Table 2. Simulation results for data generated from the saturated CDM, J = 16, N = 1000 and 4000

N	Convergence criterion	$LL_{Best}$	$LL_{mean}$	$LL_{max}$	$LL_{min}$	$LL_{sd}$	$t_{mean}$	$Itr_{mean}$	$Itr_{max}$	$\lambda_{out}$
1000	Gdp6	457	−9929.201	−9801.836	−10105.797	49.742	1.057	291	1924	6
	Gdp8	487	−9929.201	−9801.836	−10105.797	49.742	1.660	508	6609	6
	Gll6	117	−9929.201	−9801.836	−10105.797	49.742	0.831	217	713	6
	Gll8	481	−9929.201	−9801.836	−10105.797	49.742	1.107	324	2512	6
	Gcomp6	457	−9929.201	−9801.836	−10105.797	49.742	1.066	295	1924	6
	Gcomp8	487	−9929.201	−9801.836	−10105.797	49.742	1.666	511	6609	6
	mdp6	460	−9929.201	−9801.836	−10105.797	49.742	0.468	288	1950	6
	mdp8	499	−9929.201	−9801.836	−10105.797	49.742	0.726	503	6628	6
	mll6	104	−9929.201	−9801.836	−10105.797	49.742	0.362	213	795	6
	mll8	494	−9929.201	−9801.836	−10105.797	49.742	0.489	323	2509	6
	mcomp6	461	−9929.201	−9801.836	−10105.797	49.742	0.471	291	1950	6
	mcomp8	500	−9929.201	−9801.836	−10105.797	49.742	0.728	507	6628	6
4000	Gdp6	469	−39831.617	−39539.020	−40187.183	102.360	2.588	223	321	0
	Gdp8	500	−39831.617	−39539.020	−40187.183	102.360	3.840	354	506	0
	Gll6	200	−39831.617	−39539.020	−40187.183	102.360	2.334	195	282	0
	Gll8	499	−39831.617	−39539.020	−40187.183	102.360	2.947	261	376	0
	Gcomp6	475	−39831.617	−39539.020	−40187.183	102.360	2.596	224	322	0
	Gcomp8	500	−39831.617	−39539.020	−40187.183	102.360	3.825	356	511	0
	mdp6	463	−39831.617	−39539.020	−40187.183	102.360	1.612	209	312	0
	mdp8	500	−39831.617	−39539.020	−40187.183	102.360	2.376	341	490	0
	mll6	177	−39831.617	−39539.020	−40187.183	102.360	1.443	182	257	0
	mll8	499	−39831.617	−39539.020	−40187.183	102.360	1.774	247	352	0
	mcomp6	471	−39831.617	−39539.020	−40187.183	102.360	1.619	211	312	0
	mcomp8	500	−39831.617	−39539.020	−40187.183	102.360	2.372	342	490	0

Table 3. Simulation results for data generated from the saturated CDM, J = 32

N	Convergence criterion	$LL_{Best}$	$LL_{mean}$	$LL_{max}$	$LL_{min}$	$LL_{sd}$	$t_{mean}$	$Itr_{mean}$	$Itr_{max}$
500	Gdp8	485	−9334.716	−9163.342	−9521.124	61.640	0.551	77	311
	Gll8	484	−9334.716	−9163.342	−9521.124	61.640	0.452	53	328
	Gcomp8	485	−9334.716	−9163.342	−9521.124	61.640	0.552	77	328
	mdp8	500	−9334.716	−9163.342	−9521.124	61.640	0.235	77	619
	mll8	499	−9334.716	−9163.342	−9521.124	61.640	0.203	54	609
	mcomp6	492	−9334.716	−9163.342	−9521.124	61.640	0.205	52	320
	mcomp8	500	−9334.716	−9163.342	−9521.124	61.640	0.235	77	619
1000	Gdp8	500	−18731.384	−18516.735	−19016.929	93.430	0.682	65	95
	Gll8	500	−18731.384	−18516.735	−19016.929	93.430	0.574	47	66
	Gcomp8	500	−18731.384	−18516.735	−19016.929	93.430	0.682	65	95
	mdp8	500	−18731.384	−18516.735	−19016.929	93.430	0.315	64	95
	mll8	500	−18731.384	−18516.735	−19016.929	93.430	0.266	46	66
	mcomp6	498	−18731.384	−18516.735	−19016.929	93.430	0.263	44	64
	mcomp8	500	−18731.384	−18516.735	−19016.929	93.430	0.315	64	95
4000	Gdp8	500	−75137.975	−74638.007	−75645.526	185.720	1.998	60	71
	Gll8	500	−75137.975	−74638.007	−75645.526	185.720	1.811	48	55
	Gcomp8	500	−75137.975	−74638.007	−75645.526	185.720	1.993	60	71
	mdp8	500	−75137.975	−74638.007	−75645.526	185.720	1.463	58	72
	mll8	500	−75137.975	−74638.007	−75645.526	185.720	1.210	46	56
	mcomp6	489	−75137.975	−74638.007	−75645.526	185.720	1.108	39	50
	mcomp8	500	−75137.975	−74638.007	−75645.526	185.720	1.457	58	72

Table 4. Simulation results for data generated from the HCDM, J = 16, N = 500

Convergence criterion	$LL_{Best}$	$LL_{mean}$	$LL_{max}$	$LL_{min}$	$LL_{sd}$	$t_{mean}$	$Itr_{mean}$	$Itr_{max}$	$\lambda_{out}$
Gdp4	1	−4775.050	−4640.212	−4885.902	39.080	0.560	184	870	585
Gdp6	22	−4775.034	−4640.210	−4885.901	39.076	1.276	500	5131	589
Gdp8	27	−4775.033	−4640.210	−4885.901	39.075	2.175	937	23818	591
Gip4	1	−4775.051	−4640.212	−4885.904	39.081	0.543	176	795	585
Gip6	21	−4775.034	−4640.210	−4885.901	39.075	1.231	485	5141	589
Gip8	27	−4775.033	−4640.210	−4885.901	39.075	2.110	922	23818	591
Gll4	0	−4775.048	−4640.214	−4885.902	39.080	0.516	161	714	584
Gll6	12	−4775.036	−4640.210	−4885.901	39.074	0.833	308	1461	588
Gll8	25	−4775.033	−4640.210	−4885.901	39.075	1.284	535	6486	589
Gcomp4	1	−4775.048	−4640.212	−4885.902	39.080	0.574	189	870	588
Gcomp6	22	−4775.034	−4640.210	−4885.901	39.076	1.279	501	5141	589
Gcomp8	27	−4775.033	−4640.210	−4885.901	39.075	2.179	939	23818	591
mdp4	4	−4774.975	−4639.179	−4885.899	39.103	0.221	185	739	486
mdp6	350	−4774.968	−4639.178	−4885.898	39.100	0.403	475	4339	484
mdp8	469	−4774.964	−4639.178	−4885.898	39.101	0.686	931	14029	483
mip4	4	−4774.975	−4639.179	−4885.901	39.103	0.214	179	735	490
mip6	343	−4774.968	−4639.178	−4885.898	39.100	0.387	464	4303	484
mip8	469	−4774.964	−4639.178	−4885.898	39.101	0.647	916	14029	483
mll4	0	−4774.980	−4639.184	−4885.898	39.102	0.201	161	910	473
mll6	72	−4774.969	−4639.178	−4885.898	39.100	0.292	312	1471	482
mll8	458	−4774.965	−4639.178	−4885.898	39.101	0.431	558	5066	486
mcomp4	4	−4774.974	−4639.179	−4885.898	39.103	0.223	191	910	481
mcomp6	351	−4774.968	−4639.178	−4885.898	39.100	0.404	479	4339	484
mcomp8	473	−4774.964	−4639.178	−4885.898	39.101	0.684	936	14029	483

Table 5. Simulation results for data generated from the HCDM, J = 16, N = 1000 and 4000

N	Convergence criterion	$LL_{Best}$	$LL_{mean}$	$LL_{max}$	$LL_{min}$	$LL_{sd}$	$t_{mean}$	$Itr_{mean}$	$Itr_{max}$	$\lambda_{out}$
1000	Gdp6	9	−9577.383	−9408.520	−9787.279	56.515	1.547	450	5095	491
	Gdp8	12	−9577.379	−9408.520	−9787.279	56.515	2.667	843	17947	494
	Gll6	3	−9577.389	−9408.520	−9787.279	56.510	1.054	285	1685	491
	Gll8	11	−9577.385	−9408.520	−9787.279	56.509	1.558	476	5786	495
	Gcomp6	9	−9577.383	−9408.520	−9787.279	56.515	1.546	451	5095	491
	Gcomp8	12	−9577.379	−9408.520	−9787.279	56.515	2.672	844	17947	494
	mdp6	366	−9577.314	−9408.518	−9787.279	56.508	0.635	467	5512	416
	mdp8	484	−9577.313	−9408.518	−9787.279	56.508	1.171	969	18411	411
	mll6	78	−9577.319	−9408.518	−9787.279	56.503	0.410	285	1686	409
	mll8	470	−9577.319	−9408.518	−9787.279	56.503	0.647	510	5843	415
	mcomp6	370	−9577.314	−9408.518	−9787.279	56.508	0.636	469	5512	416
	mcomp8	488	−9577.313	−9408.518	−9787.279	56.508	1.173	972	18411	411
4000	Gdp6	14	−38423.227	−38076.036	−38778.783	117.696	6.011	604	3920	424
	Gdp8	23	−38423.225	−38076.036	−38778.783	117.696	10.439	1132	12509	427
	Gll6	5	−38423.228	−38076.036	−38778.783	117.696	3.937	375	2066	425
	Gll8	22	−38423.226	−38076.036	−38778.783	117.697	6.492	698	4557	425
	Gcomp6	14	−38423.227	−38076.036	−38778.783	117.696	6.082	612	3920	425
	Gcomp8	23	−38423.225	−38076.036	−38778.783	117.696	10.473	1141	12509	427
	mdp6	276	−38423.146	−38076.034	−38778.782	117.698	3.437	595	3957	356
	mdp8	473	−38423.145	−38076.034	−38778.782	117.698	6.393	1233	12714	355
	mll6	28	−38423.146	−38076.034	−38778.782	117.697	2.253	374	2076	357
	mll8	460	−38423.145	−38076.034	−38778.782	117.698	3.831	733	4569	355
	mcomp6	276	−38423.146	−38076.034	−38778.782	117.698	3.472	602	3957	356
	mcomp8	478	−38423.145	−38076.034	−38778.782	117.698	6.424	1241	12714	355

Table 6. Simulation results for data generated from the HCDM, J = 32

N	Convergence criterion	$LL_{Best}$	$LL_{mean}$	$LL_{max}$	$LL_{min}$	$LL_{sd}$	$t_{mean}$	$Itr_{mean}$	$Itr_{max}$	$\lambda_{out}$
500	Gcomp8	83	−8944.542	−8746.172	−9101.048	63.686	0.823	143	4521	1072
	mdp8	416	−8944.529	−8746.349	−9100.836	63.714	0.309	162	3678	936
	mll8	417	−8944.529	−8746.349	−9100.836	63.714	0.241	109	1701	916
	mcomp6	390	−8944.531	−8746.349	−9100.836	63.713	0.240	101	1575	921
	mcomp8	417	−8944.529	−8746.349	−9100.836	63.714	0.310	163	3678	936
1000	Gcomp8	44	−17941.473	−17692.040	−18203.752	96.770	1.375	179	6530	998
	mdp8	456	−17941.322	−17692.038	−18205.384	96.780	0.607	218	12877	810
	mll8	452	−17941.322	−17692.038	−18205.384	96.780	0.411	124	1840	805
	mcomp6	408	−17941.322	−17692.038	−18205.384	96.780	0.420	115	3035	809
	mcomp8	456	−17941.322	−17692.038	−18205.384	96.780	0.610	219	12877	810
4000	Gcomp8	51	−71973.595	−71443.652	−72679.347	198.161	7.854	278	7908	913
	mdp8	443	−71973.490	−71443.649	−72679.344	198.184	5.795	299	6037	714
	mll8	443	−71973.494	−71443.649	−72679.344	198.185	3.729	191	1799	706
	mcomp6	373	−71973.496	−71443.649	−72679.344	198.184	3.470	164	1833	717
	mcomp8	449	−71973.490	−71443.649	−72679.344	198.184	5.896	303	6037	714

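Figure 4. Hierarchical relationships among the cognitive attributes of primary school fraction operations, as defined by Yuan et al. (2022)

Table 7. Results of the empirical data analysis

GDINA framework					mCDM framework
Convergence criterion	LL	t	Itr	$\lambda_{out}$	Convergence criterion	LL	t	Itr	$\lambda_{out}$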
Gdp4	−14307.9718	1.040	133	4	mdp4	−14248.5465	0.470	64	1
Gdp6	−14307.9717	1.328	190	4	mdp6	−14248.5463	0.718	111	1
Gdp8	−14307.9717	1.686	247	4	mdp8	−14248.5463	0.975	158	0
Gip4	−14307.9719	0.914	123	4	mip4	−14248.5469	0.423	58	0
Gip6	−14307.9717	1.299	181	4	mip6	−14248.5463	0.670	105	1
Gip8	−14307.9717	1.631	238	4	mip8	−14248.5463	0.925	152	1
Gll4	−14307.9720	0.891	119	4	mll4	−14248.5465	0.449	63	3
Gll6	−14307.9717	1.128	148	4	mll6	−14248.5463	0.570	87	1
Gll8	−14307.9717	1.245	177	4	mll8	−14248.5463	0.698	110	2
Grl4	−14351.6261	0.264	20	4	mrl4	−14257.7213	0.168	13	0
Grl6	−14308.0450	0.448	47	4	mrl6	−14248.6033	0.289	35	1
Grl8	−14307.9725	0.856	111	4	mrl8	−14248.5469	0.415	58	0
Gcomp4	−14307.9718	1.040	133	4	mcomp4	−14248.5465	0.470	64	1
Gcomp6	−14307.9717	1.328	190	4	mcomp6	−14248.5463	0.718	111	1
Gcomp8	−14307.9717	1.686	247	4	mcomp8	−14248.5463	0.975	158	0

References (44)

[1]	American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). Washington.
[2]	Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452-454. doi: 10.1038/533452a
[3]	Begley, C. G., & Ellis, L. M. (2012). Drug development: Raise standards for preclinical cancer research. Nature, 483(7391), 531-533. doi: 10.1038/483531a
[4]	Chiu, C. Y., Köhn, H. F., & Ma, W. (2023). Commentary on “Extending the Basic Local Independence Model to Polytomous Data” by Stefanutti, de Chiusole, Anselmi, and Spoto. Psychometrika, 88(2), 656-671. doi: 10.1007/s11336-022-09873-7
[5]	DeCarlo, T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-Matrix. Applied Psychological Measurement, 35(1), 8-26. doi: 10.1177/0146621610377081
[6]	DeCarlo, T. (2019). Insights from reparameterized DINA and beyond. In M. von Davier & Y.-S. Lee (Eds.), Handbook of diagnostic classification models (pp. 549-572). Springer.
[7]	de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130.
[8]	de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199. doi: 10.1007/s11336-011-9207-7
[9]	Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1-22. doi: 10.1111/rssb.1977.39.issue-1
[10]	Farrell, S., & Lewandowsky, S. (2018). Computational modeling of cognition and behavior. Cambridge University Press.
[11]	George, A. C., Robitzsch, A., Kiefer, T., Groß, J., & Ünlü, A. (2016). The R package CDM for cognitive diagnosis models. Journal of Statistical Software, 74(2), 1-24.
[12]	Gu, Y., & Xu, G. (2019). Learning attribute patterns in high-dimensional structured latent attribute models. Journal of Machine Learning Research, 20(115), 1-58.
[13]	Gu, Y., & Xu, G. (2020). Partial identifiability of restricted latent class models. The Annals of Statistics, 48(4), 2082-2107.
[14]	Hu, C., Wang, F., Guo, J., Song, M., Sui, J., & Peng, K. (2016). The replication crisis in psychological research. Advances in Psychological Science, 24(9), 1504-1518. doi: 10.3724/SP.J.1042.2016.01504
	[胡传鹏, 王非, 过继成思, 宋梦迪, 隋洁, 彭凯平. (2016). 心理学研究中的可重复性问题: 从危机到契机. 心理科学进展, 24(9), 1504-1518.] doi: 10.3724/SP.J.1042.2016.01504
[15]	Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124 pmid: 16060722
[16]	Ioannidis, J. P. A. (2008). Why most discovered true associations are inflated. Epidemiology, 19(5), 640-648. doi: 10.1097/EDE.0b013e31818131e7 pmid: 18633328
[17]	Khorramdel, L., Shin, H. J., & von Davier, M. (2019). GDM software mdltm including parallel EM algorithm. In M. von Davier & Y.-S. Lee (Eds.), Handbook of diagnostic classification models (pp. 603-628). Springer.
[18]	Liu, R. (2018). Misspecification of attribute structure in diagnostic measurement. Educational and Psychological Measurement, 78(4), 605-634. doi: 10.1177/0013164417702458 pmid: 30147119
[19]	Liu, Y. (2022). Standard errors and confidence intervals for cognitive diagnostic models: Parallel bootstrap methods. Acta Psychologica Sinica, 54(6), 703-724. doi: 10.3724/SP.J.1041.2022.00703
	[刘彦楼. (2022). 认知诊断模型的标准误与置信区间估计:并行自助法. 心理学报, 54(6), 703-724.] doi: 10.3724/SP.J.1041.2022.00703
[20]	Liu, Y., Tian, W., & Xin, T. (2016). An application of M₂ statistic to evaluate the fit of cognitive diagnostic models. Journal of Educational and Behavioral Statistics, 41(1), 3-26.
[21]	Liu, Y., Xin, T., & Jiang, Y. (2022). Structural parameter standard error estimation method in diagnostic classification models: Estimation and application. Multivariate Behavioral Research, 57(5), 784-803. doi: 10.1080/00273171.2021.1919048
[22]	Ma, W., & de la Torre, J. (2016). A sequential cognitive diagnosis model for polytomous responses. British Journal of Mathematical and Statistical Psychology, 69(3), 253-275. doi: 10.1111/bmsp.2016.69.issue-3
[23]	Ma, W., & de la Torre, J. (2020). GDINA: An R package for cognitive diagnosis modeling. Journal of Statistical Software, 93(14), 1-26.
[24]	Ma, W., de la Torre, J., Sorrel, M., & Jiang, Z. (2022). GDINA: The generalized DINA model framework. R package version 2.9.3. https://CRAN.R-project.org/package=GDINA
[25]	Ma, W., & Guo, W. (2019). Cognitive diagnosis models for multiple strategies. British Journal of Mathematical and Statistical Psychology, 72(2), 370-392. doi: 10.1111/bmsp.12155
[26]	Ma, W., & Jiang, Z. (2021). Estimating cognitive diagnosis models in small samples: Bayes modal estimation and monotonic constraints. Applied Psychological Measurement, 45(2), 95-111. doi: 10.1177/0146621620977681 pmid: 33627916
[27]	Paek, I., & Cai, L. (2013). A comparison of item parameter standard error estimation procedures for unidimensional and multidimensional item response theory modeling. Educational and Psychological Measurement, 74(1), 58-76. doi: 10.1177/0013164413500277
[28]	Paulsen, J., & Valdivia, D. S. (2022). Examining cognitive diagnostic modeling in classroom assessment conditions. The Journal of Experimental Education, 90(4), 916-933. doi: 10.1080/00220973.2021.1891008
[29]	Philipp, M., Strobl, C., de la Torre, J., & Zeileis, A. (2018). On the estimation of standard errors in cognitive diagnosis models. Journal of Educational and Behavioral Statistics, 43(1), 88-115.
[30]	Robitzsch, A., Kiefer, T., George, A. C., & Uenlue, A. (2022). CDM: Cognitive Diagnosis Modeling. R package version 8.2-6. http://CRAN.R-project.org/package=CDM
[31]	Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford Press.
[32]	Rupp, A. A., & van Rijn, P. W. (2018). GDINA and CDM packages in R. Measurement: Interdisciplinary Research and Perspectives, 16(1), 71-77. doi: 10.1080/15366367.2018.1437243
[33]	Sen, S., & Terzi, R. (2020). A comparison of software packages available for DINA model estimation. Applied Psychological Measurement, 44(2), 150-164. doi: 10.1177/0146621619843822
[34]	Sorrel, M. A., Olea, J., Abad, F. J., de la Torre, J., Aguado, D., & Lievens, F. (2016). Validity and reliability of situational judgment test scores: A new approach based on cognitive diagnosis models. Organizational Research Methods, 19(3), 506-532. doi: 10.1177/1094428116630065
[35]	Tajika, A., Ogawa, Y., Takeshima, N., Hayasaka, Y., & Furukawa, T. A. (2015). Replication and contradiction of highly cited research papers in psychiatry: 10-year follow-up. The British Journal of Psychiatry, 207(4), 357-362. doi: 10.1192/bjp.bp.113.143701
[36]	Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317-339. doi: 10.1007/s11336-013-9362-0 pmid: 24478021
[37]	Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(2), 37-50. doi: 10.1111/emip.2013.32.issue-2
[38]	Tian, W., Xin, T., & Kang, C. (2014). The data-augmentation techniques in item response modeling: Current approaches and new developments. Advances in Psychological Science, 22(6), 1036-1046. doi: 10.3724/SP.J.1042.2014.01036
	[田伟, 辛涛, 康春花. (2014). 项目反应理论中潜在心理特质“填补”的参数估计方法及其演变. 心理科学进展, 22(6), 1036-1046.] doi: 10.3724/SP.J.1042.2014.01036
[39]	von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61(2), 287-307. doi: 10.1348/000711007X193957
[40]	Wu, Z., Deloria-Knoll, M., & Zeger, S. L. (2017). Nested partially latent class models for dependent binary data; estimating disease etiology. Biostatistics, 18(2), 200-213. doi: 10.1093/biostatistics/kxw037 pmid: 27549120
[41]	Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report Series, 2008(1), i-18.
[42]	Yamaguchi, K. (2023). On the boundary problems in diagnostic classification models. Behaviormetrika, 50(1), 399-429. doi: 10.1007/s41237-022-00187-7
[43]	Yuan, L., Liu, Y., Chen, P., & Xin, T. (2022). Development of a new learning progression verification method based on the hierarchical diagnostic classification model: Taking grade 5 students’ fractional operations as an example. Educational Measurement: Issues and Practice, 41(3), 69-82. doi: 10.1111/emip.v41.3
[44]	Zeng, Z., Gu, Y., & Xu, G. (2023). A Tensor-EM method for large-scale latent class analysis with binary responses. Psychometrika, 88(2), 580-612. doi: 10.1007/s11336-022-09887-1