ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica, 2017, Vol. 49, Issue (5): 699-710. doi: 10.3724/SP.J.1041.2017.00699

• Article

Comparison of missing-data handling methods in LGM: the ML method and the Diggle-Kenward selection model

ZHANG Shanshan1; CHEN Nan2,3; LIU Hongyun2

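  1. (1 School of Labor Economics, Capital University of Economics and Business, Beijing 100070) (2 Beijing Key Laboratory of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing 100875) (3 IMS Market Research Consulting (Shanghai) Co., Ltd., Beijing 100005)
  • Received: 2016-05-27 Published: 2017-05-25 Released online: 2017-05-25
  • Corresponding author: LIU Hongyun, E-mail: hyliu@bnu.edu.cn
  • Funding:

    Supported by the National Natural Science Foundation of China (31571152), the Joint Construction Project of Beijing Municipality and Central Universities in Beijing (019-105812), the Advanced Innovation Center for Future Education, and the Fundamental Research Funds for the Central Universities.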

LGM-based analyses with missing data: Comparison between ML method and Diggle-Kenward selection model

ZHANG Shanshan1; CHEN Nan2,3; LIU Hongyun2   

  1. (1 School of Labor Economics, Capital University of Economics and Business, Beijing, 100070, China) (2 Beijing Key Laboratory of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, 100875, China) (3 QuintilesIMS Incorporated, Beijing, 100005, China)
  • Received:2016-05-27 Online:2017-05-25 Published:2017-05-25
  • Contact: LIU Hongyun, E-mail: hyliu@bnu.edu.cn

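Summary:

Missing data are very common in longitudinal research. Using a Monte Carlo simulation study, this article examines how the Diggle-Kenward selection model and the ML method, which rest on different assumptions, differ in the precision of their growth parameter estimates, taking into account sample size, missingness proportion, the distribution shape of the target variable, and the missingness mechanism. The results show that: (1) The missingness mechanism has a large effect on the MAR-based ML method; under the MNAR mechanism, the MAR-based ML method does not give robust estimates of the intercept mean and slope mean in the LGM. (2) The Diggle-Kenward selection model is more susceptible to the skewness of the target variable's distribution. Sample size interacts with skewness: when the sample is large, the effect of skewness weakens. The ML method, by contrast, is only slightly affected by skewness, and only under the MNAR mechanism.

Keywords: latent growth model, missing not at random mechanism, Diggle-Kenward selection model, maximum likelihood method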

Abstract:

In longitudinal studies, missing data are common. Data that are missing not at random (MNAR) may lead to biased parameter estimates and even distort the results of analyses. In this article, we used Monte Carlo simulation to compare two techniques for handling missing data that rest on different missingness mechanisms: the maximum likelihood (ML) approach, which assumes data are missing at random (MAR), and the Diggle-Kenward selection model, which assumes MNAR. Parameter estimates and standard errors from each method were contrasted under different model assumptions. Four potentially influential factors were considered: the proportion of dropout missingness, the sample size, the distribution shape (i.e., skewness and kurtosis), and the missingness mechanism. The results indicated that (1) the Diggle-Kenward selection model was less affected by the missingness mechanism than the ML approach. Under the MAR condition, the Diggle-Kenward selection model, although based on the MNAR mechanism, remained stable and provided estimates similar to those of the ML approach based on the MAR assumption. Under the MNAR condition, the ML approach differed little from the Diggle-Kenward selection model in the variances of the latent variables (σᵢ² and σₛ²) but showed a greater discrepancy in the means of the latent variables (μᵢ and μₛ). (2) The distribution shape had more impact on the Diggle-Kenward selection model. For the mean and variance of the intercept and the variance of the slope, sample size interacted significantly with the degrees of skewness and kurtosis: with large samples, the influence of distribution shape on estimation precision decreased. The ML approach was not easily affected by the distribution shape. (3) When fitting a growth curve model, the variances (σᵢ² and σₛ²) were influenced much more by the distribution shape (i.e., the degree of skewness and kurtosis) than the means of the latent variables (μᵢ and μₛ). (4) The dropout missingness proportion was the major factor affecting the precision of parameter estimation. A larger sample size improved estimation precision in most cases.

Key words: latent growth model, missing not at random, Diggle-Kenward selection model, maximum likelihood approach
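
The contrast between MAR and MNAR dropout described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the parameter values, the four-wave layout, and all function names are assumptions chosen for illustration. Dropout follows a Diggle-Kenward-style logistic model, logit P(drop at wave t) = b0 + b_prev·y(t-1) + b_curr·y(t), which is MAR when b_curr = 0 and MNAR otherwise. In place of full ML estimation of the LGM, it uses a simpler MAR-based stand-in: sequential regression on earlier waves, followed by a straight-line fit to the estimated wave means.

```python
import numpy as np

TIMES = np.arange(4.0)   # four equally spaced waves (assumed layout)
MU_I, MU_S = 1.0, 0.5    # true intercept and slope means (assumed values)


def simulate_lgm(n, rng, sd_i=1.0, sd_s=0.3, sd_e=1.0):
    """Linear growth data: y[i, t] = intercept_i + slope_i * t + e[i, t]."""
    intercept = rng.normal(MU_I, sd_i, size=n)
    slope = rng.normal(MU_S, sd_s, size=n)
    noise = rng.normal(0.0, sd_e, size=(n, TIMES.size))
    return intercept[:, None] + slope[:, None] * TIMES + noise


def apply_dropout(y, rng, b0, b_prev, b_curr):
    """Monotone dropout with a logistic hazard on previous/current outcome.

    b_curr == 0 gives MAR (depends only on the last observed value);
    b_curr != 0 gives MNAR (depends on the value that goes missing).
    """
    n, n_waves = y.shape
    y_obs = y.copy()
    active = np.ones(n, dtype=bool)
    for t in range(1, n_waves):
        logit = b0 + b_prev * y[:, t - 1] + b_curr * y[:, t]
        p_drop = 1.0 / (1.0 + np.exp(-logit))
        drops = active & (rng.random(n) < p_drop)
        y_obs[drops, t:] = np.nan      # once dropped, all later waves missing
        active &= ~drops
    return y_obs


def mar_wave_means(y_obs):
    """Estimate wave means by regressing each wave on all earlier waves.

    The regression is fitted on cases observed at wave t, and missing values
    are replaced by their predicted conditional means. This relies on MAR.
    """
    filled = y_obs.copy()
    n = filled.shape[0]
    for t in range(1, filled.shape[1]):
        seen = ~np.isnan(y_obs[:, t])
        design = np.column_stack([np.ones(n), filled[:, :t]])
        coef, *_ = np.linalg.lstsq(design[seen], y_obs[seen, t], rcond=None)
        filled[~seen, t] = design[~seen] @ coef
    return filled.mean(axis=0)


def growth_means(wave_means):
    """Fit a straight line to the wave means; returns (intercept, slope)."""
    slope, intercept = np.polyfit(TIMES, wave_means, deg=1)
    return intercept, slope


def run_demo(n=50_000, seed=0):
    rng = np.random.default_rng(seed)
    y = simulate_lgm(n, rng)
    mechanisms = {"MAR": (1.0, 0.0), "MNAR": (0.0, 1.0)}  # (b_prev, b_curr)
    results = {}
    for name, (b_prev, b_curr) in mechanisms.items():
        y_obs = apply_dropout(y, rng, b0=-3.5, b_prev=b_prev, b_curr=b_curr)
        results[name] = growth_means(mar_wave_means(y_obs))
    return results


if __name__ == "__main__":
    for name, (mu_i_hat, mu_s_hat) in run_demo().items():
        print(f"{name}: intercept mean {mu_i_hat:.3f}, slope mean {mu_s_hat:.3f}")
```

Under these assumed settings, the MAR-based estimator should recover the slope mean closely when dropout is MAR and underestimate it when dropout is MNAR, because participants with high current values leave the study and the regressions are fitted only on those who remain. A selection model such as Diggle-Kenward instead estimates the dropout equation jointly with the growth model, which this sketch does not attempt.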