ISSN 1671-3710
CN 11-4766/R

Advances in Psychological Science (心理科学进展), 2023, Vol. 31, Issue 4: 507-518. doi: 10.3724/SP.J.1042.2023.00507

• Research Methods •


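方俊燕 (FANG Junyan)1, 温忠麟 (WEN Zhonglin)2

  1. School of Leisure Sports and Management, Guangzhou Sport University, Guangzhou 510500
  2. School of Psychology/Center for Studies of Psychological Application, South China Normal University, Guangzhou 510631
  • Received: 2022-06-21 Published: 2023-04-15 Online: 2022-12-30
  • Corresponding author: 温忠麟 (WEN Zhonglin)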

The endogeneity issue in longitudinal research: Sources and solutions

FANG Junyan1, WEN Zhonglin2

  1. School of Leisure Sports and Management, Guangzhou Sport University, Guangzhou 510500, China
  2. School of Psychology/Center for Studies of Psychological Application, South China Normal University, Guangzhou 510631, China
  • Received: 2022-06-21 Published: 2023-04-15 Online: 2022-12-30
  • Contact: WEN Zhonglin


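Compared with cross-sectional research, longitudinal research is more likely to involve several sources of endogeneity at once. Bivariate longitudinal studies have played an important role in causal analysis in psychology, yet the endogeneity issues in them have not received due attention, which may compromise the accuracy of inferences. The sources of endogeneity in longitudinal research depend on the model and mainly include omitted variables, variable selection and sample selection, measurement error in explanatory variables, dynamic panels, and reciprocal relations between variables. Taking the CLPM, a representative longitudinal model, as an example, this paper demonstrates the impact of endogeneity and discusses the feasibility of addressing it by modeling with instrumental variables within the original model. The aim is to draw psychological researchers' attention to endogeneity in longitudinal research and to help them make better use of longitudinal models for causal analysis.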

Key words: endogeneity, longitudinal research, cross-lagged panel model, instrumental variables


Longitudinal cross-lagged models are widely used to analyze causal relations in the behavioral and psychological sciences, and the cross-lagged panel model (CLPM) is among the most common. The CLPM and related panel data models usually contain two kinds of regression relations: (a) autoregressive effects, from a variable's prior state (an earlier time point) to its current state (the present time point), and (b) cross-lagged effects, from one variable's prior state to another variable's current state. These effects between the two constructs provide the basis for judging their diachronic causation. Unlike a single regression model, a CLPM consists of a set of interlinked regression equations, which makes it more susceptible to endogeneity-related problems.
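As a sketch of the two kinds of regression relations just described, a two-variable, first-order CLPM can be written as below. The symbols are assumptions introduced here for illustration and do not appear in the abstract: $x_{it}$ and $y_{it}$ are the two constructs for person $i$ at wave $t$, $\beta_{xx}$ and $\beta_{yy}$ are autoregressive effects, $\beta_{xy}$ and $\beta_{yx}$ are cross-lagged effects, and $u_{it}$ and $v_{it}$ are residuals. Intercepts are omitted for brevity.

```latex
\begin{aligned}
x_{it} &= \beta_{xx}\, x_{i,t-1} + \beta_{xy}\, y_{i,t-1} + u_{it}, \\
y_{it} &= \beta_{yy}\, y_{i,t-1} + \beta_{yx}\, x_{i,t-1} + v_{it}.
\end{aligned}
```

In this notation, endogeneity arises whenever a lagged predictor such as $x_{i,t-1}$ is correlated with a residual such as $v_{it}$.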
Endogeneity refers to situations in which an explanatory variable is correlated with the residual (error) of its regression equation. With commonly used estimators, it is likely to bias estimates upward or downward, which makes it a critical concern whenever regression models are applied to observational data to support causal claims. The CLPM determines diachronic causation from two kinds of regression effects (autoregressive and cross-lagged paths) and is therefore vulnerable to endogeneity, yet this issue has received little attention and calls for systematic analysis.
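The definition above can be illustrated with a small simulation. This is a minimal sketch with made-up data, not the authors' analysis: the variable names, the true effect of 0.5, and the unit variances are all assumptions chosen for illustration. An omitted variable `z` drives both the explanatory variable and the outcome, so it ends up in the residual and ordinary least squares (OLS) overstates the effect. With these settings the OLS slope converges to about 1.0 instead of 0.5.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


def simulate_omitted_variable(n=100_000, true_effect=0.5, seed=0):
    """Simulate data in which an omitted variable makes x endogenous."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                        # omitted variable (hypothetical)
    x = z + rng.normal(size=n)                    # explanatory variable depends on z
    y = true_effect * x + z + rng.normal(size=n)  # z is absorbed into the residual
    residual = y - true_effect * x                # structural residual: z + noise
    return {
        "ols_estimate": ols_slope(x, y),
        "corr_x_residual": float(np.corrcoef(x, residual)[0, 1]),
    }


result = simulate_omitted_variable()
print(result)
```

The nonzero correlation between `x` and the residual is exactly the condition the definition names, and the gap between the OLS estimate and the true effect is the resulting bias.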
The current study focuses on issues related to endogeneity in the CLPM. We first clarify the main sources of endogeneity problems. Then, we systematically analyze different endogeneity issues in the CLPM. Lastly, we provide an empirical example to illustrate the use of the instrumental variables (IV) method in the CLPM.
The IV method originates from econometrics; an IV can be thought of as a predictor of a predictor. It is probably the most popular approach to dealing with endogeneity: researchers incorporate suitable IVs in the model to correct biased estimates and alleviate endogeneity concerns. Model-implied IVs (MIIVs) have been frequently used in empirical studies. A MIIV is an IV identified within the model itself, typically a chronologically prior observation of an exogenous variable in the model, which makes MIIVs a promising way to deal with endogeneity in longitudinal analyses. IV methods are currently underutilized in psychological research. This paper illustrates the use of MIIVs in the CLPM with an empirical example.
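The idea of using an earlier observation as an instrument can be sketched with a hand-rolled two-stage least squares (2SLS) estimator. This is a simplified, single-equation illustration on simulated data, not the MIIV-CLPM procedure used in the paper. The function names, coefficients, and the key assumption that the earlier observation `x_prev` is unrelated to the omitted variable `z` are all hypothetical.

```python
import numpy as np


def fit_line(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])


def two_stage_least_squares(instrument, x, y):
    """Point estimate of the effect of x on y, using one instrument."""
    # Stage 1: predict the endogenous regressor from the instrument.
    intercept, slope = fit_line(instrument, x)
    x_hat = intercept + slope * instrument
    # Stage 2: regress the outcome on the stage-1 prediction.
    _, effect = fit_line(x_hat, y)
    return effect


def simulate(n=100_000, true_effect=0.5, seed=1):
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)        # omitted variable (hypothetical)
    x_prev = rng.normal(size=n)   # earlier observation, assumed unrelated to z
    x_curr = 0.7 * x_prev + z + rng.normal(size=n)
    y = true_effect * x_curr + z + rng.normal(size=n)
    _, ols = fit_line(x_curr, y)
    iv = two_stage_least_squares(x_prev, x_curr, y)
    return {"ols": ols, "iv": iv}


estimates = simulate()
print(estimates)
```

Under these assumptions the IV estimate lands near the true effect of 0.5, while OLS stays biased upward. The sketch returns point estimates only: standard errors taken from a naive second-stage regression are not valid, so dedicated IV software is the safer choice in practice.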
To our knowledge, this is the first study to discuss endogeneity issues in the CLPM and to explore the performance of MIIVs in a longitudinal cross-lagged model. We find that the common sources of endogeneity in the CLPM are omitted variables, the dynamic panel structure, and reciprocal relations. Omitted variables are ubiquitous in empirical research, and the omitted variable problem affects the estimation of cross-lagged effects. In a dynamic panel, "dynamic" refers to the use of the prior outcome as a predictor; including the effect of this lagged outcome increases the probability that an explanatory variable is related to the residual. Bias can also arise from reciprocal relations, also known as feedback relations, simultaneity, or simultaneous causality.
We draw three conclusions. First, there are various types of endogeneity in the CLPM, including omitted variables, the dynamic panel structure, and reciprocal relations. Second, endogeneity can distort the estimation of cross-lagged effects in the CLPM. Third, the MIIV approach is a promising technique for tackling endogeneity in the CLPM. Future research could explore the performance of MIIVs in models extended from the CLPM, such as the Random Intercept CLPM (RI-CLPM), the Latent Curve Model with Structured Residuals (LCM-SR), and the Latent Change Score (LCS) model.
This paper reviews the main sources of endogeneity in the CLPM to raise applied researchers’ awareness of the endogeneity issues in longitudinal research. We recommend the MIIV-CLPM as a solution to deal with the endogeneity issue.

Key words: endogeneity, longitudinal research, cross-lagged panel model, instrumental variables