ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2026, Vol. 58 ›› Issue (4): 773-792. doi: 10.3724/SP.J.1041.2026.0773 cstr: 32110.14.2026.0773

• Research Report •

Data analysis and sample size planning for intensive longitudinal intervention studies using dynamic structural equation modeling

LIU Yue¹, HE Yueling¹, LIU Hongyun²,³

  1. Institute of Brain and Psychological Sciences, Sichuan Normal University, Chengdu 610066, China;
  2. Beijing Key Laboratory of Applied Experimental Psychology, Beijing Normal University, Beijing 100875, China;
  3. Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  • Received: 2025-02-26 Online: 2026-01-16 Published: 2026-04-25
  • Corresponding author: LIU Hongyun, E-mail: hyliu@bnu.edu.cn
  • Funding:
    National Natural Science Foundation of China (32200920, 32471145); Sichuan Philosophy and Social Science Foundation Young Talent Project (SCJJ25QN17)

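Abstract: Intensive longitudinal intervention studies offer high ecological validity and can deliver real-time, personalized interventions. However, the data analysis methods in common use do not adequately reflect the characteristics of intensive longitudinal data, and more advanced analytic models lack matching sample size planning methods, which severely limits wider adoption of this paradigm. For two typical intensive longitudinal intervention designs, the single-arm design and the randomized controlled design, this article uses dynamic structural equation modeling and simulation studies to plan sample sizes by combining statistical power with accuracy in effect size estimation. It then compares the two designs in terms of type I error rate and other criteria, and closes with recommendations for experimental design and sample size planning.

Keywords: intensive longitudinal intervention, dynamic structural equation modeling, power analysis, effect size, sample size planning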

Abstract: Intensive longitudinal interventions (ILIs) have emerged as powerful tools for understanding, treating and preventing mental and behavioral disorders. However, most existing ILI studies rely on traditional analytic methods such as ANOVA or linear mixed models, which overlook both individual differences and the inherent autocorrelation structure of time-series data. Moreover, intervention effects are often evaluated only through changes in the mean level of key variables (e.g., anxiety). This study demonstrates how dynamic structural equation modeling (DSEM) can be used to analyze ILI data and evaluate intervention effects across three dimensions—mean, autoregression, and intra-individual variability (IIV)—for two types of intervention designs: single-arm trial (SAT) and randomized controlled trial (RCT). We conducted two simulation studies to examine sample size recommendations for DSEM-based ILI studies, considering both statistical power and accuracy in parameter estimation (AIPE). In a third simulation, we compared the type I error rates of SAT and RCT designs when natural temporal changes occurred in the control group. Finally, we illustrated sample size planning using empirical data from a pre-ILI study targeting appearance anxiety reduction.
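As a reading aid, one generic way to write a two-level model with intervention effects on the mean, the autoregression, and the IIV is sketched below. This is an illustrative assumption, not necessarily the specification used in the article: the phase indicator $D_{it}$, the group indicator $G_i$, and all parameter symbols are introduced here for illustration only.

```latex
% Within-person level: person i, occasion t; D_{it} = 1 after the intervention starts (assumed coding)
\begin{aligned}
y_{it} &= \mu_i + \delta_{\mu,i} D_{it} + \tilde{y}_{it}, \\
\tilde{y}_{it} &= (\phi_i + \delta_{\phi,i} D_{it})\, \tilde{y}_{i,t-1} + \varepsilon_{it}, \\
\varepsilon_{it} &\sim N\!\bigl(0,\ \exp(\omega_i + \delta_{\omega,i} D_{it})\bigr).
\end{aligned}

% Between-person level, for k in {mu, phi, omega}
% SAT: the intervention effect is the average pre-to-post change gamma_{k0}
% RCT: the intervention effect is the group difference gamma_{k1} (G_i = 1 for the intervention group)
\begin{aligned}
\text{SAT:}\quad \delta_{k,i} &= \gamma_{k0} + u_{k,i}, \\
\text{RCT:}\quad \delta_{k,i} &= \gamma_{k0} + \gamma_{k1} G_i + u_{k,i}.
\end{aligned}
```

Under this sketch, $\gamma_{k0}$ in the RCT form absorbs any change shared by both groups over time, which is why a natural temporal change would not be mistaken for an intervention effect there.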
Simulation Studies 1 and 2 investigated power and AIPE across varying sample sizes, as well as the required sample size for both SAT and RCT designs. The intervention effects on the mean, autoregression, and IIV were fixed at a medium effect size. Two sample size factors were manipulated: the number of participants (N = 30, 60, 100, 150, 200, 300, 400) and the number of time points (T = 10, 20, 40, 60, 80, 100). The data-generating models and fitted models were identical, and the analyses were conducted in Mplus 8.10 with Bayesian estimation. Model performance was assessed in terms of convergence rate, power and AIPE for the intervention effects, and bias in the standard errors of the intervention effects. Simulation Study 3 assessed the type I error rate of both designs when changes in the control group were different from zero, indicating an average change due to time alone. Finally, the empirical study carried out sample size planning based on a pre-study that used an ILI design to reduce appearance anxiety.
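The Monte Carlo logic described above (simulate data for a given N and T, analyze each replication, then summarize power and interval width) can be illustrated with a deliberately simplified Python sketch. It does not fit a DSEM or call Mplus: it swaps in a per-person pre/post mean difference with a normal-approximation 95% interval for a single-arm design. All names and parameter values (`effect`, `phi`, `innov_sd`, `person_sd`, `effect_sd`, and the assumption of T time points per phase) are hypothetical and are not taken from the article.

```python
import math
import random
import statistics


def simulate_person(rng, t_points, mean, phi, innov_sd):
    """Generate one stationary AR(1) series of length t_points around `mean`."""
    # Draw the first deviation from the stationary distribution (no burn-in needed).
    dev = rng.gauss(0.0, innov_sd / math.sqrt(1.0 - phi ** 2))
    series = []
    for _ in range(t_points):
        series.append(mean + dev)
        dev = phi * dev + rng.gauss(0.0, innov_sd)
    return series


def one_replication(rng, n, t_points, effect, phi=0.3, innov_sd=1.0,
                    person_sd=0.5, effect_sd=0.5):
    """Return (significant, ci_width) for one simulated single-arm data set."""
    diffs = []
    for _ in range(n):
        base = rng.gauss(0.0, person_sd)       # person-specific baseline mean
        shift = rng.gauss(effect, effect_sd)   # person-specific mean change
        pre = simulate_person(rng, t_points, base, phi, innov_sd)
        post = simulate_person(rng, t_points, base + shift, phi, innov_sd)
        diffs.append(statistics.fmean(post) - statistics.fmean(pre))
    estimate = statistics.fmean(diffs)
    half_width = 1.96 * statistics.stdev(diffs) / math.sqrt(n)  # normal approximation
    return abs(estimate) > half_width, 2.0 * half_width


def plan(n, t_points, effect, reps=200, seed=1):
    """Estimate power and mean interval width for one (N, T) cell."""
    rng = random.Random(seed)
    results = [one_replication(rng, n, t_points, effect) for _ in range(reps)]
    power = sum(sig for sig, _ in results) / reps
    mean_width = statistics.fmean([width for _, width in results])
    return power, mean_width


if __name__ == "__main__":
    # Small illustrative grid; the article's grid is larger.
    for n in (30, 60, 100):
        for t_points in (10, 20):
            power, width = plan(n, t_points, effect=0.2)
            print(f"N={n:3d} T={t_points:3d} power={power:.2f} CI width={width:.2f}")
```

A power-based criterion would pick the smallest cell whose estimated power reaches 0.8, whereas an AIPE-based criterion would pick the smallest cell whose mean interval width falls below a target chosen by the researcher.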
The results were as follows. First, all simulation conditions achieved satisfactory convergence. Second, statistical power increased and credible interval width decreased with larger N or T. However, a minimum of 60 participants was required to achieve adequate power (i.e., 0.8). The relative bias in the intervention effects was generally small, with two exceptions: in the SAT design, the intervention effects on autoregression and IIV were underestimated when the number of time points was low (T = 10 or 20), and in the RCT design, the intervention effect on the mean was underestimated when the sample sizes at both levels were small (N = 30 or 60, T = 10). Bias in the standard errors was also negligible across conditions. Third, credible interval width contour plots were useful for determining sample size under both power- and AIPE-based criteria. Fourth, when natural mean-level changes occurred between the pre- and post-intervention phases, the SAT design exhibited inflated type I error rates compared with the RCT design, especially with larger samples.
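The fourth finding can be illustrated with a minimal Python sketch, again under strong simplifying assumptions rather than the article's DSEM setup. Each person is reduced to a single pre-to-post change score, the true intervention effect is zero, and a hypothetical natural change `drift` affects everyone. The SAT analysis tests the mean change against zero; the RCT analysis compares the change between two equal-sized groups. Function names, `drift`, `sd`, and the equal split of N are all illustrative assumptions.

```python
import math
import random
import statistics


def change_scores(rng, n, drift, sd=1.0):
    """Person-level pre-to-post change with no true intervention effect."""
    return [rng.gauss(drift, sd) for _ in range(n)]


def sat_rejects(rng, n, drift):
    """Single-arm test: is the mean change different from zero?"""
    d = change_scores(rng, n, drift)
    se = statistics.stdev(d) / math.sqrt(n)
    return abs(statistics.fmean(d)) > 1.96 * se


def rct_rejects(rng, n, drift):
    """Two-arm test: does the change differ between groups? Drift hits both arms."""
    half = n // 2
    treat = change_scores(rng, half, drift)
    ctrl = change_scores(rng, n - half, drift)
    se = math.sqrt(statistics.variance(treat) / len(treat)
                   + statistics.variance(ctrl) / len(ctrl))
    return abs(statistics.fmean(treat) - statistics.fmean(ctrl)) > 1.96 * se


def type1_rate(rejects, n, drift, reps=1000, seed=7):
    """Share of replications that falsely declare an intervention effect."""
    rng = random.Random(seed)
    return sum(rejects(rng, n, drift) for _ in range(reps)) / reps


if __name__ == "__main__":
    for n in (30, 100, 200):
        sat = type1_rate(sat_rejects, n, drift=0.2)
        rct = type1_rate(rct_rejects, n, drift=0.2)
        print(f"N={n:3d} SAT={sat:.2f} RCT={rct:.2f}")
```

In this sketch the SAT rejection rate grows with N because a larger sample detects the drift more reliably, while the RCT rate stays near the nominal level because the drift cancels in the group comparison.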
In conclusion, DSEM provides a flexible framework for analyzing ILI data by simultaneously capturing intervention effects on mean, autoregression, and IIV. Researchers should choose between SAT and RCT designs based on theoretical and practical considerations: RCTs offer stronger control for time-related confounds but require larger samples, whereas SATs are more suitable for small-sample or preliminary studies. For Monte Carlo-based sample size planning, accurate specification of true parameter values is critical; these should be derived from pre-studies, similar empirical data, or meta-analytic evidence whenever possible. When such information is unavailable, the procedures described in this study offer practical guidance.

Key words: intensive longitudinal intervention, dynamic structural equation modeling, power analysis, effect size, sample size planning