ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science, 2025, Vol. 33, Issue (9): 1558-1574. doi: 10.3724/SP.J.1042.2025.1558

• Conceptual Framework •

Optimization of data collection plans and improvement of data analysis methods for intensive longitudinal studies

LIU Hongyun1, DOU Jianing1, XU Yongze2

  1 Beijing Key Laboratory of Applied Experimental Psychology, National Demonstration Center for Experimental Psychology Education (Beijing Normal University), Faculty of Psychology, Beijing Normal University, Beijing 100875, China
  2 Department of Psychology, Faculty of Arts and Sciences, Beijing Normal University at Zhuhai, Zhuhai 519085, China
  • Received: 2024-03-26 Online: 2025-09-15 Published: 2025-06-26
  • Contact: XU Yongze E-mail: yzxu@bnu.edu.cn

Abstract:

Intensive longitudinal studies (ILSs) have recently become increasingly popular in fields such as psychology, medicine, and health sciences. With the advantages of low recall bias and high ecological validity, these studies help researchers gain further insight into the dynamic processes and complex interplay of individual states. However, research has shown that increasing assessment intensity can lead to increased participant burden, poorer compliance, reduced intra-individual variability in state variables, altered relations between variables, and more careless responses. Therefore, designing an ILS requires a reasonable trade-off between the goal of collecting more information and the risk of high assessment intensity.

Inspired by the idea of planned missing design (PMD) in cross-sectional studies, this research aims to improve data quality and study efficiency in intensive longitudinal studies by optimizing data collection plans and improving missing data handling. Focusing on ILSs with PMDs, this research will conduct three methodological studies and one applied study within the framework of dynamic structural equation modeling (DSEM). In Study 1, we will first design multiple schemes for different types of PMDs and then conduct Monte Carlo simulation studies to compare the performance of these schemes under various conditions. Finally, we will offer practical advice on selecting and applying PMDs in ILSs. In Study 2, we will first present a standard procedure for recommending sample sizes for PMDs. Then, we will introduce a surrogate modeling framework based on machine learning predictions to optimize sample size planning. Finally, we will develop a user-friendly and accessible application for power analysis and sample size calculation. In Study 3, we will first propose a new method for handling missing data that combines factored regression specification with Bayesian estimation. Then, we will conduct Monte Carlo simulation studies to compare the performance of the proposed method with that of three existing methods under different missingness mechanisms. Finally, we will develop a software package and offer practical recommendations on selecting missing data handling methods. In Study 4, we will conduct an empirical study to demonstrate how to develop appropriate measurement protocols under each type of PMD, how to determine the sample size for data with different missingness patterns using the optimization application, and how to appropriately handle missing data, perform data analyses, and interpret the results.
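To make the idea of a planned missing design in an intensive longitudinal setting concrete, the following is a minimal sketch and not one of the schemes evaluated in Study 1. It assumes a simple design in which each participant receives a random subset of items at every measurement occasion; the function name `planned_missing_mask` and all parameter values are hypothetical.

```python
import numpy as np

def planned_missing_mask(n_persons, n_occasions, n_items, items_per_occasion, seed=0):
    """Return a boolean array (persons, occasions, items); True = item is administered.

    Assumed design (illustrative only): at every person-occasion, a random
    subset of `items_per_occasion` items is administered and the rest are
    missing by design, so missingness is completely at random.
    """
    if not 0 < items_per_occasion <= n_items:
        raise ValueError("items_per_occasion must be between 1 and n_items")
    rng = np.random.default_rng(seed)
    # Draw random scores, convert them to ranks within each person-occasion,
    # and administer the items with the lowest ranks.
    scores = rng.random((n_persons, n_occasions, n_items))
    ranks = scores.argsort(axis=-1).argsort(axis=-1)
    return ranks < items_per_occasion

# Hypothetical study: 50 participants, 30 occasions, 6 items, 4 administered each time.
mask = planned_missing_mask(50, 30, 6, 4)
complete = np.random.default_rng(1).normal(size=mask.shape)  # placeholder responses
observed = np.where(mask, complete, np.nan)                  # NaN = missing by design
```

Under this assumed scheme, every prompt asks exactly four of the six items, so the planned missingness rate per item is one third while the number of occasions is unchanged.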

The innovations of this study will be primarily reflected in four key areas, forming a cohesive framework that integrates theoretical development, methodological integration, and practical application. First, the study will demonstrate clear innovation in the design of intensive longitudinal data collection schemes. It will introduce the concept of PMD as a systematic solution aimed at reducing participant burden and improving data quality from the outset of research design. This will not only enrich the theoretical foundation of data collection strategies but also enhance the efficiency and feasibility of empirical research. Second, in terms of interdisciplinary integration and methodological expansion, the study will address the computational challenges and optimization difficulties often encountered in sample size planning for ILSs. It will propose a novel technical approach that combines machine learning prediction models, search algorithms, and DSEM. This integration will offer new solutions for optimizing sample size planning while promoting the convergence of cutting-edge techniques across disciplines, thereby extending the methodological boundaries of traditional quantitative research in psychology. Third, in the domain of missing data handling, the study will aim to extend the factored regression specification approach and effectively integrate it with Bayesian estimation. This combined method will provide a more practical solution for intensive longitudinal designs with planned missingness that involve high missingness rates, complex missingness mechanisms, small sample sizes, and large numbers of variables. Finally, on the practical application level, the study will emphasize the evaluation of PMD effectiveness and its influencing factors, and will develop practical tools for sample size planning and missing data analysis. These tools will offer applied researchers new approaches for calculating sample size and statistical power, bridging the gap between advanced methodological development and real-world research needs.
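The general logic of surrogate-based sample size planning can be illustrated with a deliberately simplified sketch. It is not the framework proposed in Study 2: a one-sample z-test stands in for a DSEM analysis, and a probit-linear curve fitted by least squares stands in for a machine learning prediction model. The function names, effect size, grid, and number of replications are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist

def simulated_power(n, effect=0.3, reps=2000, seed=0):
    """Monte Carlo power of a two-sided one-sample z-test (alpha = .05, known SD = 1)."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(loc=effect, scale=1.0, size=(reps, n))
    z = samples.mean(axis=1) * np.sqrt(n)
    return float(np.mean(np.abs(z) > 1.959964))

def smallest_n(target=0.80, grid=(20, 40, 60, 80, 120, 160)):
    """Fit a surrogate power curve on a coarse grid, then search it for the smallest n."""
    nd = NormalDist()
    # Step 1: run the expensive simulation only at a few sample sizes.
    power = np.clip([simulated_power(n, seed=i) for i, n in enumerate(grid)], 0.001, 0.999)
    # Step 2: surrogate model, probit(power) assumed linear in sqrt(n).
    probit = np.array([nd.inv_cdf(p) for p in power])
    slope, intercept = np.polyfit(np.sqrt(grid), probit, deg=1)
    # Step 3: cheap exhaustive search over the surrogate instead of new simulations.
    for n in range(min(grid), max(grid) + 1):
        if nd.cdf(intercept + slope * np.sqrt(n)) >= target:
            return n
    return None  # target power not reached within the grid range

recommended_n = smallest_n()
```

The design choice shown here is that simulations are run at only six sample sizes, and the remaining candidates are evaluated on the fitted curve. For a computationally heavy model, this replacement of repeated simulation by prediction is where a surrogate would be expected to save time.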

In summary, this research will not only enrich the theoretical understanding of intensive longitudinal data collection and strengthen its methodological guidance, but will also provide a practical basis and hands-on tools for applying emerging technologies and cutting-edge methods in various fields. It will lay a solid foundation and open new avenues for innovation in future studies across related fields.

Key words: longitudinal data analysis, structural equation modeling, missing data analysis, planned missing design, sample size planning
