ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica ›› 2026, Vol. 58 ›› Issue (3): 558-568. doi: 10.3724/SP.J.1041.2026.0558

• Reports of Empirical Studies •

Factor retention in exploratory factor analysis based on LSTM

GUO Lei1,2, QIN Haijiang1   

  1. Faculty of Psychology, Southwest University;
  2. Southwest University Branch, Collaborative Innovation Center of Assessment toward Basic Education Quality, Chongqing 400715, China
  • Received: 2025-04-02  Published: 2026-03-25  Online: 2025-12-26

Abstract: Psychological research focuses on the latent traits of individuals, which requires clear operational definitions of the constructs of interest. Once a construct is defined, its dimensions and characteristics must be explored and described. Exploratory Factor Analysis (EFA) is a pivotal statistical method for identifying these latent dimensions and is widely used, especially in the development of psychological scales and instruments.
A critical step in EFA is determining the number of factors accurately. Underestimating the number of factors may omit theoretically significant psychological structures or sub-dimensions, which loses critical information, increases estimation error in the factor loadings, and reduces the accuracy of factor scores. Conversely, overestimating the number of factors may cause factor splitting, where the primary loadings of manifest variables are dispersed across multiple factors, weakening the association between the manifest variables and the intended factor. It may also yield an unduly complex model whose structures have limited practical or theoretical utility. To address these challenges, researchers have proposed various methods, including the Kaiser criterion (i.e., retaining factors with eigenvalues greater than one), the Empirical Kaiser Criterion, Parallel Analysis, the Hull method, Comparison Data, Factor Forest, and Comparison Data Forest. With the rapid advancement of machine learning, its application in EFA has begun to attract attention. This study introduces an approach that treats the eigenvalues as sequential data and uses a Long Short-Term Memory (LSTM) network to build a predictive model of the number of factors. The performance of the LSTM-based method was then evaluated through extensive simulations and an empirical study under diverse data conditions, demonstrating its robustness and applicability.
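As a rough illustration of two of the traditional methods named above, the following Python sketch applies the Kaiser criterion and a simple mean-eigenvalue variant of Parallel Analysis to the descending eigenvalue sequence of a correlation matrix. It is not the authors' code: the matrix `R`, the sample size of 200, and all function names are assumptions made for this example, and published Parallel Analysis implementations often compare against a 95th-percentile reference rather than the mean. The same descending eigenvalue sequence is the kind of input the study treats as sequential data; the LSTM itself is not reproduced here.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlation matrix of a list of rows (observations x variables)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    norms = [math.sqrt(sum(r[j] ** 2 for r in centered)) for j in range(p)]
    return [[sum(r[i] * r[j] for r in centered) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def eigenvalues(sym, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, descending."""
    a = [row[:] for row in sym]
    p = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if abs(a[i][j]) < 1e-15:
                    continue
                diff = a[j][j] - a[i][i]
                # Rotation angle that zeroes the (i, j) off-diagonal entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[i][j] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki - s * akj, s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik - s * ajk, s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def kaiser_count(eigs):
    """Kaiser criterion: number of eigenvalues greater than one."""
    return sum(1 for e in eigs if e > 1.0)


def parallel_analysis(observed, n_obs, n_sims=50, seed=0):
    """Count leading observed eigenvalues that exceed the mean eigenvalues
    of uncorrelated normal data with the same number of rows and columns."""
    rng = random.Random(seed)
    p = len(observed)
    sims = []
    for _ in range(n_sims):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n_obs)]
        sims.append(eigenvalues(correlation_matrix(noise)))
    reference = [sum(s[k] for s in sims) / n_sims for k in range(p)]
    count = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break
        count += 1
    return count


# Hypothetical correlation matrix: two blocks of three items, r = 0.6 within
# a block and 0 between blocks, so two factors underlie the six items.
R = [[1.0 if i == j else (0.6 if i // 3 == j // 3 else 0.0)
      for j in range(6)] for i in range(6)]

eigs = eigenvalues(R)
print([round(e, 3) for e in eigs])
print("Kaiser:", kaiser_count(eigs))
print("Parallel analysis:", parallel_analysis(eigs, n_obs=200))
```

In this idealized example both rules retain two factors, because the two leading eigenvalues (2.2 each) are well separated from the rest (0.4 each). The methods tend to disagree when that separation is less clear.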
The findings indicate that: (1) Hyperparameter tuning identified an optimal combination with which the LSTM model performed very well on accuracy, precision, and other evaluation metrics, demonstrating high classification capability. (2) In the simulation study, the LSTM model significantly outperformed Comparison Data Forest, the Empirical Kaiser Criterion, and Parallel Analysis under nearly all data conditions, with an average improvement in estimation accuracy of 48.50% and a maximum improvement of 171.09%.
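The abstract does not state how the improvement percentages were computed. Because accuracy itself cannot exceed 100%, a figure such as 171.09% is presumably a gain relative to a baseline method's accuracy. The short Python sketch below shows that arithmetic under this assumption. The function name and the two accuracy values are hypothetical and are not results from the study.

```python
def relative_improvement(new_acc, base_acc):
    """Percentage gain of new_acc over base_acc, relative to base_acc."""
    if base_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (new_acc - base_acc) / base_acc * 100.0


# Hypothetical accuracies, chosen only to illustrate the arithmetic.
baseline, proposed = 0.40, 0.90
gain = relative_improvement(proposed, baseline)
print(f"{gain:.2f}%")
```

Under this reading, a relative gain above 100% means the proposed method's accuracy is more than double the baseline's in that data condition.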
Furthermore, an empirical study was conducted using data from a parental psychological control scale administered to 987 high school students in one city in 2022. Both the traditional methods and the LSTM approach were applied to assess ecological validity. The results showed that the LSTM provided the most accurate estimate of the number of factors, while the Comparison Data Forest (CDF) method showed a marked tendency to overestimate. Overall, the LSTM approach proposed in this study has strong practical value and merits broader adoption. Researchers can use the R package LSTMfactors to apply the LSTM model trained in this study to their own empirical data.

Key words: exploratory factor analysis, LSTM, factor retention, deep learning
