ISSN 1671-3710
CN 11-4766/R
Sponsor: Institute of Psychology, Chinese Academy of Sciences
Publisher: Science Press

Advances in Psychological Science ›› 2022, Vol. 30 ›› Issue (8): 1682-1691. doi: 10.3724/SP.J.1042.2022.01682

• Review of research hotspots in psychological statistical methods in China

Two decades of test reliability research in China in the new century

WEN Zhonglin1, CHEN Hongxi1, FANG Jie2, YE Baojuan3, CAI Baozhen1

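  1. School of Psychology & Center for Studies of Psychological Application, South China Normal University, Guangzhou 510631
  2. Institute of New Development & Department of Applied Psychology, Guangdong University of Finance & Economics, Guangzhou 510320
  3. School of Psychology & Center of Mental Health Education and Research, Jiangxi Normal University, Nanchang 330022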
  • Received: 2021-12-29 Publication date: 2022-08-15 Release date: 2022-06-23
  • Corresponding author: WEN Zhonglin E-mail: wenzl@scnu.edu.cn
  • Funding:
    National Natural Science Foundation of China (32171091)

Research on test reliability in China’s mainland from 2001 to 2020

WEN Zhonglin1, CHEN Hongxi1, FANG Jie2, YE Baojuan3, CAI Baozhen1

  1. School of Psychology & Center for Studies of Psychological Application, South China Normal University, Guangzhou 510631, China
  2. Institute of New Development & Department of Applied Psychology, Guangdong University of Finance & Economics, Guangzhou 510320, China
  3. School of Psychology & Center of Mental Health Education and Research, Jiangxi Normal University, Nanchang 330022, China
  • Received: 2021-12-29 Online: 2022-08-15 Published: 2022-06-23
  • Contact: WEN Zhonglin E-mail: wenzl@scnu.edu.cn

Abstract (from the Chinese):

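With the application of confirmatory factor analysis models, reliability research has entered a new stage of development. In the first two decades of the new century, research on test reliability in China has followed three main lines. The first is the development of reliability based on confirmatory factor models, including the homogeneity coefficient, composite reliability, and maximum reliability. The second is the expansion of data types, including the reliability of two-level and longitudinal data. The third is the expansion of the uses of reliability, such as rater reliability and coder reliability. For an ordinary test (one in which the measurement errors of the items are uncorrelated), if coefficient α is high enough, the reliability is high enough; otherwise, use composite reliability. If the composite reliability of every variable in a statistical model is very high (above 0.95), modeling with manifest variables gives results that differ little from modeling with latent variables; otherwise, latent variable modeling is preferable.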

Keywords: reliability, coefficient α, homogeneity coefficient, composite reliability, interval estimation

Abstract:

With the application of confirmatory factor analysis, research on reliability has entered a new stage. In the first two decades of the 21st century, the studies on test reliability (including point estimation and interval estimation) in China’s mainland show three main lines of development.

The first line is the shift from research centered on coefficient α to reliability research based on confirmatory factor models, including the homogeneity coefficient, composite reliability, maximum reliability, single-indicator reliability, and the reliability of scores on the whole item set. Studies have shown that coefficient α is still useful. In most cases, coefficient α is a lower bound of the reliability of the composite score (total or average score), so as long as coefficient α is high enough, the test reliability is at least as high. However, coefficient α cannot be used to measure the homogeneity or the internal consistency of a test. The homogeneity coefficient based on the bi-factor model can be used to measure the homogeneity of a multidimensional scale, and composite reliability can be used to measure internal consistency (if consistency is understood as consistency within each dimension). Furthermore, the Delta method can be used to estimate confidence intervals for the various reliability coefficients.
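The lower-bound relationship between coefficient α and composite reliability can be illustrated with a small sketch. It assumes a one-factor model with uncorrelated item errors and uses the textbook formulas for both coefficients; the function names and the example loadings are hypothetical and are not taken from the article.

```python
def implied_cov(loadings, error_vars, factor_var=1.0):
    """Model-implied item covariance matrix for a one-factor model
    with uncorrelated errors: factor_var * l_i * l_j, plus the error
    variance on the diagonal."""
    k = len(loadings)
    return [
        [
            factor_var * loadings[i] * loadings[j]
            + (error_vars[i] if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]


def cronbach_alpha(cov):
    """Coefficient alpha computed from an item covariance matrix."""
    k = len(cov)
    total_var = sum(sum(row) for row in cov)       # variance of the total score
    item_var_sum = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


def composite_reliability(loadings, error_vars, factor_var=1.0):
    """Composite reliability of the total score under a one-factor model."""
    true_var = factor_var * sum(loadings) ** 2
    return true_var / (true_var + sum(error_vars))


# Hypothetical standardized loadings (illustrative values only).
loadings = [0.9, 0.7, 0.5, 0.3]
error_vars = [1.0 - l ** 2 for l in loadings]

alpha = cronbach_alpha(implied_cov(loadings, error_vars))
omega = composite_reliability(loadings, error_vars)

# With equal loadings the two coefficients coincide.
equal_loadings = [0.7, 0.7, 0.7, 0.7]
equal_errors = [1.0 - l ** 2 for l in equal_loadings]
alpha_equal = cronbach_alpha(implied_cov(equal_loadings, equal_errors))
omega_equal = composite_reliability(equal_loadings, equal_errors)
```

With unequal loadings, `alpha` falls below `omega`; with equal loadings the two values agree. In practice the loadings and error variances would come from a fitted confirmatory factor model rather than being typed in by hand.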

The second line is the expansion of the types of data collected by scales (or questionnaires), from single-level data to multilevel and longitudinal data. Whether a scale is unidimensional or multidimensional, a multilevel confirmatory factor model is recommended for calculating the reliability of multilevel data. For longitudinal data, the test reliability developed on the basis of the linear mixed model is recommended; longitudinal data can also be treated as a special case of two-level data for reliability analysis.

The third line is the extended use of reliability, involving rater reliability, coder reliability, attribute-level classification consistency in cognitive diagnostic assessment, and the reliability of difference scores. In addition, research on reliability generalization and reliability meta-analysis has appeared.

For an ordinary test whose item errors can reasonably be assumed to be uncorrelated, the following procedure for reliability analysis is recommended. When coefficient α is high enough, report coefficient α; otherwise, calculate the composite reliability on the basis of the factor model. If the composite reliability is high enough, report the composite reliability; otherwise, the test reliability is considered unacceptable.
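The recommended procedure can be sketched as a short decision function. The abstract does not say what counts as "high enough", so the `threshold` parameter and its default of 0.70 are assumptions made for illustration, as are the function name and the returned labels.

```python
def recommend_reliability_report(alpha, composite_reliability=None, threshold=0.70):
    """Return (decision, value) following the two-step procedure.

    threshold is an assumed cutoff for "high enough"; the source
    does not specify a value.
    """
    if alpha >= threshold:
        # Step 1: alpha is high enough, so report it.
        return ("report_alpha", alpha)
    if composite_reliability is None:
        # Step 2 needs the composite reliability from a factor model.
        return ("compute_composite_reliability", None)
    if composite_reliability >= threshold:
        return ("report_composite_reliability", composite_reliability)
    # Neither coefficient reaches the cutoff.
    return ("unacceptable", composite_reliability)


decision_high_alpha = recommend_reliability_report(0.85)
decision_needs_cr = recommend_reliability_report(0.62)
decision_cr_ok = recommend_reliability_report(0.62, composite_reliability=0.74)
decision_bad = recommend_reliability_report(0.55, composite_reliability=0.60)
```

The function only applies when the uncorrelated-errors assumption is reasonable for the test at hand; it does not check that assumption.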

If the composite reliability of every variable in a statistical model is very high (over 0.95), modeling with composite scores does not differ much from modeling with latent variables. Otherwise, it is better to use latent variable modeling.
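One way to see why very high reliability makes the two modeling approaches nearly equivalent is the classical attenuation formula, under which the correlation between two observed composite scores equals the latent correlation multiplied by the square root of the product of their reliabilities. The sketch below is an illustration under that classical assumption, not the article's own analysis; the function name and example values are hypothetical.

```python
import math


def attenuated_correlation(latent_corr, reliability_x, reliability_y):
    """Expected correlation between two composite scores under the
    classical attenuation formula."""
    return latent_corr * math.sqrt(reliability_x * reliability_y)


latent_corr = 0.50  # assumed latent correlation, for illustration

# Both reliabilities at 0.95: the observed correlation stays close to 0.50.
observed_high = attenuated_correlation(latent_corr, 0.95, 0.95)

# Both reliabilities at 0.70: the observed correlation is clearly smaller.
observed_low = attenuated_correlation(latent_corr, 0.70, 0.70)
```

With reliabilities of 0.95 the observed correlation is 95% of the latent one, whereas with reliabilities of 0.70 it drops to 70%, which is where latent variable modeling starts to matter.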

Key words: reliability, coefficient α, homogeneity coefficient, composite reliability, interval estimation

CLC number: