Advances in Psychological Science    2020, Vol. 28 Issue (8) : 1392-1408     DOI: 10.3724/SP.J.1042.2020.01392
Research Method
Data aggregation adequacy testing in multilevel research: A critical literature review and preliminary solutions to key issues
ZHU Haiteng
Department of Military and Ideological Basic Education, PLA Army Academy of Artillery and Air Defense, Hefei 230031, China
Abstract  Shared unit property constructs are ubiquitous in multilevel organizational research, and the most common way to measure them is to aggregate the ratings of several unit members to the unit level. Data aggregation adequacy testing (DAAT) is the statistical hurdle that ensures the validity and representativeness of the aggregated scores. Well-established DAAT indicators include the within-group agreement index, rWG, and the within-group reliability indices, ICC(1) and ICC(2). Nonetheless, some key issues remain open to debate, for instance the relative merits of the two families of indicators, the choice of null distribution and the data screening decision for rWG, and appropriate cut-off values. To address these questions, the current research first conducted a content analysis of 166 studies adopting the DAAT procedure published in 9 Chinese journals in the fields of management and psychology since 2014, with 85 studies from the Journal of Applied Psychology as a comparison. Common problems in routine DAAT practice were identified, and the following suggestions were proposed: (1) Disentangle and differentiate the roles of the DAAT indicators; specifically, rWG should be used as the exclusive indicator of aggregation adequacy, whereas ICC(1) and ICC(2) should be regarded as indices of validity and reliability, respectively. (2) Make prudent and justifiable decisions in choosing null distributions when calculating rWG and in excluding groups with low within-group agreement. (3) Apply more reasonable and moderately flexible cut-off values instead of arbitrary, rough rules of thumb. Last but not least, researchers should always prioritize theoretical considerations in framework building and DAAT, and avoid disproportionate dependence on statistical results.
Keywords  multilevel research; shared unit property; aggregation; within-group agreement; within-group reliability
PACS:  B841  
Issue Date: 28 June 2020
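The three indices named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article. It assumes the textbook definitions: single-item rWG computed against a uniform null distribution with negative values truncated to zero, and ICC(1) and ICC(2) computed from one-way ANOVA mean squares with equal group sizes. The function names `rwg_uniform` and `icc_one_way` and the example ratings are hypothetical.

```python
from statistics import mean, variance


def rwg_uniform(ratings, n_options):
    """Single-item rWG for one group, assuming a uniform null distribution.

    ratings   : the group members' scores on one item
    n_options : number of response options on the scale (e.g. 5)
    """
    # Variance expected if raters answered at random (uniform null).
    expected_var = (n_options ** 2 - 1) / 12.0
    # Observed within-group variance (sample variance, n - 1 denominator).
    observed_var = variance(ratings)
    # Assumption: negative values are truncated to 0, a common convention.
    return max(0.0, 1.0 - observed_var / expected_var)


def icc_one_way(groups):
    """ICC(1) and ICC(2) from one-way ANOVA mean squares.

    groups : list of lists of scores, one inner list per group.
    Assumption: all groups have the same size k.
    """
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    n_groups = len(groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group and within-group mean squares.
    msb = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n_groups - 1)
    msw = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_groups * (k - 1))
    icc1 = (msb - msw) / (msb + (k - 1) * msw)
    icc2 = (msb - msw) / msb
    return icc1, icc2


# Hypothetical ratings: three teams of four members on a 5-point item.
teams = [[4, 4, 5, 4], [2, 3, 2, 2], [5, 5, 4, 5]]
agreement = [rwg_uniform(team, n_options=5) for team in teams]
icc1, icc2 = icc_one_way(teams)
```

rWG is computed separately for each group, which is why it can be used to screen individual groups, whereas ICC(1) and ICC(2) are single values for the whole sample. Other null distributions, such as a skewed one, would only change `expected_var`.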
Cite this article:   
ZHU Haiteng. Data aggregation adequacy testing in multilevel research: A critical literature review and preliminary solutions to key issues[J]. Advances in Psychological Science, 2020, 28(8): 1392-1408.