ISSN 1671-3710
CN 11-4766/R
Sponsor: Institute of Psychology, Chinese Academy of Sciences
Publisher: Science Press

Advances in Psychological Science ›› 2025, Vol. 33 ›› Issue (12): 2083-2104. doi: 10.3724/SP.J.1042.2025.2083

• Meta-Analysis •

Emotional speech recognition in children with autism spectrum disorder: A three-level meta-analysis of prosodic, semantic, and integrative deficits

CHEN Lijun1, JIN Yuexin1, ZENG Hanhan2, JIANG Xiaoliu3

  1. Department of Psychology, Faculty of Humanities and Social Sciences, Fuzhou University, Fuzhou 350108, China
  2. School of Sociology and Psychology, Central University of Finance and Economics, Beijing 100098, China
  3. Department of Social Psychology, School of Sociology, Nankai University, Tianjin 300350, China
  • Received: 2025-04-17   Online: 2025-12-15   Published: 2025-10-27

Abstract:

Vocal emotion recognition is a foundational component of effective social communication. However, children with Autism Spectrum Disorder (ASD) frequently experience difficulties in this domain. While prior research has demonstrated that individuals with ASD show impairments in processing emotional information conveyed through speech, there is little consensus on the precise source of this difficulty. Is the core deficit primarily located in prosodic processing, semantic comprehension, or in the integrative mechanisms required to reconcile multiple emotional cues? To address this critical question, the current study conducted a three-level meta-analysis encompassing 47 independent studies, yielding 93 effect sizes and a total sample of 3,142 participants. This meta-analytic framework allows for the modeling of nested data structures and provides a statistically rigorous estimation of overall deficits and their moderators across varied task designs, cultural contexts, and developmental stages.
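
For readers unfamiliar with the approach, a three-level meta-analytic model is commonly written as below. This is the standard textbook formulation, given here only as a sketch; the abstract does not state the exact specification the authors fitted, and the symbols are illustrative assumptions rather than the authors' notation.

```latex
% Standard three-level random-effects model (illustrative notation, not taken from the article).
% g_{ij}: the i-th effect size reported in study j.
\begin{align}
  g_{ij} &= \mu + u_{(3)j} + u_{(2)ij} + e_{ij}, \\
  u_{(3)j}  &\sim \mathcal{N}\!\left(0, \sigma^2_{\text{between}}\right)
            && \text{(level 3: variation between studies)}, \\
  u_{(2)ij} &\sim \mathcal{N}\!\left(0, \sigma^2_{\text{within}}\right)
            && \text{(level 2: variation among effect sizes within a study)}, \\
  e_{ij}    &\sim \mathcal{N}\!\left(0, v_{ij}\right)
            && \text{(level 1: known sampling variance of } g_{ij}\text{)}.
\end{align}
% A moderator x_{ij} (e.g., task type) enters as an added fixed effect:
% g_{ij} = \beta_0 + \beta_1 x_{ij} + u_{(3)j} + u_{(2)ij} + e_{ij}.
```

Under this formulation, effect sizes drawn from the same study share the term $u_{(3)j}$, which is how the model accounts for the dependence among the 93 effect sizes nested within 47 studies.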

The results revealed a significant overall deficit in the vocal emotion recognition performance of children with ASD compared to their typically developing peers (Hedges’ g=-0.71), reflecting a moderate to large effect size. Importantly, the magnitude of the deficit was systematically related to task type: the largest impairments occurred in integrative tasks requiring the simultaneous processing of semantic and prosodic cues (g=-0.90), followed by tasks emphasizing prosodic features alone (g=-0.61), and then by semantic-only tasks (g=-0.49). This pattern offers robust support for the Weak Central Coherence (WCC) theory, which posits that individuals with ASD exhibit a domain-general impairment in integrating multiple channels of information. Even tasks that appear unimodal on the surface—such as recognizing emotional tone or understanding emotional vocabulary—often entail the coordination of subtle subcomponents (e.g., combining pitch, intensity, and duration for prosody, or word meaning, syntax, and context for semantics), which can disproportionately tax integrative processing in ASD.
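
As a rough illustration of the effect size metric reported above, the following sketch computes Hedges' g and its approximate sampling variance from group summary statistics, using the standard small-sample correction. The function name `hedges_g` and the input numbers are hypothetical and chosen only for demonstration; they are not data from the included studies, and the article may have derived some effect sizes by other routes.

```python
import math


def hedges_g(mean_asd, sd_asd, n_asd, mean_td, sd_td, n_td):
    """Return (g, var_g) for an ASD vs. typically developing (TD) comparison.

    Negative g means the ASD group scored lower than the TD group.
    Hypothetical helper for illustration; not the authors' code.
    """
    df = n_asd + n_td - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(
        ((n_asd - 1) * sd_asd**2 + (n_td - 1) * sd_td**2) / df
    )
    # Cohen's d: standardized mean difference.
    d = (mean_asd - mean_td) / pooled_sd
    # Small-sample correction factor (common approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Approximate sampling variance of d, then scaled by j squared for g.
    var_d = (n_asd + n_td) / (n_asd * n_td) + d**2 / (2 * (n_asd + n_td))
    return g, j**2 * var_d


# Made-up accuracy scores: ASD mean 10 (SD 2, n 20) vs. TD mean 12 (SD 2, n 20).
g, var_g = hedges_g(10.0, 2.0, 20, 12.0, 2.0, 20)
print(round(g, 3), round(var_g, 3))
```

Each study-level g and its variance would then serve as one observation in the meta-analytic model.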

Further, a series of moderator analyses revealed key contextual and methodological factors shaping the extent of impairment. Cultural context was a significant moderator of performance differences (p=0.023). Children with ASD from high-context cultures (e.g., many East Asian cultures, where communication often relies on implicit tone and context) exhibited more pronounced deficits in emotional speech recognition (g=-1.03) than those from low-context cultures (e.g., North America or Western Europe, g=-0.60). This finding underscores the importance of considering the sociocultural environment in understanding emotional development in ASD and in designing culturally sensitive interventions. Material type also emerged as a powerful moderator (p<0.001). When semantic and prosodic cues were congruent, group differences between children with ASD and typically developing children were relatively small. However, when cues were conflicting or ambiguous, the performance of children with ASD deteriorated substantially. This suggests that the integrative challenge, especially in resolving conflicting information, may constitute the most critical barrier to successful emotion recognition in ASD. Contrary to the developmental delay hypothesis, age did not significantly moderate the observed deficits, suggesting that impairments in vocal emotion recognition are not simply a reflection of delayed maturation but may instead represent a stable and persistent neurocognitive feature of ASD. This calls for intervention strategies that move beyond age-compensation frameworks.

In addition, we observed that the influence of task type interacted significantly with cultural context (p=0.035) and with emotion type (p=0.019). Specifically, children with ASD from high-context cultures demonstrated the greatest difficulties in vocal emotion recognition during integrative tasks, showing the largest performance gap compared to typically developing children when identifying mixed and complex emotions in such tasks. Exploratory analyses further suggested a marginally significant interaction between task type and diagnostic subtype (p=0.060). For instance, children with Asperger’s syndrome exhibited relatively better performance on semantic tasks, likely reflecting their preserved linguistic capabilities. However, this advantage did not extend to prosodic or integrative conditions, where performance was as impaired as in other subtypes. This pattern further supports the idea that superficial language proficiency in ASD may mask deeper deficits in emotional inference, particularly in contexts requiring multimodal integration.

This meta-analysis provided a nuanced understanding of vocal emotion recognition deficits in children with ASD. The findings demonstrated that the impairment is robust across contexts, highlighting that the greatest difficulty lies in integrating multiple communicative cues rather than in processing prosody or semantics alone. By identifying integrative processing as a critical weakness—and noting the unique patterns associated with cultural context and ASD subtypes—the study provides a foundation for developing targeted, evidence-based strategies to enhance social communicative functioning in children with ASD.

Key words: autism spectrum disorder, emotional speech recognition, semantic cues, prosodic cues, meta-analysis
