ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica ›› 2023, Vol. 55 ›› Issue (1): 94-105. doi: 10.3724/SP.J.1041.2023.00094

• Reports of Empirical Studies •

The influence of aging on the unmasking effect of F0 contour cue in Chinese speech recognition

WU Meihong

  1. School of Informatics, Xiamen University; Institute of Psychology, Xiamen University, Xiamen 361005, China
  • Published: 2023-01-25 Online: 2022-10-18
  • Contact: WU Meihong, E-mail: wmh@xmu.edu.cn

Abstract:

Older adults encounter difficulty in recognizing speech in environments where multiple people are talking. Fundamental frequency (F0) contour is very important for speech recognition in daily communication and can serve as a perceptual cue to segregate speech from background noise. The effect of dynamic F0 contour cues on the speech recognition of younger adults in noisy environments has been widely studied, but the influence on older adults’ speech recognition, especially in tonal languages like Chinese, is still unclear.

The present study explores whether older adults can benefit from dynamic F0 contour cues when recognizing Chinese speech under speech masking. Participants were 12 older adults (6 female and 6 male, mean age 68.6 years) and 12 younger adults (7 male and 5 female, aged 18 to 25 years), all with normal peripheral hearing. Figure 1 presents the group-mean hearing thresholds as a function of frequency. As shown in Figure 1, all participants had symmetrical hearing.

Speech recognition thresholds were measured in younger and older adults for sentences with a natural F0 contour and for the corresponding sentences with manipulated F0 contours (flattened vs. exaggerated), all presented under two-speaker anomalous speech masking. Participants were asked to repeat the entire target sentence aloud as accurately as they could immediately after all the stimuli ended in each trial. Figure 2 illustrates group-mean percent-correct word identification as a function of signal-to-noise ratio, along with the group-mean best-fitting psychometric function curves.
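To make the threshold measure concrete, the following is a minimal sketch of how a speech recognition threshold can be read off a psychometric function fitted to percent-correct scores. It rests on several assumptions not stated in this abstract: a two-parameter logistic function, a threshold defined as the signal-to-noise ratio giving 50% correct, and a simple least-squares fit on the logit scale. The function names and the data points are hypothetical and synthetic; they are not the study's data or its fitting procedure.

```python
import math


def logistic(snr_db, midpoint_db, slope):
    """Proportion correct predicted by a two-parameter logistic function."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def fit_logistic(snrs_db, proportions, eps=0.01):
    """Least-squares line fit on the logit scale; returns (midpoint_db, slope).

    Assumption: the threshold is the SNR at which 50% of words are correct,
    which is the midpoint of the logistic function.
    """
    # Clip proportions away from 0 and 1 so the logit stays finite.
    logits = []
    for p in proportions:
        p = min(max(p, eps), 1.0 - eps)
        logits.append(math.log(p / (1.0 - p)))

    n = len(snrs_db)
    mean_x = sum(snrs_db) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in snrs_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(snrs_db, logits))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # logit(p) = slope * (x - midpoint), so midpoint = -intercept / slope.
    return -intercept / slope, slope


# Synthetic illustration only: points generated from a known curve,
# NOT data from the study.
true_midpoint_db, true_slope = -4.0, 0.5
snrs = [-12.0, -8.0, -4.0, 0.0, 4.0]
props = [logistic(x, true_midpoint_db, true_slope) for x in snrs]

threshold_db, slope = fit_logistic(snrs, props)
print(f"estimated threshold: {threshold_db:.2f} dB SNR, slope: {slope:.2f}")
```

Under these assumptions, a lower threshold means that the listener tolerates more masking for the same level of word identification, which is why thresholds can be compared across F0 contour conditions.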

Speech recognition thresholds were analyzed using linear mixed-effects modeling. For the younger group, the main effect of F0 contour type was significant, F(2, 22) = 6.87, p = 0.005; for the older group, it was not, F(2, 22) = 0.50, p = 0.614. In the younger group, speech recognition performance was significantly poorer in both manipulated F0 contour conditions than in the natural F0 contour condition (flattened: β = 2.92, SE = 0.88, t = 3.33, p = 0.003; exaggerated: β = 2.70, SE = 0.88, t = 3.08, p = 0.005). In the older group, performance in the manipulated F0 contour conditions did not differ significantly from that in the natural F0 contour condition (flattened: β = 0.34, SE = 0.34, t = 0.99, p = 0.33; exaggerated: β = 0.16, SE = 0.34, t = 0.48, p = 0.637).
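As a quick arithmetic reading aid for the contrast estimates above, each t value in a fixed-effects table is the coefficient divided by its standard error. The sketch below recomputes that ratio from the rounded β and SE values reported in this abstract; small discrepancies from the reported t values are expected because the inputs are already rounded. The helper name `wald_t` and the row labels are hypothetical.

```python
def wald_t(beta, se):
    """t statistic for a fixed-effect coefficient: estimate / standard error."""
    return beta / se


# (label, beta, SE, reported t) copied from the abstract's rounded values.
contrasts = [
    ("younger, flattened vs. natural", 2.92, 0.88, 3.33),
    ("younger, exaggerated vs. natural", 2.70, 0.88, 3.08),
    ("older, flattened vs. natural", 0.34, 0.34, 0.99),
    ("older, exaggerated vs. natural", 0.16, 0.34, 0.48),
]

for label, beta, se, reported_t in contrasts:
    recomputed = wald_t(beta, se)
    print(f"{label}: recomputed t = {recomputed:.2f}, reported t = {reported_t:.2f}")
```

The estimates for the younger group are several times larger than those for the older group, while the older group's estimates are no larger than their standard errors.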

The younger and older groups thus showed different patterns in the effects of dynamic F0 contour cues. Under two-speaker masking, natural dynamic F0 contour cues helped younger adults resist informational masking more than flattened or exaggerated F0 contours did, whereas for older adults, the intelligibility of target sentences with a natural F0 contour was as poor as that of target sentences with a flattened or exaggerated F0 contour.

The availability of F0 contour cues strongly affects how much older adults benefit from dynamic F0 contour cues when recognizing speech under speech masking. There also appears to be an age-related reduction in the benefit from dynamic F0 contour cues in masked speech recognition; the F0 contour of Chinese sentences may therefore contribute more to speech recognition under speech masking for younger adults than for older adults.

Key words: Chinese speech recognition, fundamental frequency contour, age-related deficits, speech masking, unmasking