Multimodal Emotion Recognition in Children with Autism Spectrum Disorder: Vocalizations Are More Informative Than Faces or Music
Objectives: To provide an initial comparison of emotion recognition skills in children with autism spectrum disorder (ASD) across three modalities: faces, vocalizations, and music.
Methods: Twenty-two children with ASD aged 8-14 years (M = 10.7, SD = 1.49) participated in this study. They completed emotion identification tasks for faces, vocalizations, and music, presented in counterbalanced order. Each modality included 8 happy, 8 sad, and 8 fearful emotional expressions drawn from validated stimulus sets, for a total of 72 stimuli. After each stimulus was presented on a computer screen for 1.5-2 seconds, the participants selected the verbal label that best described it from three options: happy, sad, or fearful.
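The design described above can be laid out as a short sketch. It is illustrative only: the abstract does not say how the counterbalancing was carried out, so cycling participants through all six modality orders is an assumption, and the names `ITEMS_PER_CELL` and `assignment` are hypothetical.

```python
from itertools import permutations, product

MODALITIES = ("face", "vocalization", "music")
EMOTIONS = ("happy", "sad", "fearful")
ITEMS_PER_CELL = 8      # 8 expressions per emotion within each modality
N_PARTICIPANTS = 22

# 3 modalities x 3 emotions x 8 items gives the 72 stimuli in the abstract.
cells = list(product(MODALITIES, EMOTIONS))
total_stimuli = len(cells) * ITEMS_PER_CELL

# With three response options, guessing yields 1/3 accuracy.
chance_accuracy = 1 / len(EMOTIONS)

# Assumption: full counterbalancing over the 6 possible modality orders,
# assigned by cycling through participants in enrolment order.
orders = list(permutations(MODALITIES))
assignment = {
    pid: orders[(pid - 1) % len(orders)]
    for pid in range(1, N_PARTICIPANTS + 1)
}
```

With 22 participants and six orders, the orders cannot be used equally often: four are used four times and two are used three times.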
Results: Friedman nonparametric repeated-measures tests revealed mean accuracy scores that were significantly above chance (range: 64-93%). Emotion recognition accuracy differed across modalities, p = .0001. Post hoc Wilcoxon signed-rank tests (Bonferroni-corrected threshold, p < .017) revealed that the participants identified emotions more accurately from vocalizations than from faces (p < .001) or music (p < .0001), whereas accuracy did not differ between faces and music (p = .072). Accuracy also differed across emotions, p = .031: the participants identified happy emotions more accurately than fear (p = .005). No differences in accuracy were found between happy and sad (p = .039) or between sad and fear (p = .65). A significant modality-by-emotion interaction was found, p < .0001. Post hoc analyses (Bonferroni-corrected threshold, p < .005) revealed that sadness was identified more accurately in vocalizations than in music (p = .001), and fear was identified more accurately in vocalizations than in faces (p = .001) or music (p < .001). No significant differences were found in the identification of happy emotions across modalities (p > .009), or between sad faces and vocalizations (p = .16), sad faces and music (p = .045), or fearful faces and music (p = .235).
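The Bonferroni logic behind the post hoc comparisons can be checked with a small sketch. It assumes a family-wise alpha of .05, which the abstract does not state. The interaction threshold is taken as reported, because the number of comparisons it was divided across is not given. P-values reported as inequalities are entered as upper bounds.

```python
FAMILYWISE_ALPHA = 0.05  # assumed; the abstract reports only corrected thresholds


def bonferroni_threshold(n_comparisons, alpha=FAMILYWISE_ALPHA):
    """Per-comparison threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_comparisons


def significant_pairs(p_values, threshold):
    """Return the comparisons whose p-value falls below the threshold."""
    return [pair for pair, p in p_values.items() if p < threshold]


# Three pairwise modality comparisons: 0.05 / 3 rounds to the reported .017.
modality_threshold = round(bonferroni_threshold(3), 3)
interaction_threshold = 0.005  # as reported in the abstract

modality_p = {
    ("vocalization", "face"): 0.001,    # reported as p < .001 (upper bound)
    ("vocalization", "music"): 0.0001,  # reported as p < .0001 (upper bound)
    ("face", "music"): 0.072,
}

interaction_p = {
    ("sad", "vocalization", "music"): 0.001,
    ("sad", "face", "vocalization"): 0.16,
    ("sad", "face", "music"): 0.045,
    ("fear", "vocalization", "face"): 0.001,
    ("fear", "vocalization", "music"): 0.001,  # reported as p < .001 (upper bound)
    ("fear", "face", "music"): 0.235,
}

modality_hits = significant_pairs(modality_p, modality_threshold)
interaction_hits = significant_pairs(interaction_p, interaction_threshold)
```

Under these thresholds, `modality_hits` contains only the two vocalization comparisons, and `interaction_hits` contains sad vocalization versus music plus both fear comparisons involving vocalizations. The sad faces versus music result (p = .045) falls below .05 but not below the corrected threshold, which is why it is reported as non-significant.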
Conclusions: Emotion identification differed significantly across modalities: the participants identified fear more accurately in vocalizations than in faces or music, and sadness more accurately in vocalizations than in music, but identified happy emotions equally accurately across all modalities. Contrary to expectations, these findings suggest that negative emotions conveyed through non-verbal vocalizations, rather than faces or music, may be easier to identify or more informative for children with ASD.