16346
Computational Vocal Arousal: An Objective Instrument for Studying Affect and Interaction in ASD

Friday, May 16, 2014
Meeting Room A601 & A602 (Marriott Marquis Atlanta)
D. K. Bone (1), C. C. Lee (1), M. P. Black (1), M. E. Williams (2), S. Lee (1), P. Levitt (3) and S. Narayanan (1); (1) Signal Analysis and Interpretation Lab (SAIL), University of Southern California, Los Angeles, CA; (2) University Center for Excellence in Developmental Disabilities, Keck School of Medicine of USC, Children’s Hospital Los Angeles, University of Southern California, Los Angeles, CA; (3) Children’s Hospital Los Angeles and Keck School of Medicine of USC, University of Southern California, Los Angeles, CA
Background: Acoustic features of speech are influenced by the speaker’s internal emotional state. In particular, emotional arousal is strongly reflected in certain prosodic cues: pitch, intensity, high-frequency energy, and speaking rate (although these can also be consciously altered for socio-communicative purposes). We have developed a robust, previously published tool that automatically captures relative changes in vocal arousal over time from a person’s speech data. The tool is motivated by a large body of empirical evidence and is further supported by psychomotor theory; for example, a person’s pitch is expected to increase when experiencing fear (high arousal) because the laryngeal folds tighten as a sympathetic response. Autism spectrum disorder (ASD) research regularly focuses on prosody and affect as behavioral markers. We therefore propose multiple uses of this tool for studying affect and interaction in ASD.

Objectives: To demonstrate the utility of the tool in ASD research, we conduct three experiments. In the first, we analyze how children with varying degrees of social-communicative difficulty and the interacting psychologists (clinicians) express affect during activities that impose different social loads on the child. In the second, we analyze how an interacting child and psychologist influence one another, as indicated by the temporal dynamics of their arousal. Finally, we propose the use of computational vocal arousal as a means for further analytics (e.g., data selection), which can then be coupled with other data sources (e.g., lexical information, i.e., the words spoken).

Methods: Audio-video data of semi-structured child-psychologist interactions from the Autism Diagnostic Observation Schedule (ADOS) Module 3 (verbal; N=28; 5.8-15.0 years of age) were collected. Data were first manually transcribed with utterance boundaries. A computational vocal arousal rating was then computed for each child and psychologist utterance based on pitch, vocal intensity, and the ratio of high-frequency energy, all relative to the speaker’s baseline. Social-communicative difficulty was defined by overall ADOS severity. Social load was rated on a 5-point scale for each ADOS Module 3 activity by seven clinicians experienced in autism assessment, and was then grouped into three levels (high, medium, and low). Correlation and Granger causality were used to quantify the mutual influence between participants’ arousal. Several measures of behavioral saliency based on vocal arousal were considered.
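
As a rough illustration of the two computational steps described in the Methods (a baseline-relative arousal rating per utterance, and a correlation-based measure of influence between speakers), the following minimal Python sketch may help. It is not the published tool: the equal-weight average of z-scored cues, the function names (zscores, arousal_scores, pearson, lagged_correlation), and the feature values are all assumptions made for illustration, and Granger causality testing is not shown.

```python
from statistics import mean, pstdev


def zscores(values):
    """Normalize one cue against the speaker's own mean and spread (assumed baseline)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def arousal_scores(utterances):
    """Average the baseline-normalized cues for each utterance.

    Each utterance is a tuple (pitch in Hz, intensity in dB, high-frequency
    energy ratio). Equal weighting of the cues is an assumption of this sketch.
    """
    columns = list(zip(*utterances))            # one tuple per cue
    normalized = [zscores(col) for col in columns]
    return [mean(cues) for cues in zip(*normalized)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leader, follower, lag=1):
    """Correlate the leader's arousal at turn t with the follower's at turn t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = min(len(leader), len(follower) - lag)
    if n < 2:
        raise ValueError("need at least two aligned turns")
    return pearson(leader[:n], follower[lag:lag + n])


# Made-up feature values for illustration only (not data from the study).
psychologist = [(180.0, 62.0, 0.22), (205.0, 66.0, 0.28), (190.0, 63.0, 0.24),
                (215.0, 68.0, 0.31), (185.0, 61.0, 0.21)]
child = [(250.0, 58.0, 0.18), (262.0, 60.0, 0.20), (285.0, 65.0, 0.27),
         (270.0, 61.0, 0.22), (290.0, 66.0, 0.29)]

psych_arousal = arousal_scores(psychologist)
child_arousal = arousal_scores(child)
print(lagged_correlation(psych_arousal, child_arousal, lag=1))
```

In this sketch, a higher lagged correlation would suggest that the child’s arousal tends to follow the psychologist’s arousal from the preceding turn. Correlation alone does not establish direction of influence, which is why a causality test is used alongside it.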

Results: Initial evidence suggests that the psychologist has higher relative vocal arousal during activities with high social demand when interacting with children who have greater social-communicative difficulties. We interpret this finding in light of conversational and turn-taking findings from previous studies of the same data. Additionally, we find that children with greater ASD severity are less responsive to changes in the psychologist’s arousal. Lastly, we provide examples of other potential uses of this vocal arousal measure.

Conclusions: Vocal arousal as obtained by this freely available tool (currently implemented in MATLAB) is a useful measure of expressed emotional arousal. In this work, we show its utility for analyzing arousal in relation to social load for both child and psychologist, for analyzing the mutual influence of arousal between speakers, and, in conjunction with other modalities, as a possible measure of behavioral saliency.