24524
Objective Acoustic-Prosodic and Turn-Taking Measures in Interactions with Children with Neurodevelopmental Disorders

Thursday, May 11, 2017: 12:00 PM-1:40 PM
Golden Gate Ballroom (Marriott Marquis Hotel)
D. K. Bone (1), S. L. Bishop (2), S. Lee (1) and S. Narayanan (1); (1) University of Southern California, Los Angeles, CA, (2) Psychiatry, University of California San Francisco, San Francisco, CA
Background:  Speech prosody, the manner in which a phrase is uttered to enhance meaning beyond the spoken words, plays a critical role in social reciprocity and affect. Effective expression (and perception) of prosody is essential to portraying (and understanding) communicative intent, and thus to enhancing conversational quality. Atypical prosody is a well-documented behavioral marker of autism spectrum disorder (ASD) that presents across the lifespan, yet it is not well defined. Descriptions of atypical prosody are qualitative, subjective, and conflicting, and inter-rater agreement remains low. As such, incidence rates for various types of prosodic abnormalities are unknown.

Objectives:  Automatic quantitative analysis of large corpora of natural communication with subjects with autism spectrum disorder (ASD) has the potential to provide novel information to researchers and clinicians. Further, given the great heterogeneity of symptoms in ASD, an acoustic-based objective measure would be valuable for clinical assessment and interventions. In this study, we investigate objective speech features in child-psychologist conversational samples. Expanding upon previous studies, we investigate (i) the speech of children with non-ASD developmental disorders (DD) and (ii) the stability of certain prosodic attributes, gaining insights into the effects of ASD severity and diagnosis on the child’s prosody and quality of interaction.

Methods:  Audio-visual data of semi-structured child-psychologist interactions during the Autism Diagnostic Observation Schedule (ADOS) from two collection sites are used for this study. Data consist of age- and IQ-matched ADOS Module 3 administrations for ASD (N=95) and DD (N=81) subjects. Speech acoustic-prosodic features are computed after first aligning lexical transcriptions to the audio signals. Prosody is quantified in terms of segmental intonation (syllable-level), supra-segmental intonation (multi-syllabic contour modeling via Momel/Intsint parameterization), speech rate (syl/s), and coordination of prosodic attributes (pitch/volume/duration); we also investigate measures of turn-taking. Analyses are conducted via correlation and predictive regression between prosodic cues and ADOS severity.

Results:  The automatically extracted prosodic and turn-taking cues correlate with the child’s ASD severity/diagnosis and are demonstrated to have significant predictive performance. For example, in interactions with children having higher ASD severity: segmental and supra-segmental prosodic variability increases for both participants; the child shows reduced coordination between pitch and duration/volume; and the child speaks less, at a slower rate, and with more pausing. Additionally, the psychologist’s speech features are as predictive of the child’s severity as the child’s own features. We will also provide a statistical analysis of the stability of these vocal characteristics across the interaction.

Conclusions:  The acoustic-prosodic and turn-taking cues are reflective of ASD severity and diagnosis. Likewise, the psychologist, who is also an interlocutor in the interaction, adjusts her behavior in predictable ways. This work is part of a larger effort to create an automatic system for evaluating various dimensions of naturalistic social prosody. Findings support further, large-scale study of objective measures of prosody in interactions involving children with ASD.

Figure 1: Example intonation contour with corresponding Momel/Intsint modeling.