Quantitative Analysis of Prosody in Conversational Speech in Autism Spectrum Disorders and in Developmental Language Disorders

Thursday, May 17, 2012
Sheraton Hall (Sheraton Centre Toronto)
1:00 PM
G. Kiss, J. van Santen, E. T. Prud'hommeaux and L. M. Black, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR
Background:  

The diagnosis of Autism Spectrum Disorders (ASD) is labor-intensive and requires highly trained professionals, which limits access to diagnostic services and hence to intervention services. Automated analysis of conversational speech could help provide a more broadly accessible means of identifying high-risk individuals.

Although speech prosody is often atypical in ASD, prosody plays only a minimal role in diagnostics, possibly because of reliability issues. Relatively few studies of prosody in ASD have used adequately sized, well-characterized samples together with quantitative prosodic measures. Moreover, almost no studies compare prosody in ASD and Developmental Language Disorders (DLD), an important comparison given the symptomatic overlap between these disorders.

Objectives:  

Our goal was to identify quantitative prosodic features that can reliably differentiate among children with typical development (TD), children with DLD, and children with ASD who are also diagnosed with DLD (ASD+DLD) or not (ASD-DLD), using ADOS recordings.

Methods:  

We analyzed recordings of children aged 4 to 8 years in four groups: TD, DLD, ASD+DLD, and ASD-DLD. All were verbal and intelligible, and had a mean length of utterance (MLU) > 3. We contrasted pairs of groups matched on specific factors:

1) ASD-DLD and TD (matched on verbal IQ)

2) ASD+DLD and DLD (matched on verbal IQ)

3) ASD+DLD and ASD-DLD (matched on ADOS and SCQ scores)

4) DLD and TD (matched on ADOS and SCQ scores)

We additionally matched groups on age, and, for comparisons 1) and 2), also on performance IQ.

For each child we created overall histograms of the pitch values of the entire recording, and per-utterance pitch histograms of each individual utterance. For our prosodic features, we computed standard statistical parameters (e.g., mean, variance, asymmetry, peakedness) for these overall histograms as well as means and variances of the same parameters computed for each per-utterance histogram (e.g., the variance of the per-utterance means).
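The feature construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the statistics are computed directly from frame-level pitch samples rather than from binned histograms, that asymmetry and peakedness correspond to skewness and excess kurtosis, and that every utterance contains at least two distinct pitch values. The function names and the input layout (one array of voiced-frame pitch values per utterance) are hypothetical.

```python
import numpy as np

def distribution_stats(pitch):
    """Mean, variance, skewness, and excess kurtosis of a set of pitch samples."""
    x = np.asarray(pitch, dtype=float)
    mean = x.mean()
    var = x.var()                      # population variance
    z = (x - mean) / np.sqrt(var)      # assumes var > 0 (an assumption, see lead-in)
    return {
        "mean": mean,
        "variance": var,
        "skewness": np.mean(z ** 3),         # asymmetry
        "kurtosis": np.mean(z ** 4) - 3.0,   # peakedness (excess kurtosis)
    }

def prosodic_features(utterances):
    """utterances: list of 1-D arrays, one array of pitch values per utterance."""
    # Statistics of the pitch distribution over the entire recording
    overall = distribution_stats(np.concatenate(utterances))
    # The same statistics computed separately for each utterance
    per_utt = [distribution_stats(u) for u in utterances]

    features = {f"overall_{name}": value for name, value in overall.items()}
    for name in overall:
        values = np.array([stats[name] for stats in per_utt])
        features[f"mean_utt_{name}"] = values.mean()  # e.g. mean per-utterance kurtosis
        features[f"var_utt_{name}"] = values.var()    # e.g. variance of per-utterance means
    return features
```

Under these assumptions each child is represented by 12 values: four overall statistics, plus the mean and the variance of each statistic across utterances.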

We also analyzed the spectral content using LTAS (Long Term Average Spectrum; pitch-normalized), and determined “phonemic content” via the histogram of all phoneme classes, extracted from the phonetic transcript. 
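As a rough illustration of a long-term average spectrum, the sketch below averages the power spectra of overlapping windowed frames and reports the result in decibels. It is an assumption-laden example: the abstract does not give the frame length, hop size, or window, and the pitch normalization step is not described, so it is omitted here. The function name `ltas` and its parameters are hypothetical.

```python
import numpy as np

def ltas(signal, sample_rate, frame_len=512, hop=256):
    """Average power spectrum (in dB) over all frames of a mono signal.

    Assumes len(signal) >= frame_len. No pitch normalization is applied.
    """
    x = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    power = np.zeros(frame_len // 2 + 1)
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * window
        power += np.abs(np.fft.rfft(frame)) ** 2   # power spectrum of one frame
    power /= n_frames                               # long-term average
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, 10.0 * np.log10(power + 1e-12)    # small offset avoids log(0)
```

Group comparisons would then be made on the returned spectra, for example band by band.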

Results:  

Several prosodic features discriminated significantly (at p<0.05) between groups. For example, overall peakedness and mean per-utterance peakedness were much higher in the TD group than in the other groups, whereas the corresponding spread measures were smaller. The difference in peakedness between ASD-DLD and TD was particularly pronounced, as was the difference in the location parameter (mean pitch), the latter being higher in ASD-DLD. Using only the peakedness feature (measured by kurtosis), we achieved a classification rate of over 75% (chance being 50%) when contrasting the ASD-DLD and TD groups.
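One way a single-feature classification rate could be estimated is sketched below. The abstract does not state which classifier or validation scheme was used, so the threshold rule and the leave-one-out procedure here are assumptions chosen for simplicity, and the function names are hypothetical. The sketch assumes binary labels (0 and 1) and at least two distinct feature values in every training fold.

```python
import numpy as np

def fit_threshold(x, y):
    """Find the threshold and direction that maximize training accuracy."""
    xs = np.unique(x)                          # sorted distinct feature values
    candidates = (xs[:-1] + xs[1:]) / 2.0      # midpoints between neighbors
    best_acc, best_t, best_sign = -1.0, None, None
    for t in candidates:
        for sign in (1, -1):                   # +1: predict class 1 above t; -1: below t
            pred = (sign * (x - t) > 0).astype(int)
            acc = np.mean(pred == y)
            if acc > best_acc:
                best_acc, best_t, best_sign = acc, t, sign
    return best_t, best_sign

def loo_accuracy(x, y):
    """Leave-one-out accuracy of the threshold rule on one feature (e.g. kurtosis)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)
    correct = 0
    for i in range(len(x)):
        train = np.arange(len(x)) != i         # hold out child i
        t, sign = fit_threshold(x[train], y[train])
        pred = int(sign * (x[i] - t) > 0)
        correct += int(pred == y[i])
    return correct / len(x)
```

Holding out each child in turn keeps the threshold from being tuned on the child it is tested on, which matters when group sizes are small.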

No significant differences in the spectral and phonemic content were detected, suggesting that there were no significant differences in articulation, which is plausible given that all children were verbal and intelligible. 

Conclusions:  

Children with TD had generally narrower pitch ranges than children with ASD or DLD. Future research will focus on relating these distribution-parameter results to differences in pitch curve shapes. Automatic classification methods using only these prosodic features can perform significantly better than chance. These features are robust and easy to extract from conversational speech, making them good candidates for use in automated screening methods.
