‘Autistic’-Sounding: A Latent Class Linear Mixed Modeling Approach to Parsing Heterogeneity in Children’s Natural Conversations Using Acoustic Properties of Speech

Poster Presentation
Friday, May 3, 2019: 5:30 PM-7:00 PM
Room: 710 (Palais des congrès de Montréal)
J. Parish-Morris1, S. Cho2, M. Liberman3, N. Ryant2, K. Bartley4, M. Cola1, S. Plate1, L. D. Yankowitz1, V. Petrulla1, A. Riiff1, C. J. Zampella4, J. D. Herrington5, E. Sariyanidi4, B. Tunc4, E. S. Kim1, A. de Marchena6, J. Pandey1 and R. T. Schultz1, (1)Center for Autism Research, Children's Hospital of Philadelphia, Philadelphia, PA, (2)Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, (3)University of Pennsylvania, Philadelphia, PA, (4)Center for Autism Research, The Children's Hospital of Philadelphia, Philadelphia, PA, (5)Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, (6)University of the Sciences, Philadelphia, PA

Background: Behavioral heterogeneity is a persistent challenge for researchers and clinicians aiming to develop evidence-based social communication interventions for children with autism spectrum disorder (ASD), and to pinpoint the condition’s biological bases. Even after attempting to manufacture homogeneity by restricting variables such as age and IQ within study samples, children with ASD still behave very differently across contexts (e.g., consider a standardized vocabulary test versus playground conversation). This study uses a latent growth curve approach to parse acoustic heterogeneity in the spontaneous speech of children with ASD.


Objectives: Test whether patterns of ‘ASD’-like utterances characterize subgroups of children with ASD over the course of a short, naturalistic conversation with a friendly stranger.


Methods: Language samples from 35 verbally fluent children with ASD were drawn from an unstructured 5-minute ‘get-to-know-you’ conversation with a novel confederate who was not an autism expert. All children had IQ estimates in the average range (>75) and were aged 7–16.9 years. Children produced a total of 2,408 usable utterances (mean=68.8 utterances each). Each utterance was classified as ‘ASD’ or ‘TD’ using a machine learning classifier with 5-fold cross-validation, developed on the acoustic properties of speech produced by a larger matched sample that included both diagnostic groups (sample described in Cho et al., this conference). The number of ‘ASD’-like utterances produced over the course of the conversation (~1-minute windows) was tested for the presence of latent classes (‘lcmm’ in R). Class member characteristics were compared using simple linear models.
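The latent-class step above was run with the R package ‘lcmm’; as a rough illustration of the idea, the following Python sketch approximates it in two stages: fit each child’s linear trend in ‘ASD’-like utterance counts across one-minute windows, then cluster the per-child (intercept, slope) features with a Gaussian mixture, selecting the number of classes by BIC. All data are simulated; the class sizes (23 vs. 12) merely mirror the solution reported below and are not the study’s data.

```python
# Illustrative approximation of latent class trajectory modeling (the study
# itself used 'lcmm' in R, which fits classes and trajectories jointly).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n_children, n_windows = 35, 5
t = np.arange(n_windows)

# Simulated data: 23 'Increasing' children (positive slope), 12 'Steady' (flat)
slopes_true = np.array([1.0] * 23 + [0.0] * 12)
counts = 3 + slopes_true[:, None] * t + rng.normal(0, 0.3, (n_children, n_windows))

# Stage 1: per-child OLS fit of count ~ time -> (intercept, slope) features
X = np.column_stack([np.ones(n_windows), t])
betas = np.linalg.lstsq(X, counts.T, rcond=None)[0].T  # shape (n_children, 2)

# Stage 2: compare 1- to 4-class Gaussian mixtures by BIC (lower is better)
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(betas).bic(betas)
        for k in (1, 2, 3, 4)}
best_k = min(bics, key=bics.get)
print("BIC-preferred number of latent classes:", best_k)
```

Unlike lcmm, this two-stage shortcut ignores uncertainty in the per-child fits, but it conveys the model-selection logic: candidate class counts are compared on a penalized fit criterion.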


Results: A 2-class model provided the best fit for the data (compared to 3- or 4-class models) and revealed evidence of homogeneous subgroups with (1) Increasing (N=23) or (2) Steady (N=12) numbers of ASD-like utterances over the course of the conversation (Figure). Group intercepts differed significantly from one another, with the Increasing group producing more ASD-like utterances at the start of the conversation (coefficient=-2.08, Wald test=-2.40, p=.02). Members of the Increasing subgroup produced growing numbers of utterances classified as ‘ASD’ over time (coefficient=.49, Wald test=5.97, p<.0001), while the relationship between time and ASD-like utterances trended negative in the Steady subgroup (coefficient=-.18, Wald test=-1.65, p<.10). Class members did not differ on age, sex ratio, nonverbal IQ estimates, ADOS-2 calibrated severity scores, average turn length, or the number of utterances produced at the group level, but did differ on verbal IQ scores (Steady > Increasing; estimate=11.58, t=2.97, p=.003) and word count (Steady < Increasing; estimate=-150.58, t=-2.88, p=.007).
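The between-class comparisons above used simple linear models; for a binary class indicator, the model’s slope is the between-class mean difference (equivalent to a two-sample t-test). The sketch below illustrates that step on simulated verbal IQ values for hypothetical 23-child and 12-child classes; the numbers are placeholders, not the study’s data.

```python
# Illustrative group comparison via OLS with a 0/1 class indicator.
# Simulated verbal IQs -- NOT the study's data.
import numpy as np

rng = np.random.default_rng(1)
viq_increasing = rng.normal(100, 10, 23)  # assumed Increasing-class verbal IQs
viq_steady = rng.normal(115, 10, 12)      # assumed Steady-class verbal IQs

y = np.concatenate([viq_increasing, viq_steady])
x = np.concatenate([np.zeros(23), np.ones(12)])  # 0 = Increasing, 1 = Steady
X = np.column_stack([np.ones_like(x), x])

# OLS fit: beta[1] is the Steady-minus-Increasing mean difference
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (len(y) - 2)          # residual variance
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_stat = beta[1] / se
print(f"Steady-vs-Increasing VIQ difference: {beta[1]:.1f} (t = {t_stat:.2f})")
```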


Conclusions: Machine-learning classification of speech utterances makes it possible to parse heterogeneous samples into more homogeneous subgroups that change dynamically over the course of a conversation. In this exploratory study, we identified two subgroups of children whose speech sounded more or less ‘ASD-like’ over time, with the more talkative group sounding increasingly atypical over 5 minutes. Future research using an expanded sample will include language-based analyses within each class (current results are based on acoustic properties only). This ‘profiling’ approach holds promise for identifying subgroups that benefit from specific interventions and stands to advance the goal of personalized medicine.