Do Predictors in Machine Learning Classification of ASD Differ for Children Vs. Adolescents?

Poster Presentation
Friday, May 3, 2019: 11:30 AM-1:30 PM
Room: 710 (Palais des congrès de Montréal)
S. Plate1, E. Sariyanidi2, M. Cola1, C. J. Zampella2, L. D. Yankowitz1, A. Riiff1, V. Petrulla1, B. Tunc2, J. D. Herrington2, K. Bartley2, E. S. Kim1, A. de Marchena3, J. Pandey1, R. T. Schultz1 and J. Parish-Morris1, (1)Center for Autism Research, Children's Hospital of Philadelphia, Philadelphia, PA, (2)Center for Autism Research, The Children's Hospital of Philadelphia, Philadelphia, PA, (3)University of the Sciences, Philadelphia, PA
Background: Autism spectrum disorder (ASD) is defined by early and persistent deficits in social communication, as well as the presence of restricted interests and repetitive behaviors. The majority of children with ASD are verbally fluent, and information gathered from brief natural language samples could facilitate remote screening while generating ecologically valid social communication profiles to inform personalized treatment planning. Prior research suggests that a variety of linguistic features produced by participants with ASD and their conversational partners are useful predictors of diagnostic status and/or symptom severity, including prosody, turn-taking rates, and word choice (Bone, 2013; Parish-Morris et al., 2016). However, few studies have harnessed the power of machine learning to predict diagnosis from short, conversational language samples with non-expert interlocutors, and little is known about whether prediction accuracy and specific predictive features remain consistent in childhood vs. adolescence. This study addresses these two important gaps.

Objectives: Apply machine learning to language features extracted from transcripts of naturalistic conversations, with the goals of (1) classifying participants as ASD or typically developing, and (2) comparing classification accuracy and predictive features between a child sample, an adolescent sample, and a collapsed sample that includes all participants.

Methods: Eighty-five matched participants (Table 1) completed two 3-minute semi-structured “get to know you” conversations with two previously unknown confederates who were not autism experts (Ratto et al., 2011). In the first conversation, the confederate was trained to act interested in the conversation; in the second, bored. Transcripts were analyzed using LIWC software (Tausczik & Pennebaker, 2010) and R’s ‘qdap’ package (Rinker, 2017), yielding 121 features for participants and confederates in each condition, as well as the difference between conditions. Our machine learning pipeline consisted of a logistic regression classifier trained on participant and/or confederate features within a leave-one-out cross-validation loop. Cross-validated classification accuracy was measured within the child and adolescent samples separately, as well as across the entire age range; accuracies were compared using McNemar’s test. Conversational features with non-zero coefficients in the classifier were identified as top predictors of diagnostic status.
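The pipeline described above, logistic regression inside a leave-one-out loop with top predictors read off the surviving coefficients, can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: the feature matrix and labels are synthetic stand-ins for the 121 LIWC/qdap conversational features, and the L1 (sparsity-inducing) penalty is an assumption suggested by the abstract's mention of non-zero coefficients.

```python
# Sketch of a leave-one-out logistic regression pipeline.
# Illustrative only: data here are random stand-ins, not the study's
# 121 LIWC/qdap conversational features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_participants, n_features = 85, 121
X = rng.normal(size=(n_participants, n_features))  # stand-in feature matrix
y = rng.integers(0, 2, size=n_participants)        # 1 = ASD, 0 = TD (synthetic)

preds = np.empty(n_participants, dtype=int)
for train_idx, test_idx in LeaveOneOut().split(X):
    # An L1 penalty drives many coefficients to exactly zero, so the
    # features with non-zero weights can be read off as "top predictors"
    # (the penalty choice is an assumption, not stated in the abstract).
    clf = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    )
    clf.fit(X[train_idx], y[train_idx])
    preds[test_idx] = clf.predict(X[test_idx])

accuracy = (preds == y).mean()
print(f"cross-validated accuracy: {accuracy:.2f}")
```

With real features in place of the random matrix, the same loop yields the cross-validated accuracies reported in the Results.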

Results: Diagnostic classification accuracy was high in both age groups: 89% in adolescents and 76% in younger children (Table 2). Accuracy dropped significantly to 66% (p<.015) when the entire age range was classified within a single model, suggesting that optimal classification models may differ by age group. The most accurate classification model was driven by participant-level features for children and by confederate-level features for adolescents. For children, top predictive features included participant pronoun use, intra-turn pause duration, and “friend”-category words. For adolescents, top predictive features in the most parsimonious model included confederate word-level “authenticity” and negations.
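The McNemar comparison used above tests whether two classifiers evaluated on the same participants differ in error rate, using only the discordant pairs (cases one model got right and the other got wrong). A minimal sketch, with invented discordant-pair counts rather than the study's data:

```python
# Exact McNemar's test on paired classification outcomes.
# The counts b and c below are invented for illustration only.
from scipy.stats import binomtest

# b = participants the age-specific model classified correctly but the
#     collapsed (all-ages) model missed; c = the reverse.
b, c = 20, 5
# Under H0 both models err equally often on discordant cases, so either
# count follows Binomial(b + c, 0.5); two-sided exact test:
p = binomtest(b, n=b + c, p=0.5).pvalue
print(f"McNemar exact p = {p:.4f}")
```

Concordant pairs (both models right or both wrong) carry no information about which model is better, which is why only b and c enter the test.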

Conclusions: This study showed that (1) features derived from naturalistic conversations with non-expert interlocutors can be used for diagnostic classification, and (2) top classification features may change over the course of development. Using machine learning to extract clinically relevant dimensions from short, naturalistic conversation samples with naïve confederates could provide a new path toward rapid improvements in remote screening and characterization, and toward yardsticks for measuring treatment response.