19095
Online Queries of Parents Suspecting Their Child Has ASD: A Clinician Mediated Machine Learning Prediction of ASD Risk

Thursday, May 14, 2015: 5:30 PM-7:00 PM
Imperial Ballroom (Grand America Hotel)
E. Yom-Tov (1) and A. Ben-Sasson (2), (1) Microsoft Research Israel, Herzeliya, Israel, (2) University of Haifa, Haifa, Israel
Background: The increasing rates of autism spectrum disorders (ASD) and the growing awareness of them lead more parents to suspect ASD in their child. These early concerns can precede referral to a professional by months. Parents increasingly turn to online communities for information about their child’s development, expecting a largely non-professional community to confirm or dispute their concerns. Online queries thus document the signs that alarm parents. Machine learning tools may offer a way to estimate the degree of ASD risk for the child described in an online query.

Objectives: (1) Identify signs that differentiate online queries about children rated by clinicians as low, medium, or high ASD risk; (2) Test the efficacy of machine learning tools in classifying a child for ASD risk.

Methods: Yahoo Answers, a social question-and-answer site, was mined for queries from parents asking the community whether their child has ASD. A sample of 194 questions from the resulting corpus was used for this study (mean age = 39.59 months; 72% < 3 years; 86.32% boys). Domain expert clinicians performed a content analysis of the types of signs described in each question. A different clinician rated the ASD risk level of the child described in each question as low, medium, or high. Machine learning tools were then applied to predict risk from either the question text or the coded signs.

Results: Of the 194 questions, 31 were rated as low risk, 55 as medium risk, and 108 as high risk. Gender distribution did not differ between risk groups. Chi-square tests showed that questions rated as high risk mentioned social, communication, language, and cognitive problems, as well as repetitive and restricted behaviors, at a significantly higher rate (p < .05). Among the 55% of parents who mentioned a language problem, the chance of high or medium risk was 95.3%, compared to 70.4% for the rest of the sample. 20.6% of the sample mentioned neither language problems nor repetitive and restricted behaviors; for them the chance of high or medium risk was 55%. 24.7% of parents did not report language problems but did report repetitive and restricted behaviors; for them the chance of high or medium risk was 83.3%. When predicting whether a question was medium or high risk, an automatic classifier (decision tree) reached an area under the curve (AUC) of 0.63 using the text of the question, compared to 0.78 using the coded signs.
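The subgroup percentages in the Results can be cross-checked against the overall counts with a little arithmetic. The sketch below is only an illustration, not part of the study's analysis: it assumes the three subgroups (language problem mentioned; neither language nor repetitive behaviors mentioned; repetitive behaviors only) partition the sample, and the variable and function names are hypothetical.

```python
# Figures are copied from the Results paragraph; no new data is introduced.
counts = {"low": 31, "medium": 55, "high": 108}
total = sum(counts.values())  # 194 questions
overall_rate = (counts["medium"] + counts["high"]) / total

# Assumption: these three subgroups partition the sample.
# Each entry is (share of sample, reported chance of medium or high risk).
subgroups = {
    "language": (0.55, 0.953),
    "neither": (0.206, 0.55),
    "rrb_only": (0.247, 0.833),
}


def pooled_rate(names):
    """Share-weighted average of the medium/high rate over the named subgroups."""
    weight = sum(subgroups[n][0] for n in names)
    hits = sum(subgroups[n][0] * subgroups[n][1] for n in names)
    return hits / weight


# "Rest of the sample" = parents who did not mention a language problem.
rest_rate = pooled_rate(["neither", "rrb_only"])
all_rate = pooled_rate(list(subgroups))

print(f"overall medium/high rate from counts: {overall_rate:.3f}")
print(f"rest-of-sample rate from subgroups:   {rest_rate:.3f}")
print(f"pooled rate from all subgroups:       {all_rate:.3f}")
```

Under that assumption, the two non-language subgroups pool to about 70.4%, matching the figure reported for the rest of the sample, and all three subgroups pool to about 84%, in line with the 163 of 194 questions rated medium or high risk. The small residual differences come from rounding in the reported percentages.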

Conclusions: Most children whose parents suspect ASD are judged by a clinician to need a clinical evaluation. Parents report signs in all core ASD diagnostic domains as well as concerns related to cognitive functioning. Accurately predicting ASD risk from a question requires a mediating step of classifying text into symptom domains. Findings support the need for a computerized tool to assist parents in tagging their concerns in order to obtain a risk estimate.