Smart Tablet-Based Gameplay Identification of Preschool Children with Autism: A Replication Study with Machine Learning Data Analytics Improvements

Delafield-Butt, Jonathan

Background: It has been proposed that one of the early markers of autism spectrum disorder (ASD) is a disruption in intentional movement evident from early childhood. Evidence suggests that disruption to motor timing and integration may underpin the disorder, providing a new potential marker for its identification. In earlier work, we demonstrated that machine learning analysis of children’s movement patterns during smart tablet gameplay identified ASD with 83% sensitivity and 85% specificity (Anzulewicz, Sobota and Delafield-Butt, 2016).

Objectives: In this study, we sought to test the original performance accuracy on new, more general data, and to iteratively improve the machine learning data analytics to simplify and further generalise the models. Overall, we aimed to achieve an accessible, computational identification of ASD in young children by smart tablet gameplay.

Methods: The original study of 37 children 3-6 years old with ASD and 45 typically developing (TD) children was augmented with a new dataset of 118 children with ASD and 420 TD children. In addition, 26 children 3-6 years old with another neurodevelopmental disorder that was not ASD (OND) were included. The feature set was reduced by recursive feature selection and by removal of features with low variance or high within-group correlations. New machine learning algorithms were trained on the new dataset (n=564), and these models were then applied to the original dataset (n=82) to test for generalisation.
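The two filtering steps named above (removal of low-variance features and of highly correlated features) can be illustrated with a minimal sketch. This is not the study's implementation: the thresholds `min_variance` and `max_abs_corr`, the greedy keep-first rule, and the toy feature names are all assumptions made for illustration, and the recursive feature selection step is not shown. To mimic a within-group filter, the function could be applied to the rows of one group at a time.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def filter_features(columns, min_variance=1e-3, max_abs_corr=0.95):
    """columns maps a feature name to its list of values (one value per child).

    Both thresholds are hypothetical; the abstract does not report them.
    """
    # Step 1: drop near-constant features.
    kept = [name for name, vals in columns.items()
            if pvariance(vals) >= min_variance]
    # Step 2: greedily keep a feature only if it is not too strongly
    # correlated with any feature that has already been kept.
    selected = []
    for name in kept:
        if all(abs(pearson(columns[name], columns[s])) <= max_abs_corr
               for s in selected):
            selected.append(name)
    return selected


# Toy data with hypothetical feature names (not the study's features).
toy = {
    "speed_mean": [1.0, 2.0, 3.0, 4.0],
    "speed_mean_copy": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated duplicate
    "constant_flag": [1.0, 1.0, 1.0, 1.0],    # zero variance
    "tap_force": [0.3, 0.1, 0.4, 0.2],
}
print(filter_features(toy))
```

In this toy case the constant feature and the duplicated feature are dropped, leaving two of the four features.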

Results: Dimensionality was reduced from 262 kinematic and descriptive metric features of children’s gameplay patterns to 49 features. Ten repetitions of a ten-fold cross-validation procedure performed on the new dataset (n=564) identified children with ASD from their TD counterparts with 87% sensitivity and 85% specificity. Differentiation of children with OND from their TD counterparts was comparable, but with low confidence. Finally, we tested the resulting models on the original study dataset (n=82). The model achieved 83% sensitivity and 82% specificity, replicating the original finding.
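For readers less familiar with the evaluation terms used above, the following minimal sketch shows how sensitivity and specificity are computed from labels and predictions, and how ten repetitions of ten-fold cross-validation yield 100 train/test splits. The function names, the random seed, and the unstratified shuffling are assumptions for illustration; the abstract does not state how folds were formed or which classifier was used, so no model is fitted here.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Labels: 1 = ASD, 0 = TD. Returns (sensitivity, specificity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ASD correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ASD missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # TD correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # TD wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh shuffle per repetition
        folds = [idx[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


# Toy labels and predictions (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

# n=564 is the size of the new dataset reported above.
splits = list(repeated_kfold(564))
print(len(splits), round(sens, 2), round(spec, 2))
```

In practice, a model would be fitted on each training split and scored on the matching test split, with sensitivity and specificity then averaged over all 100 splits.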

Conclusions: This study produced new machine learning models for the identification of ASD from TD children on a large dataset, with performance comparable to the first study and with a reduced feature set. Moreover, we replicated the findings of our previous study with these new algorithmic models, tested on the original data without prior training on those data. We consider this strong verification of the principle of machine learning data analytics for the successful, and potentially clinically useful, early identification of ASD in young children. The basis of these features in calculations of motor kinematics supports the view that movement differences are a fundamental feature of ASD that may be subtle to the eye, but are significantly associated with the condition when analysed computationally.
