27847
Denoising Autoencoders for Self-Supervised Clinical Subtyping of Autism Spectrum Disorder

Poster Presentation
Saturday, May 12, 2018: 11:30 AM-1:30 PM
Hall Grote Zaal (de Doelen ICC Rotterdam)
C. H. Chatham1, J. F. Hipp2, P. Garces3, S. Holiga3, J. Tillmann4, E. J. Jones5, M. Uljarevic6,7, G. Dumas8, E. Loth9, T. Charman10 and G. Honey11, (1)Neuroscience, Ophthalmology, and Rare Diseases (NORD), Roche Pharma Research and Early Development, Roche Innovation Center Basel, Hoffmann-La Roche, Basel, Switzerland, (2)Neuroscience and Rare Diseases (NRD), Roche Pharma Research and Early Development, Roche Innovation Center, Basel, Switzerland, (3)Neuroscience, Ophthalmology, and Rare Diseases (NORD), Roche Pharma Research and Early Development, Roche Innovation Center Basel, Hoffmann-La Roche, Basel, Switzerland, (4)Institute of Psychiatry, Psychology & Neuroscience, London, United Kingdom, (5)Centre for Brain and Cognitive Development, Birkbeck, University of London, London, United Kingdom, (6)Department of Psychiatry and Behavioral Sciences, School of Medicine, Stanford University, Stanford, CA, (7)Stanford Autism Center, Department of Psychiatry and Behavioral Sciences, Stanford University, CA, (8)Human Genetics and Cognitive Functions Unit, Institut Pasteur, Paris, France, (9)Forensic and Neurodevelopmental Sciences, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom, (10)Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, United Kingdom, (11)F. Hoffmann-La Roche Ltd, Basel, Switzerland
Background: Several measures have been proposed as candidate biomarkers for ASD, but their correlations with clinical symptom severity are typically weak. Consequently, these biomarkers may be less prognostic of clinically meaningful change, and less revealing of specific mechanisms to target for alleviating distress. One potential cause of these weak correlations is that the underlying effects are small. Alternatively, larger correlations could be attenuated by population heterogeneity. For example, if distinct subtypes of ASD have distinct associations between biomarkers and clinical severity, then effect sizes could be reduced when ASD is treated as a single, unitary disorder. Although much prior work has attempted to uncover the sources of heterogeneity in ASD, most techniques make restrictive assumptions about the nature of this heterogeneity. For example, classical clustering methods can uncover categorical differences within a population, but may give misleading results if dimensional differences exist. Conversely, classical factor analytic methods can recover dimensions, but may give misleading results in the presence of latent categories. There is a clear need for techniques that are equally sensitive to both, so that latent categorical and latent dimensional sources of variation can be uncovered simultaneously in a data-driven manner.
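
The attenuation argument can be illustrated with a small simulation. This is only a sketch on invented data: the two subgroups, their sizes, and the effect sizes are assumptions chosen for illustration, not values from this study. One hypothetical subgroup has a strong biomarker/severity association and the other has none, so the pooled correlation falls between the two.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical subgroup A: severity tracks the biomarker closely.
bio_a = [rng.gauss(0, 1) for _ in range(500)]
sev_a = [0.8 * b + rng.gauss(0, 0.6) for b in bio_a]

# Hypothetical subgroup B: severity is unrelated to the biomarker.
bio_b = [rng.gauss(0, 1) for _ in range(500)]
sev_b = [rng.gauss(0, 1) for _ in bio_b]

r_a = pearson(bio_a, sev_a)
r_b = pearson(bio_b, sev_b)
# Treating both subgroups as one population dilutes the association.
r_pooled = pearson(bio_a + bio_b, sev_a + sev_b)

print(f"subgroup A r={r_a:.2f}, subgroup B r={r_b:.2f}, pooled r={r_pooled:.2f}")
```

In this toy setup the pooled correlation is noticeably weaker than the correlation inside subgroup A, even though nothing about subgroup A has changed.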

Objectives: The objective of this study was to uncover latent categorical and/or dimensional clinical features of individuals with ASD, and to examine whether these features may disattenuate the correlation between biomarkers and the severity of clinical symptomatology.

Methods: We applied a self-supervised neural network architecture, the denoising autoencoder, to item-level scores from individuals with ASD on the Autism Diagnostic Interview-Revised, Social Responsiveness Scale, and Repetitive Behavior Scale-Revised, as well as v-scale scores from the Vineland-II Adaptive Behavior Scales. Input data were normalized for age and IQ to increase the share of variance related to atypical clinical symptomatology. Data from the Simons Simplex Collection (SSC; n=2329 in this analysis) were used to optimize the model through internal cross-validation. Generalization was assessed on a large holdout dataset (EU-AIMS LEAP; n=334 in this analysis).
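
As a rough illustration of the general technique, the sketch below adjusts items for age and IQ and then trains a small denoising autoencoder with NumPy. It is not the authors' implementation: the abstract does not specify the adjustment method, architecture, noise type, or optimizer, so linear residualization, a single tanh hidden layer, Gaussian input noise, plain gradient descent, and all function names and hyperparameters here are assumptions. The data are synthetic stand-ins.

```python
import numpy as np


def residualize(items, covariates):
    """Regress each item on the covariates; return standardized residuals."""
    design = np.column_stack([np.ones(len(items)), covariates])
    beta, *_ = np.linalg.lstsq(design, items, rcond=None)
    resid = items - design @ beta
    return resid / resid.std(axis=0)


def train_dae(x, n_latent=9, noise_sd=0.5, lr=0.005, epochs=1000, seed=0):
    """Full-batch gradient descent on a one-hidden-layer denoising autoencoder."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = rng.normal(0.0, 0.1, (d, n_latent))
    b_enc = np.zeros(n_latent)
    w_dec = rng.normal(0.0, 0.1, (n_latent, d))
    b_dec = np.zeros(d)
    for _ in range(epochs):
        noisy = x + rng.normal(0.0, noise_sd, x.shape)  # corrupt the input
        h = np.tanh(noisy @ w_enc + b_enc)              # latent features
        recon = h @ w_dec + b_dec                       # linear decoder
        err = (recon - x) / n                           # target is the clean input
        d_h = (err @ w_dec.T) * (1.0 - h ** 2)          # backprop through tanh
        w_dec -= lr * (h.T @ err)
        b_dec -= lr * err.sum(axis=0)
        w_enc -= lr * (noisy.T @ d_h)
        b_enc -= lr * d_h.sum(axis=0)
    return w_enc, b_enc, w_dec, b_dec


def encode(x, w_enc, b_enc):
    """Map (uncorrupted) items to latent features."""
    return np.tanh(x @ w_enc + b_enc)


# Synthetic stand-in data: 400 people, 30 items driven by 3 hidden factors,
# plus age and IQ effects that the adjustment step should remove.
rng = np.random.default_rng(1)
n_people, n_items = 400, 30
age = rng.uniform(4, 18, n_people)
iq = rng.normal(100, 15, n_people)
factors = rng.normal(size=(n_people, 3))
loadings = rng.normal(size=(3, n_items))
items = (factors @ loadings + 0.1 * age[:, None] + 0.02 * iq[:, None]
         + rng.normal(0, 0.5, (n_people, n_items)))

x = residualize(items, np.column_stack([age, iq]))
w_enc, b_enc, w_dec, b_dec = train_dae(x)
latent = encode(x, w_enc, b_enc)            # shape (n_people, 9)
recon = latent @ w_dec + b_dec
mse = float(np.mean((recon - x) ** 2))      # 1.0 would match predicting the mean
print(latent.shape, round(mse, 3))
```

Because the network must reconstruct the clean scores from corrupted ones, it cannot simply copy its input, which is what pushes the latent layer toward features shared across items.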

Results: Our machine learning approach showed good cross-validation performance when item-level scores were compressed to 9 latent features. The model showed minimal overfitting when applied to the holdout dataset, EU-AIMS LEAP. These 9 features sufficed to reconstruct up to 85% of the variance in item-level scores, and never reconstructed less than 5% of the variance for any IQ- and age-adjusted item. Moreover, 2 subgroups of individuals with ASD could be reliably extracted from these 9 features, with one subgroup showing a statistically significant pattern of increased severity across multiple domains (social responsiveness, repetitive behavior and sensory interests, inattentiveness symptoms, gastrointestinal symptoms, the presence of non-febrile seizures, and difficulties with sleep) but, notably, not full-scale IQ. Correlations between neuropsychological performance and clinical severity were (as a whole) larger within this highly affected subgroup.
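
Two of the quantities reported here can be sketched in a few lines: the share of variance reconstructed per item, and a split of individuals into 2 subgroups from their latent features. The abstract does not say how subgroups were extracted, so the 2-means clustering below is an assumption used purely for illustration, as are the function names and the synthetic inputs.

```python
import numpy as np


def item_r2(x, recon):
    """Fraction of each item's variance captured by the reconstruction."""
    ss_res = ((x - recon) ** 2).sum(axis=0)
    ss_tot = ((x - x.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


def two_means(z, n_iter=50, seed=0):
    """Plain 2-means clustering on latent features (an assumed method)."""
    rng = np.random.default_rng(seed)
    centers = z[rng.choice(len(z), size=2, replace=False)]  # copy of two rows

    def assign(c):
        dist = ((z[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = z[labels == k].mean(axis=0)
    return assign(centers)  # final assignment against the updated centers


# Synthetic stand-ins for item scores, their reconstruction, and 9 latent features.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
recon = x + rng.normal(0, 0.5, x.shape)  # an imperfect reconstruction
r2 = item_r2(x, recon)

z = np.vstack([rng.normal(0, 0.3, (100, 9)),   # hypothetical subgroup 1
               rng.normal(3, 0.3, (100, 9))])  # hypothetical subgroup 2
labels = two_means(z)

print("per-item variance reconstructed:", np.round(r2, 2))
print("subgroup sizes:", np.bincount(labels))
```

With real data, the per-item values would be computed on held-out individuals, and the stability of any subgroup split would need to be checked separately (for example across resamples) before interpreting it.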

Conclusions: Modern machine learning techniques enable the extraction of both continuous and categorical differences among individuals with ASD. Failing to account for these sources of heterogeneity may attenuate the correlation of biomarkers and associated neuropsychological tests with measures of clinical severity.