18001
The Potential of an Audio-Based Automated Autism Screen: The Result of a Blind Test Using Third-Party Data
Objectives: Third-party data are needed to further analyze, validate and improve the audio-based automated autism screen. This study asks three questions: whether the screen's performance holds in a blind test with third-party data; whether the behavioral characteristics extracted from audio recordings remain consistent when applied to new data; and whether blind third-party data reveal any potential issues or improvements.
Methods: Daylong audio recordings were collected using wearable LENA recorders. An automated algorithm detected segments from the key child, adults and other environmental sounds. Statistics over the sequence of sound categories in a child’s recording can reflect how the child interacts with the environment; for example, the co-vocalization rate between the child and caregivers can indicate the synchrony between them. Human voice segments were further processed with phone recognition or sound clustering algorithms, yielding frequencies of occurrence for phones, sound clusters and their sequences, which are highly correlated with phonetic and vocal development. Prosodic features such as duration, loudness and pitch are closely related to emotions and other behaviors. More than 100 features were analyzed and modeled with machine learning approaches to produce an autism risk score. The algorithms were trained on the in-house data and tested on the third-party data.
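As a rough illustration of what statistics over a sequence of sound categories can look like, the sketch below computes label and label-pair frequencies for one recording. The label names, the function names and the adjacency-based rate are all assumptions made for this example; in particular, the adjacency rate is only a simple stand-in and is not the study's definition of co-vocalization rate.

```python
from collections import Counter

def sequence_features(labels):
    """Relative frequencies of labels and of adjacent label pairs."""
    unigrams = Counter(labels)
    bigrams = Counter(zip(labels, labels[1:]))
    n_uni = len(labels)
    n_bi = max(len(labels) - 1, 1)
    features = {f"p({a})": c / n_uni for a, c in unigrams.items()}
    features.update({f"p({a}->{b})": c / n_bi for (a, b), c in bigrams.items()})
    return features

def child_adult_adjacency_rate(labels):
    """Fraction of adjacent segment pairs joining a child and an adult segment.

    Assumed proxy for child-caregiver interaction, not the study's measure.
    """
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        return 0.0
    hits = sum({a, b} == {"child", "adult"} for a, b in pairs)
    return hits / len(pairs)

# Hypothetical segment labels in time order (not data from the study).
labels = ["child", "adult", "child", "other", "adult", "child"]
features = sequence_features(labels)
adjacency = child_adult_adjacency_rate(labels)
print(features)
print(adjacency)
```

Features of this kind, computed per recording, could then be fed to a classifier alongside phone, cluster and prosodic features.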
Results: The third-party data came from three sites using the same type of recorder. Site 1 provided 59 daylong recordings from 31 children with autism (ASD; 25-48 months); site 2 provided 125 recordings made in preschool environments from 67 children with autism (36-68 months); and site 3 provided 115 daylong recordings from 40 typically developing (TD) children (11-22 months). Two methods were tested for estimating autism risk. For Method-1 with the trained cutoff threshold, 88 of 98 ASD children were positive (90% sensitivity) and 38 of 40 TD children were negative (95% specificity); varying the threshold gave an equal sensitivity and specificity of 95%. For Method-2 with the trained threshold, 84 of 98 ASD children were positive (86% sensitivity) and 36 of 40 TD children were negative (90% specificity); the equal sensitivity and specificity was 90%.
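The reported percentages follow directly from the counts, and the equal-sensitivity/specificity point can be found by sweeping the cutoff. The sketch below is illustrative only: the counts are those stated in the Results, while the `equal_sens_spec` helper, the toy scores, and the convention that a score at or above the cutoff counts as positive are assumptions made for the example.

```python
def rate(hits, total):
    """Return hits / total as a fraction."""
    return hits / total

# Counts as reported in the Results.
asd_total = 31 + 67  # children with autism from sites 1 and 2
td_total = 40        # typically developing children from site 3

method1 = {"sensitivity": rate(88, asd_total), "specificity": rate(38, td_total)}
method2 = {"sensitivity": rate(84, asd_total), "specificity": rate(36, td_total)}

for name, m in (("Method-1", method1), ("Method-2", method2)):
    print(f"{name}: sensitivity {m['sensitivity']:.0%}, "
          f"specificity {m['specificity']:.0%}")

def equal_sens_spec(asd_scores, td_scores):
    """Sweep cutoffs and return (cutoff, sensitivity, specificity) at the
    cutoff where sensitivity and specificity are closest to each other.

    Assumes higher scores mean higher risk and score >= cutoff is positive.
    """
    best = None
    for cutoff in sorted(set(asd_scores) | set(td_scores)):
        sens = sum(s >= cutoff for s in asd_scores) / len(asd_scores)
        spec = sum(s < cutoff for s in td_scores) / len(td_scores)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, cutoff, sens, spec)
    return best[1:]

# Hypothetical risk scores (not data from the study).
toy_asd = [0.9, 0.8, 0.7, 0.4]
toy_td = [0.1, 0.2, 0.3, 0.6]
toy_result = equal_sens_spec(toy_asd, toy_td)
print(toy_result)
```

With real per-child risk scores in place of the toy lists, the same sweep would locate the threshold at which the two rates coincide.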
Conclusions: The blind test confirmed performance of around 90% sensitivity and specificity on the third-party data, showing the great potential of the proposed method. The detailed features extracted from the audio recordings are discussed in relation to autism screening and are compared between the in-house data and the third-party data to guide further improvements.