Advancements in Automated Visual & Speech Analysis of ASD Symptom Domains

Tősér, Zoltán

Background: The enormous promise of emerging machine learning applications in clinical/research settings relies on the prioritization of both technology and clinical science. This exhibit presents advancements in our application of automated AI methods to detect and monitor social-communication behaviors associated with ASD and related neuropsychiatric conditions.

Objectives: To evaluate the feasibility (e.g., detection), data-processing requirements (e.g., time, variables), and clinical correlates of automated visual and speech analytic tools during clinical evaluations. This presentation expands the prior evaluation of eye gaze and facial expressions to additional domains, including gestures, repetitive behaviors, and speech.

Methods: Data are presented on 48 individuals with ASD (mean age=9.3 years, SD=3.1). Audio and video were recorded with off-the-shelf 2D cameras and microphones, and with Tobii Pro Glasses 2 worn by the examiner, during the ADOS-2 (Modules 1-3). Software estimated facial, body, and speech landmarks, identified patients, analyzed target behaviors, and generated detailed behavior metrics. For the purposes of this analysis, metrics included frequency, duration, initiation, responsiveness, and percentage of time. These metrics were collected for gaze, facial expressions of emotion, nodding, pitch, touching, and mouthing. For pitch, standard deviation was used as a metric of monotone, “robotic” speech. Additionally, data on recently added metrics, which include pitch, nodding, and sensory stimulation, were compared in age-matched, verbally fluent children with and without ASD (n_ASD=8; n_TD=8).
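The abstract does not describe how the software computes its metrics, so the following is only a minimal sketch of how frequency, duration, percentage of time, and pitch standard deviation could be derived. It assumes each detected behavior is available as non-overlapping (start, end) intervals in seconds and that pitch is a per-frame fundamental-frequency track with 0 marking unvoiced frames. The function names and input layout are hypothetical, not the software's actual interface.

```python
from statistics import stdev


def interval_metrics(events, session_seconds):
    """Summarize one behavior (e.g. touching) from detected intervals.

    events: list of (start, end) times in seconds, assumed non-overlapping.
    session_seconds: total length of the recorded session in seconds.
    """
    durations = [end - start for start, end in events]
    total = sum(durations)
    return {
        "frequency": len(events),                           # number of occurrences
        "total_duration": total,                            # seconds spent in the behavior
        "percent_time": 100.0 * total / session_seconds,    # share of the session
    }


def pitch_variability(f0_hz):
    """Sample standard deviation of pitch over voiced frames.

    f0_hz: per-frame fundamental frequency in Hz; 0 is assumed to mean unvoiced.
    Lower values would correspond to flatter, more monotone speech.
    """
    voiced = [f for f in f0_hz if f > 0]
    return stdev(voiced)
```

With two hypothetical touching episodes, `interval_metrics([(0, 2), (5, 8)], 50)` reports a frequency of 2, a total duration of 5 seconds, and 10% of session time.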

Results: Metrics derived from the visual and speech analysis software were associated with the corresponding item-level data on the ADOS-2. In the ASD sample, negative-valence emotions were moderately and significantly correlated with Facial Expressions Directed to Examiner (Spearman’s rho=-.393, p=.018). Several gaze metrics were significantly correlated with Unusual Eye Contact, including responsive gaze (Spearman’s rho=-.436, p=.011), patient looking at clinician (Spearman’s rho=-.457, p=.007), and initiated gaze (Spearman’s rho=-.419, p=.015). Comparisons with developmentally matched controls suggest that the machine learning methods identified atypical behaviors and that these behaviors distinguished the ASD sample from the matched TD sample. Specifically, the ASD group showed higher levels of repetitive behaviors, including touching (z=-2.17, p=.030), with a nonsignificant trend for mouthing (z=-1.852, p=.064). Of interest, results suggest reduced pitch range in the ASD sample relative to age-matched peers.
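For readers unfamiliar with the statistic reported above, Spearman's rho is the Pearson correlation of the rank-transformed values, with tied observations sharing their average rank. The sketch below illustrates that definition in plain Python; the helper names are hypothetical, and it is not the analysis code used in the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because ADOS-2 items are scored on a short ordinal scale, ties are common, which is why the average-rank handling matters in this setting.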

Conclusions: Computerized assessments are engaging and adaptive, and they provide nuanced output. Our results continue to support the promise of training machine learning models not only to identify behavioral signatures characteristic of ASD but also to measure novel phenotypic markers (e.g., pitch). In addition to capturing social communication skills, these results support the software’s capacity to capture repetitive behaviors. Overall, findings support the initial feasibility of a minimally intrusive protocol for automated analysis of core ASD symptoms, with potential to support identification, treatment monitoring, and biomarker investigations of neurodevelopmental disorders (NDDs).
