Voice Patterns in Adult English Speakers with Autism Spectrum Disorder
Individuals with Autism Spectrum Disorder (ASD) often display atypical modulation of speech, described as awkward, monotone, or sing-songy (Shriberg et al., 2001). These patterns are a robust signal of social communication deficit (Paul et al., 2005) and contribute to reaching a diagnosis of ASD. Using Recurrence Quantification Analysis of acoustic features, Fusaroli et al. (2013; IMFAR 2014) identified voice patterns characteristic of adult Danish speakers with Asperger’s syndrome and trained machine-learning algorithms that discriminated autistic from non-autistic speakers with 80-86% accuracy in both adult and child Danish speakers.
Our first aim was to replicate the results obtained by Fusaroli et al. (2013, 2014) in a sample of English speakers, i.e. (1) characterise the speech patterns of adults with ASD and (2) employ the results in a supervised machine-learning process to determine whether acoustic features predict diagnostic status and severity of the symptoms.
In addition, we wanted to evaluate how well the model built on Danish data would transfer to English data, i.e. which parameters depended on the speakers’ language and which did not.
A previously published study of memory in ASD (Maras et al., 2013) provided audio recordings of 17 adults with ASD and 17 matched Typically Developing (TD) adults attempting to recall details of a standardised event they had participated in. Transcripts were time-coded, and pitch (F0), speech-pause sequences and speech rate were automatically extracted. We conducted traditional statistical analyses on each prosodic feature. We then extracted non-linear measures of recurrence: treating voice as a dynamical system, we reconstructed its phase space and measured the number, duration and structure of repeated trajectories in that space (Marwan et al., 2007). The resulting measures were used to train a linear discriminant function algorithm to classify the descriptions as belonging to either the ASD or the TD group. The model was developed and tested using 1000 iterations of 10-fold cross-validation (to test the generalizability of the accuracy) and variational Bayesian mixed-effects inference (to compensate for biases in sample sizes).
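The recurrence step can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names, the embedding dimension, the delay, the radius and the minimum line length are all hypothetical choices made for illustration, since the abstract does not report them. The sketch embeds a scalar series (for example a pitch track) in a reconstructed phase space, marks pairs of states that fall within a fixed radius of each other, and summarises the diagonal line structures that correspond to repeated trajectories.

```python
import math


def embed(series, dim, delay):
    """Time-delay embedding: map a scalar series to points in a dim-dimensional space."""
    n = len(series) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this embedding")
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]


def recurrence_matrix(points, radius):
    """rec[i][j] is True when states i and j are within `radius` (Euclidean distance)."""
    n = len(points)
    return [[math.dist(points[i], points[j]) <= radius for j in range(n)]
            for i in range(n)]


def diagonal_line_lengths(rec):
    """Lengths of runs of recurrent points along each diagonal above the main one."""
    n = len(rec)
    lengths = []
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if rec[i][i + offset]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def rqa(series, dim=2, delay=1, radius=0.1, min_line=2):
    """Basic recurrence measures; all default parameter values are illustrative assumptions."""
    points = embed(series, dim, delay)
    rec = recurrence_matrix(points, radius)
    n = len(points)
    lengths = diagonal_line_lengths(rec)
    recurrent = sum(lengths)              # recurrent points in the upper triangle
    pairs = n * (n - 1) // 2              # possible pairs of distinct states
    lines = [length for length in lengths if length >= min_line]
    in_lines = sum(lines)                 # recurrent points that belong to a line
    return {
        "recurrence_rate": recurrent / pairs if pairs else 0.0,
        "determinism": in_lines / recurrent if recurrent else 0.0,
        "mean_line": in_lines / len(lines) if lines else 0.0,
        "max_line": max(lines, default=0),
    }


if __name__ == "__main__":
    # Toy input: a strictly alternating series standing in for a highly regular signal.
    print(rqa([0.0, 1.0] * 10))
```

In this sketch, the recurrence rate reflects the number of repeated states, while determinism and the line lengths reflect how often and for how long whole trajectories repeat. Measures of this kind, computed per recording, are the sort of features that could then be passed to a classifier.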
Preliminary analysis of a sample of English speakers suggests results similar to those obtained in a Danish population: individuals with ASD produce highly regular speech patterns organized in short, frequently repeated sequences (200-400 ms), which supports clinical reports of monotony. While the features are similar across Danish and English, the coefficients discriminating individuals with ASD from controls need to be re-trained.
The current data suggest that adults with ASD produce highly regular patterns of speech (as measured by pitch and pause distribution). Importantly, this provides a quantifiable measure that captures some of the clinical observations which contribute to reaching a diagnosis of autism. Further analysis will establish whether voice patterns can serve as a tool in the diagnostic process or for following language development in an individual, and whether they reliably distinguish autistic-like from non-autistic-like speech.