2013 International Meeting for Autism Research: Identifying Unexpected and Inappropriate Words in ASD Language Samples

Thursday, 2 May 2013: 14:00-18:00

Banquet Hall (Kursaal Centre)

14:00

E. T. Prud'hommeaux, M. Rouhizadeh, B. Roark and J. van Santen, Center for Spoken Language Understanding, Oregon Health & Science University, Beaverton, OR

Background: Idiosyncratic and atypical language is included in many diagnostic instruments, including the ADOS, ADI-R, and SCQ, as a diagnostic marker for autism spectrum disorder (ASD). Judgments of atypicality, however, rely primarily on impressionistic, real-time evaluation of language at the discourse level, which can lead to poor reliability across examiners and subjects. In this study, we use both manual assessment and automated language analysis of speech transcripts of narratives to identify instances of atypical and inappropriate language at the lexical level. We then use this information to distinguish children with typical development (TD) from children with ASD.

Objectives: The objectives of this work are the following: (1) to establish that children with ASD use unexpected or inappropriate words in their narratives; (2) to investigate methods of identifying such words automatically and objectively using existing language analysis technology; and (3) to determine whether these automated methods are an adequate substitute for manual analysis for distinguishing children with ASD from their TD peers.

Methods: Participants in this study included 37 children with TD and 21 children with ASD, who were diagnosed via clinical consensus according to the DSM-IV criteria and the established threshold scores on the ADOS and the SCQ. There were no significant between-group differences in age (mean=6.4) or full-scale IQ (mean=114). The Narrative Memory subtest of the NEPSY, in which a child hears a brief story and must retell the story to the examiner, was administered to each child. Each retelling was recorded and then transcribed. Two annotators, blind to diagnosis, identified every word in each retelling transcript that was unexpected or inappropriate given the context of the story. In order to identify such words automatically, we then calculated for each word the tf-idf score, which is based on a comparison between the frequency of that word in the child's retelling and the frequency of that word in a large corpus of retellings collected from neurotypical adults. A high tf-idf score indicates that a word is very unlikely or unexpected in that particular context.
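As a rough illustration of the scoring step described above, the following Python sketch computes a tf-idf score for each word in one retelling against a reference corpus of retellings. The abstract does not give the exact formula, so the smoothed idf variant, the function names `tfidf_scores` and `unexpected_words`, and the threshold parameter are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_scores(retelling_tokens, reference_docs):
    """Score each word in one retelling against a reference corpus.

    retelling_tokens: list of lowercased word tokens from a child's retelling.
    reference_docs: list of token lists, one per reference retelling
                    (e.g. retellings collected from neurotypical adults).
    The add-one smoothed idf used here is an assumption; the source does
    not specify which tf-idf variant was used.
    """
    if not retelling_tokens:
        return {}

    n_docs = len(reference_docs)

    # Document frequency: number of reference retellings containing each word.
    doc_freq = Counter()
    for doc in reference_docs:
        doc_freq.update(set(doc))

    counts = Counter(retelling_tokens)
    total = len(retelling_tokens)

    scores = {}
    for word, count in counts.items():
        tf = count / total
        # A word found in every reference retelling gets idf = 0;
        # a word never seen in the reference corpus gets the largest idf.
        idf = math.log((n_docs + 1) / (doc_freq[word] + 1))
        scores[word] = tf * idf
    return scores


def unexpected_words(scores, threshold):
    """Return the words whose score meets a (hypothetical) cutoff."""
    return {word for word, score in scores.items() if score >= threshold}


# Toy data, invented for illustration only.
reference = [
    ["the", "boy", "climbed", "the", "tree"],
    ["the", "boy", "fell"],
    ["the", "tree", "was", "tall"],
]
child = ["the", "dinosaur", "climbed", "the", "tree"]
scores = tfidf_scores(child, reference)
flagged = unexpected_words(scores, threshold=0.2)
```

In this toy example, a word absent from every reference retelling scores higher than words that the reference speakers also used, which is the property the Methods paragraph relies on.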

Results: First, we found that children with ASD do, in fact, produce significantly more manually identified unexpected and inappropriate words in their narrative retellings (p < 0.05). Second, the set of words selected as unexpected using the automatic tf-idf score corresponds very well with the set of words manually identified as unexpected (precision=84%, recall=53%). Finally, using the set of automatically identified words, we found again that children with ASD produce significantly more unexpected words than children with TD (p < 0.05).
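For readers unfamiliar with the precision and recall figures reported above, the sketch below shows how the two measures are computed when a set of automatically flagged words is compared with a set of manually annotated words. The function name `precision_recall` and the `(transcript_id, token_index)` representation of a word instance are hypothetical choices for this example; the data are invented and unrelated to the study's results.

```python
def precision_recall(predicted, gold):
    """Compare automatically flagged items against manual annotations.

    predicted: iterable of items flagged by the automatic method.
    gold: iterable of items flagged by the human annotators.
    Returns (precision, recall) as fractions between 0 and 1.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of automatically flagged items that annotators also flagged.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of annotator-flagged items that the automatic method found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Each item is a hypothetical (transcript_id, token_index) pair.
auto_flagged = {("child01", 4), ("child01", 9)}
manual_flagged = {("child01", 4), ("child02", 2), ("child02", 7)}
precision, recall = precision_recall(auto_flagged, manual_flagged)
```

High precision with lower recall, as reported in the Results, means that most automatically flagged words were also flagged by the annotators, while a sizable share of the manually flagged words was missed.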

Conclusions: These results demonstrate that identifying specific instances of atypical language at the lexical level can reveal patterns of language use that are characteristic of ASD. The word likelihood score presented here captures these patterns accurately enough to allow for automated analysis of language that can serve as a proxy for manual identification of unexpected words. This work underscores the potential of automated techniques for improving our understanding of the linguistic features associated with ASD.

See more of: Core Deficits I
See more of: Core Deficits
See more of: Symptoms, Diagnosis & Phenotype
