ERPs to Congruent Pictures and Sentences in Minimally Verbal Children and Adolescents with ASD

Poster Presentation
Saturday, May 12, 2018: 11:30 AM-1:30 PM
Hall Grote Zaal (de Doelen ICC Rotterdam)
E. Zane1, S. Kresh2, A. Benasich3 and V. L. Shafer4, (1)FACE Lab, Emerson College, Boston, MA, (2)CUNY Graduate Center, New York, NY, (3)Center for Molecular & Behavioral Neuroscience, Rutgers Univ., Newark, NJ, (4)Speech-Language-Hearing Sciences, CUNY Graduate Center, New York, NY

Background:

Individuals with Autism Spectrum Disorder (ASD) are believed to struggle with multimodal processing and integration (Brandwein et al., 2013). Such deficits may contribute to reduced speech discrimination in ASD (Smith and Bennetto, 2007; Stevenson et al., 2014), since listeners must rely on simultaneous audio-visual (AV) cues to differentiate speech sounds in noisy environments (Sumby and Pollack, 1954).

Not only do lower-order language processes, such as phoneme differentiation, require multimodal integration; so can higher-order processes, like semantic interpretation. When a sentence is spoken in the real world, the information it represents often refers to objects, characters, and events present in the visible environment. An individual who cannot simultaneously process AV information may struggle to connect linguistic information to her environment, and may miss opportunities to assign visual referents to linguistic symbols when first acquiring language (Gleitman and Gleitman, 1992; Siskind, 1992).


Objectives:

To use event-related potentials (ERPs) to determine how minimally verbal (MV) children and adolescents with ASD process auditory sentences while viewing contextually congruent photographs.


Methods:

We recruited 9 MV children and adolescents with ASD (3 F; mean age = 7;4) and 9 age-matched neurotypical (NT) peers (2 F; mean age = 7;4). We also recruited one adolescent with ASD with preserved language/cognitive abilities (VASD) and his NT twin (age = 12;9). We showed these 20 participants 224 photographs of animals performing an action (e.g., a frog jumping); 300 ms after each photograph appeared, a congruent sentence was played (e.g., "the frog jumps"). Photographs remained on-screen while the sentences played, and ERPs were time-locked to photograph onset.

Averaged activity at posterior electrodes between 400 and 600 ms post-picture onset was used to explore group differences in the late posterior positivity (LPP), an ERP component implicated in the processing of complex pictures (Ferrari et al., 2008). Paired t-tests compared LPP amplitudes in participants with ASD to those of their NT matches.
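This windowed-mean analysis can be sketched as follows. This is a minimal Python illustration on synthetic data: the array shapes, sampling rate, electrode count, and effect size are illustrative assumptions, not the study's actual recording parameters or pipeline.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)

# Synthetic epochs: (participants, posterior electrodes, samples) at an
# assumed 1000 Hz, spanning 0-800 ms after picture onset.
n_subj, n_elec, n_samp = 9, 4, 800
nt_epochs = rng.normal(0.0, 1.0, (n_subj, n_elec, n_samp))
asd_epochs = rng.normal(0.0, 1.0, (n_subj, n_elec, n_samp))
nt_epochs[:, :, 400:600] += 3.0  # simulate an LPP in the NT group only

def mean_amplitude(epochs, t0_ms=400, t1_ms=600, fs=1000):
    """Mean voltage across posterior electrodes in the analysis window,
    one value per participant."""
    i0 = t0_ms * fs // 1000
    i1 = t1_ms * fs // 1000
    return epochs[:, :, i0:i1].mean(axis=(1, 2))

# Paired t-test: each participant with ASD against their age-matched NT peer.
t_stat, p_value = ttest_rel(mean_amplitude(asd_epochs),
                            mean_amplitude(nt_epochs))
print(f"t = {t_stat:.2f}, p = {p_value:.2g}")
```

Pairing participants (rather than an independent-samples test) is what makes the age-matching informative: each comparison controls for the developmental stage of the matched pair.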


Results:

An LPP was elicited in all NT participants but in only one participant with ASD (p < 0.001). The participant with VASD did not show an LPP.


Conclusions:

NT individuals show large LPP responses to the photographs, suggesting that they attend to the photographs while listening to the sentences. The amplitude and duration of these LPPs are similar to the posterior positivities elicited by matching AV information in previous studies (Molholm et al., 2004), indicating that the congruency of the AV information in our study may have amplified the LPP in NT individuals.

The vast majority of our ASD cohort showed no LPP. Most showed obligatory visual responses, indicating that they initially processed the photographs similarly to the NT group; the absence of LPPs suggests that they stopped attending to the photographs once they encountered auditory information. Because the adolescent with VASD also showed this pattern, the response might be particular to ASD rather than dependent on language ability; future research should test this possibility. Our result accords with speech-perception research showing that individuals with ASD predominantly focus on auditory information when perceiving simultaneous audio-visual cues (Brandwein et al., 2013). These results have far-reaching implications for the on-line processing and acquisition of language meaning in ASD.