Talker Expectations: Top-Down Information Integration during Speech Perception in ASD

Thursday, May 11, 2017: 12:00 PM-1:40 PM
Golden Gate Ballroom (Marriott Marquis Hotel)
A. Hogstrom, J. J. Green, B. Castelluccio, A. R. Canfield, C. Irvine and I. M. Eigsti, Department of Psychological Sciences, University of Connecticut, Storrs, CT
Background: The acoustic realization of phonemes differs substantially between talkers[1]; talker normalization refers to the process of determining the appropriate talker-specific mapping from acoustics to phonological categories. A 2007 study[2] tested whether talker normalization involves top-down constraints or is purely signal-driven. Subjects heard words produced by synthetic talkers whose voices were identical except for a 10 Hz difference in fundamental frequency (F0). If subjects believed that they would hear two talkers, they were reliably slower in a word-monitoring task when the talker varied randomly from trial to trial than when trials were blocked by talker. In contrast, listeners who expected one talker showed no slowing, indicating that expectations sufficed to trigger talker normalization. Here, we asked whether individuals with ASD would exhibit this expectation effect, in light of evidence that such individuals have reduced susceptibility to top-down expectations[3].

Objectives: Assess the relative influences of talker variability and top-down expectations on speech processing in ASD.

Methods: We compared adolescents with typical development (TD) (n = 15) and ASD (n = 15), matched for age (M=15, range=12-17 years) and IQ (FSIQ>85). Using materials from Magnuson and Nusbaum (2007)[2], participants monitored a stream of auditory words for a variable target (e.g., ball, cave). Stimuli were monosyllabic words produced by synthetic “talkers”: male voices that were identical except for F0 (150 Hz or 160 Hz). Within single-talker blocks, all targets and distractors were produced by one talker; in mixed-talker blocks, the talker changed randomly from word to word. Expectation was manipulated between subjects: given identical stimuli, some subjects were told to expect one talker with variable pitch, and others were told to expect two talkers differing in pitch. Assignment to the two expectation conditions was counterbalanced by group.
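
The block structure described above can be illustrated with a minimal sketch. This is not the presentation software used in the study; the function name make_block, the talker labels "A" and "B", and every word other than "ball" and "cave" are hypothetical and serve only to show how single-talker and mixed-talker blocks differ.

```python
import random

# F0 values (Hz) of the two synthetic talkers, as given in the Methods.
# The labels "A" and "B" are illustrative, not from the study.
TALKER_F0_HZ = {"A": 150, "B": 160}


def make_block(words, mixed, rng, fixed_talker="A"):
    """Return a list of (word, talker) trials for one block.

    Single-talker block: every word is assigned to fixed_talker.
    Mixed-talker block: the talker is drawn at random for each word.
    """
    order = list(words)
    rng.shuffle(order)  # randomize word order within the block
    if mixed:
        talkers = sorted(TALKER_F0_HZ)
        return [(word, rng.choice(talkers)) for word in order]
    return [(word, fixed_talker) for word in order]


rng = random.Random(0)  # seeded so the sketch is reproducible
# "ball" and "cave" come from the abstract; the rest are placeholders.
words = ["ball", "cave", "tile", "done", "bear", "gate"]

single = make_block(words, mixed=False, rng=rng)
mixed = make_block(words, mixed=True, rng=rng)
```

In this sketch the stimuli are the same regardless of expectation condition; the expectation manipulation would live entirely in the instructions given to the participant, not in the trial list.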

Results: Analyses included age and NVIQ as covariates. Overall accuracy was high across groups (range in the ASD group, .80-.98; TD group, .84-.97), with a trend for lower accuracy in the ASD group, p=.05; accuracy was therefore included as a covariate in subsequent analyses. Reaction time (RT), a more sensitive index of cognitive processing or load, did not differ by group, p=.58. There was a near-significant Group by Expectation by Block interaction, F(1, 23)=3.76, p=.06. The TD group was slower for mixed-talker blocks when they expected two talkers; when they expected only a single talker, the difference in voices did not lead to slower RT (Fig 1). The ASD group did not show this effect. While expecting to hear two talkers rather than one was associated with slower RT in the ASD group (Fig 1), this effect may reflect individual RT differences in this small-n between-subjects study.

Conclusions: In TD individuals, expectations about the number of talkers influenced RT for mixed- versus single-talker blocks; this effect was less apparent in the ASD group. These results suggest that speech comprehension is more signal-driven and less influenced by top-down expectations in individuals with ASD. If this finding is replicated, it could illuminate some of the conversational deficits that characterize autism.