Scene Content Influences Dynamic Visual Scanning of Toddlers with and without ASD during Viewing of Naturalistic Videos

Thursday, May 15, 2014
Atrium Ballroom (Marriott Marquis Atlanta)
G. A. Marrinan1, S. Shultz2, A. Klin3 and W. Jones2, (1)Marcus Autism Center, Children's Healthcare of Atlanta & Emory University School of Medicine, Atlanta, GA, (2)Department of Pediatrics, Marcus Autism Center, Children's Healthcare of Atlanta, Emory University, Atlanta, GA, (3)Marcus Autism Center, Children's Healthcare of Atlanta and Emory University School of Medicine, Atlanta, GA
Background: When faced with complex, dynamically-unfolding social scenes, typically-developing (TD) toddlers effectively synchronize their viewing, converging on common locations more often than expected by chance. When viewing the same scenes, toddlers with ASD also show significant convergence, but the locations and timing of that convergence differ from those of TD children. Previous research examining what drives the visual attention of children with and without ASD has focused primarily on summary information about overall fixation to discrete regions-of-interest, such as eyes, mouths, bodies or objects, divorced from their dynamic social context. However, the importance and meaning of these regions-of-interest vary with the complex, ever-changing social interactions of which they are a part. In order to unpack how complex scene content drives visual attention, our laboratory has recently catalogued the occurrences of functionally meaningful social actions and physical elements within naturalistic videos. That investigation revealed that particular onscreen events, including emotional facial expressions and high amplitude vocalizations, were associated with significantly greater convergence among TD toddlers. However, the factors guiding visual attention in ASD remain unknown.

Objectives: This study examines whether the scene content that drives attention in TD viewers (e.g., facial expressions and vocalizations of varying affect) also guides, or fails to guide, visual convergence in toddlers with ASD.

Methods: Eye-tracking data were collected as TD toddlers (N = 44) and toddlers with ASD (N = 22) viewed naturalistic videos of peer interactions. Children were matched on chronological age and non-verbal function. We used kernel density estimation to quantify the level of convergence of visual scanning at each moment in time for each group separately. In parallel, we coded the occurrence of facial expressions (positive, neutral, and negative) and vocalizations (high and low amplitude) at each frame of the videos. We then used a generalized linear model (GLM) to investigate how well each coded onscreen behavior predicted the level of convergence of visual scanning in TD toddlers and toddlers with ASD.
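The abstract does not specify the exact convergence metric or model family, so the following Python sketch is purely illustrative of the analysis pipeline described above: per-frame gaze convergence estimated with a 2D kernel density, followed by a GLM relating convergence to coded onscreen events. The variable names, the simulated data, the peak-density convergence measure, and the Gaussian GLM family are all assumptions, not the authors' implementation.

```python
"""Illustrative sketch (not the authors' code) of KDE-based convergence
plus a GLM on coded onscreen events. All data here are simulated."""
import numpy as np
from scipy.stats import gaussian_kde
import statsmodels.api as sm

rng = np.random.default_rng(0)

# --- Simulated stand-ins for real eye-tracking data ------------------------
n_frames, n_subjects = 300, 44                 # e.g., the TD group
# gaze[frame, subject, :] = (x, y) fixation position on a 640x480 display
gaze = rng.uniform([0, 0], [640, 480], size=(n_frames, n_subjects, 2))
# Binary per-frame codes for onscreen events (hypothetical coding scheme)
events = {
    "neg_face":   rng.integers(0, 2, n_frames),
    "hi_amp_voc": rng.integers(0, 2, n_frames),
    "lo_amp_voc": rng.integers(0, 2, n_frames),
}

# --- Per-frame convergence via kernel density estimation -------------------
def frame_convergence(points):
    """Peak of the 2D gaze density: higher = viewers converge on one spot."""
    kde = gaussian_kde(points.T)               # expects shape (2, n_subjects)
    return kde(points.T).max()

convergence = np.array([frame_convergence(gaze[f]) for f in range(n_frames)])

# --- GLM: how well do the coded events predict convergence? ----------------
X = sm.add_constant(np.column_stack(list(events.values())))
model = sm.GLM(convergence, X, family=sm.families.Gaussian()).fit()
print(model.summary(xname=["const"] + list(events.keys())))
```

A full analysis would additionally need a chance-level baseline (e.g., a permutation distribution of convergence values) to decide at which frames convergence is statistically significant; that step is omitted from the sketch.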

Results: TD toddlers' viewing patterns converged significantly 58.13% of the time, while toddlers with ASD converged significantly 28.36% of the time. Results of the GLM revealed that negative facial expressions (t(42) = 2.04, p = 0.048), high amplitude vocalizations (t(42) = 10.37, p < 0.001), and low amplitude vocalizations (t(42) = 2.66, p = 0.011) predicted convergence among TD viewers. By contrast, only negative facial expressions (t(15) = 3.24, p = 0.005) and low amplitude vocalizations (t(15) = 3.93, p = 0.001) predicted convergence among viewers with ASD.

Conclusions: These findings provide an important step towards identifying the specific scene content that guides, or fails to guide, visual attention in viewers with ASD. Future analyses will examine visual convergence with respect to a wider range of coded onscreen events. In addition, we will examine the broader context within which these onscreen events occur, and each viewer's level of engagement with them, to further investigate what drives visual engagement in toddlers with ASD.