Examining Test-Retest Reliability of the Autism Diagnostic Observation Schedule (ADOS) Calibrated Severity Score (CSS)

Poster Presentation
Thursday, May 2, 2019: 5:30 PM-7:00 PM
Room: 710 (Palais des congres de Montreal)
Y. B. Choi (1), H. R. Thomas (1), D. Janvier (1), C. Lord (2) and S. H. Kim (3); (1) Psychiatry, Weill Cornell Medical College, White Plains, NY, (2) University of California Los Angeles, Los Angeles, CA, (3) Psychiatry, Center for Autism and the Developing Brain, White Plains, NY
Background: Calibrated severity scores (CSS; Gotham, Pickles & Lord, 2009) of the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 1999, 2012) provide a measure of autism symptom severity that is less influenced by developmental characteristics than the ADOS raw algorithm totals, and they allow a more valid comparison of scores across Modules. Separate CSS for the ADOS social affect (SA) and restricted and repetitive behavior (RRB) domains have also been created (Hus, Gotham & Lord, 2015). Test-retest reliability of the ADOS algorithm total scores was found to be strong in previous studies (Lord et al., 1989; Gotham et al., 2007; Brugha et al., 2012; McCrimmon & Rostad, 2013), and CSS were stable over time in a recent meta-analysis of 40 studies (Bieleninik et al., 2017). However, the test-retest reliability of the ADOS CSS has not been directly tested.

Objectives: We aim to examine the test-retest reliability of the ADOS CSS across all Modules of the ADOS.

Methods: Repeated ADOS assessments were gathered for all Modules: the Toddler Module included 302 observations from 75 children (48 ASD cases; Mean age=20 months, SD=4.7); Module 1 included 72 observations from 30 children (29 ASD; Mean age=37 months, SD=13.2); Module 2 included 88 observations from 31 children (26 ASD; Mean age=34 months, SD=10.9); Module 3 included 120 observations from 57 children (35 ASD; Mean age=8 years, SD=2.8); and Module 4 included 38 observations from 19 adolescents/adults (19 ASD; Mean age=21 years, SD=4.7). To account for the variability in developmental effects across age and language groups, the interval between test and retest observations differed by Module: on average, 2 months for the Toddler Module and Modules 1 and 2, 4 months for Module 3, and 8 months for Module 4. Absolute-agreement Intraclass Correlation Coefficients (ICCs) for test-retest reliability were calculated for CSS total, CSS SA and CSS RRB.
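
For readers unfamiliar with absolute-agreement ICCs, the following is a minimal sketch of one common form, the two-way, single-measurement ICC(A,1) of McGraw and Wong (1996). The abstract does not state which ICC model was used or how unequal numbers of observations per participant were handled, so the choice of ICC(A,1), the balanced test/retest layout, the function name and the example scores are all assumptions made for illustration only.

```python
import numpy as np


def icc_absolute_agreement(scores):
    """ICC(A,1): two-way, absolute agreement, single measurement.

    `scores` is an (n_participants x k_sessions) array with no missing
    values. This balanced layout is an assumption; it is not described
    in the abstract.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares for participants (rows), sessions (columns), residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_error = np.sum((x - grand_mean) ** 2) - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Session (column) variance stays in the denominator, so a systematic
    # shift between test and retest lowers the coefficient.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical test/retest CSS values (made up, not study data).
    example = [[4, 5], [7, 7], [9, 8], [2, 3], [6, 6]]
    print(round(icc_absolute_agreement(example), 3))
```

Unlike a consistency ICC, this form penalizes a uniform shift in scores between sessions, which is why it suits the question of whether repeated administrations yield the same CSS.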

Results: ICCs for all Modules fell in the moderate to high range. The overall CSS showed the highest ICCs, ranging from .71 (Module 2) to .87 (Toddler Module). The ICCs of CSS SA ranged from .64 (Module 4) to .88 (Toddler Module). The ICCs of CSS RRB were lower, but still in the moderate range, from .58 (Module 4) to .68 (Module 2). All values were significant (p < .05). ICCs for different age and/or language groups are reported for each Module in Table 1.

Conclusions: Our results demonstrate moderate to high test-retest reliability and low measurement error of the ADOS CSS across all Modules. This precision suggests minimal variability in CSS obtained from ADOS sessions repeated within relatively short periods of time. The results support the use of the ADOS CSS as a reliable and accurate tool for tracking trajectories of ASD symptoms over time in research and clinical settings.