Data Harmonization across Genetic Conditions Associated with Autism

A. Halladay1,2, H. Grosman3, J. Acampado4 and J. Tjernagel4, (1)Autism Science Foundation, New York, NY, (2)Rutgers University, Piscataway, NJ, (3)Seaver Autism Center, Department of Psychiatry, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, (4)Simons Foundation, New York, NY

Advances in genetic testing have resulted in the discovery of rare variants that increase the risk for autism spectrum disorder (ASD) and related neurodevelopmental disorders. Among individuals with these specific mutations, the penetrance of ASD is as high as 70%. Understanding ASD with a distinct genetic etiology is critical to developing interventions for both syndromic and idiopathic ASD. Families affected by these rare disorders have established Patient Advocacy Groups (PAGs) that have developed and maintained research registries that are crucial to this effort. The registries provide researchers with important data, help communicate information to families, and facilitate participation in research studies. These PAGs have come together to form a consortium, called AGENDA (Alliance for the Genetic Etiologies of Neurodevelopmental Disorders and Autism), to address common issues, including registry-based data.


The Autism Science Foundation and the Simons Foundation Variation in Individuals (Simons VIP) project sought to harmonize data across rare disorder registries and to improve access to and use of these registries, in order to understand commonalities and differences across different forms of autism.


Eight registries participated in this project and submitted full data dictionaries: the Autism Treatment Network, Interactive Autism Network, Dup15q Alliance, National Fragile X FORWARD, National Database for Autism Research, Phelan-McDermid Syndrome Foundation, Simons VIP and the Tuberous Sclerosis Alliance. Based on feedback from families, scientists and industry, questions were organized and categorized by domain. Domains included demographics, developmental history, pregnancy and birth history, and neurological or seizure history.


The information was itemized and made publicly available along with data use guidelines for each dataset. It can be found in a single spreadsheet via the AGENDA website at: https://www.gdaac.org/for-scientists. The comparison highlights questions within each topic that exist in a single registry as well as those that appear in similar form in multiple registries. Recommendations for wording and answer coding were made both for existing registries looking to update data formats and for new registries wishing to design useful data structures at the outset. Within these high-priority areas, 90 of the 1,277 questions collected were itemized, following a protocol that allows additional topic areas to be examined in the future.
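
As an illustration only, the kind of cross-registry comparison described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the project: the registry names, the example questions, and the helper names normalize and compare_registries are all hypothetical, and matching on normalized wording is a stand-in for the manual review that such a comparison would require in practice.

```python
import re
from collections import defaultdict


def normalize(question: str) -> str:
    """Lowercase and drop punctuation so near-identical wordings compare equal."""
    return re.sub(r"[^a-z0-9 ]+", "", question.lower()).strip()


def compare_registries(dictionaries):
    """Split questions into those asked by several registries and those asked by one.

    `dictionaries` maps a registry name to a list of (domain, question) pairs.
    Returns two dicts keyed by (domain, normalized question), each mapping to
    the sorted list of registries that include that question.
    """
    seen = defaultdict(set)
    for registry, items in dictionaries.items():
        for domain, question in items:
            seen[(domain, normalize(question))].add(registry)
    shared = {key: sorted(regs) for key, regs in seen.items() if len(regs) > 1}
    unique = {key: sorted(regs) for key, regs in seen.items() if len(regs) == 1}
    return shared, unique


# Hypothetical entries for illustration; not taken from any real registry.
example = {
    "Registry A": [
        ("demographics", "Date of birth?"),
        ("seizure history", "Has the participant ever had a seizure?"),
    ],
    "Registry B": [
        ("demographics", "Date of Birth"),
        ("pregnancy and birth history", "Gestational age at birth (weeks)"),
    ],
}

shared, unique = compare_registries(example)
```

In this sketch, the two differently punctuated "date of birth" questions are grouped as shared, while the seizure and gestational age questions are each reported as unique to one registry.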


One of the barriers to research across different forms of autism, both syndromic and idiopathic, is the variation in resources and diversity of data collected through research registries. This project sought to reduce these difficulties by highlighting and harmonizing questions within special topics of interest across different registries. Future directions include determining individual governance policies to facilitate a unified data access process. As more rare variants considered causative for autism are identified, ensuring consistent resources to support scientific research is necessary. Increased collaboration between research registries and Patient Advocacy Groups should be encouraged so that standardization of the data can begin at earlier stages of clinical research studies, mirroring the approach taken within clinical trial research and its alignment with CDISC (Clinical Data Interchange Standards Consortium).