22102
MSSNG - Combining Open Science and 10,000 Whole Genome Sequences to Speed the Delivery of New Understandings in Autism

Friday, May 13, 2016: 11:30 AM-1:30 PM
Hall A (Baltimore Convention Center)
M. T. Pletcher, Autism Speaks, Boston, MA
Background:  Genetics has been successfully leveraged to provide critical insights into the molecular pathology and heterogeneity of Autism Spectrum Disorder (ASD).  Yet critical gaps remain in our understanding of the disorder, and these gaps prevent efficient diagnosis and treatment.  In an effort to speed the delivery of new discoveries and therapeutics, Autism Speaks, along with its key partners, Google and the Hospital for Sick Children, has initiated the MSSNG program.

Objectives:  Through MSSNG, we have endeavored to make available to a broad community of researchers, clinicians, diagnostic laboratories, and educators at least 10,000 whole genome sequences from members of families with ASD, the associated clinical and phenotypic data, and the tools necessary to fully explore and analyze those data.

Methods:  After sequencing is complete, raw genome sequences are placed on Google Cloud Platform, where they are reviewed for quality and processed through an annotation pipeline that runs on the Google Genomics API.  As soon as this process is completed and permission is received from the appropriate institutional review board or research ethics board, the annotated genomes are published for use by approved investigators.  Any investigator with a legitimate research question can gain access to the MSSNG database by applying through a process that is managed by an independent Data Access Compliance Office and is consistent with the donors’ informed consent.

Results:  Nearly 5,000 genomes have been completed to date, and all 10,000 genomes are anticipated to be available to approved researchers by the spring of 2016.  A track for the UCSC browser has been created so that even those who have not gone through the MSSNG access process can view the variants identified in the MSSNG dataset and their associated frequencies.  Although BAM files were not originally available for download, approved researchers can now download them to run custom analyses locally.  MSSNG has also provided a web-based interface that allows users to identify specific variants, generate gene-specific variant lists, conduct statistical analyses, and browse individual phenotype data.  Although the work is still at an early stage, investigation of the MSSNG dataset has led to important insights into the role of de novo mutations in ASD and has identified novel genetic variants as potential causative factors in the disorder.

Conclusions:  MSSNG provides an open and flexible environment to study the largest genomic resource of its kind.  It provides a critical reference dataset for genetic investigations in any number of unrelated diseases while powering efforts to define discrete but clinically meaningful subcategories of ASD.  Ultimately, the success of MSSNG will be determined by the value it brings to the families whose donations made MSSNG possible, through its impact on their journey with ASD and on their outcomes.

See more of: Genetics