Whole Genome Sequencing in Autism

Scherer, Stephen W.

Background:

Autism Spectrum Disorder (ASD) is heterogeneous, both phenotypically and in its genetic architecture. Hundreds of genes have now been associated with ASD, with risk contributed by multiple types of rare and common genome-wide variation. Some individuals carry a single rare, penetrant gene alteration (de novo or inherited). Others carry multiple variants, and for these and other individuals with ASD, a whole host of (poly)genic risk factors may be involved.

Objectives:

Large-scale projects worldwide have adopted whole genome sequencing (WGS) to study thousands of families from ASD cohorts and biobanks (e.g. AGRE, SFARI). The goal of this research is to decode entire genome sequences including all their genetic variants, link them to available phenotype data, and make these massive genomic/phenotypic datasets available for scientific study. Previously, finding the complete spectrum of variants required the incremental technologies of karyotyping, microarray, panel sequencing and exome sequencing. With WGS, an experiment costing about US$1000 can complete the task in a single comprehensive step.

Methods:

WGS is usually performed on DNA from whole blood (or sometimes other tissues), typically using ‘short-read’ sequencing technologies. Other useful data are arising from ‘long-read’ technologies. Bioinformatic pipelines are applied to raw sequence data to enable the robust identification of constitutional single nucleotide variants (SNVs), small insertions/deletions (indels), copy number variants (CNVs), structural variants (SVs; including short repeats), and mitochondrial variants. New algorithms further differentiate mutations occurring somatically from those present in all cells. Following approval by a Data Access Committee, primary and processed genomic data are then placed into different cloud- and web-based formats, to be accessible to the community.
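
As a rough illustration of the variant classes named above, the following minimal sketch labels a simple REF/ALT allele pair by size. It is not the pipeline used in any of the studies described here: the function name classify_variant is hypothetical, and the 50 bp boundary between indels and SVs is a commonly used convention adopted here as an assumption. Real pipelines rely on dedicated callers and read-level evidence, particularly for CNVs, complex SVs, and repeats.

```python
def classify_variant(ref: str, alt: str, sv_min_len: int = 50) -> str:
    """Label a simple REF/ALT allele pair by size (illustrative only).

    sv_min_len is an assumed threshold: length changes of at least this
    many bases are labeled "SV"; smaller changes are labeled "indel".
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"  # single-base substitution
    size = abs(len(ref) - len(alt))
    if size == 0:
        return "MNV"  # same-length multi-nucleotide substitution
    if size >= sv_min_len:
        return "SV"  # large insertion or deletion
    return "indel"  # small insertion or deletion


if __name__ == "__main__":
    examples = [("A", "G"), ("AT", "A"), ("A", "A" + "T" * 60)]
    for ref, alt in examples:
        print(len(ref), len(alt), classify_variant(ref, alt))
```

A size-based label like this is only a first pass; copy number and mitochondrial variants are not distinguishable from allele lengths alone.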

Results:

WGS has identified new variants in protein-coding and non-coding (e.g., lncRNA, regulatory) regions, in or near genes missed by other technologies. A more complete view of the entire genome can increase the yield of findings relevant for ASD or its co-morbidities and provide context for their interpretation. Smaller CNVs and more complex SVs, often missed by other technologies, also contribute significantly to the etiology of ASD. Approximately 100 genes and CNV regions currently have value for testing in ASD diagnostics. Based on data consistent across studies, we find that genes functioning in synaptic and neural adhesion, neural transcriptional regulation, and RNA processing are implicated in ASD, offering new entry points for drug development. Hundreds of scientists from around the world are using WGS data to enable their research studies.

Conclusion:

WGS data with accompanying phenotype information can greatly enable basic and clinical research in ASD. As costs continue to decrease and WGS eventually moves into the diagnostic setting (supplanting microarray and exome testing), thousands of additional genomes will become available for comparative analysis. The presentation will provide an overview of the results of the major published studies, the data and resources available, and the most significant scientific advances, all based on WGS.

33057 Whole Genome Sequencing in Autism
