Genome Sequencing of 2,064 Genomes from 516 Simplex Families with Autism

Saturday, May 13, 2017: 1:15 PM
Yerba Buena 10-14 (Marriott Marquis Hotel)
T. Turner1, B. P. Coe1, B. J. Nelson1, M. C. Zody2, F. Hormozdiari1, Z. N. Kronenberg1, S. A. McClymont3, P. A. Hook3, K. Hoekzema1, M. H. Duyzend1, A. Raja1,4, C. Baker1, R. Bernier5, A. S. McCallion3, R. B. Darnell2,6 and E. E. Eichler1,4, (1)Department of Genome Sciences, University of Washington, Seattle, WA, (2)New York Genome Center, New York, NY, (3)McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, (4)Howard Hughes Medical Institute, University of Washington, Seattle, WA, (5)University of Washington Autism Center, Seattle, WA, (6)Howard Hughes Medical Institute, Rockefeller University, New York, NY
Background: The Simons Simplex Collection contains ~2,500 simplex autism families and has been previously studied through microarray and whole exome sequencing approaches. These studies have identified de novo and inherited risk factors that contribute to ~20-30% of autism and primarily affect the coding sequence of the genome. The remaining genetic risk factors for autism are currently unknown.

Objectives:  To understand the genetic etiology of cases not explained by de novo gene-disruptive events, we hypothesize that these individuals with autism carry exonic variants that were missed by previous approaches and/or are enriched for variants in noncoding, regulatory DNA.

Methods:  We performed deep (30-fold) Illumina whole-genome sequencing (WGS) on 2,064 genomes from 516 simplex autism families negative for de novo likely gene-disruptive (LGD) mutations or large (>100 kbp) copy number variants (CNVs). Using a hybrid cloud/local compute analysis workflow, we processed all genomes in one month. We applied two SNV/indel callers and four CNV callers to generate a sensitive variant call set of 59 million SNVs/indels and 193 thousand unique CNVs. Extensive, orthogonal experimental validation was undertaken to determine the inheritance status of high-impact variants.
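The abstract does not say how the outputs of the callers were combined. One common way to favor sensitivity is to take the union of the call sets while recording which callers support each variant, and the sketch below illustrates that idea for SNVs/indels only. The caller names, the `(chrom, pos, ref, alt)` key, and the function `merge_call_sets` are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def merge_call_sets(
    calls_by_caller: Dict[str, Iterable[Variant]],
) -> Dict[Variant, Set[str]]:
    """Union variant calls across callers.

    Returns a mapping from each variant to the set of callers that reported it.
    Assumption: two calls are the same variant only if their keys match exactly.
    """
    support: Dict[Variant, Set[str]] = defaultdict(set)
    for caller, calls in calls_by_caller.items():
        for variant in calls:
            support[variant].add(caller)
    return dict(support)


if __name__ == "__main__":
    # Toy input with made-up coordinates, for illustration only.
    toy = {
        "caller_a": [("chr1", 100, "A", "G"), ("chr2", 5, "T", "C")],
        "caller_b": [("chr1", 100, "A", "G")],
    }
    for variant, callers in merge_call_sets(toy).items():
        print(variant, sorted(callers))
```

Exact-key matching is only reasonable for SNVs/indels; CNV calls from different callers rarely share identical breakpoints, so merging them would need an overlap-based rule that this sketch does not attempt.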

Results:  WGS analysis recovered ~25% more de novo exonic SNVs/indels and ~85% more gene-disrupting CNVs than previously discovered by whole-exome sequencing (WES) analysis of the same samples. This included de novo LGD events in GLIPR1L2, PHIP, PCM1, MED12L, and VARS and gene-breaking de novo CNVs in CHD2, DDX43, DMD, FANCA, LINC01347, LNPEP, MIR3129, MUC19, PCDHB17, PCDHB6, TAF1B, and ZNF462. For autism risk genes with known dosage sensitivity, we observed a significant enrichment (p=0.01) of de novo missense mutations in children with autism compared to their unaffected siblings. This included proband-specific events in UBE3C, PTEN, SUV420H1, CREBBP, LAMC3, GABRB3, SYNGAP1, NR3C2, SRCAP, TRIP12, UNC45B, SCN2A, POGZ, and TRIO. We also identified an enrichment of de novo SNVs/indels in 5' and 3' UTRs (p=0.02) and in transcription factor binding sites (p=0.03). We report a modest enrichment (p=0.04) of de novo and private disruptive mutations in putative regulatory elements of dosage-sensitive autism genes. We define these elements as fetal central nervous system (CNS) DNase I hypersensitive sites mapping within 50 kbp of the start and end of the candidate gene transcript. Functional testing of the regions affected by these events confirms that the CNVs are enriched in enhancers within the central nervous system.
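The regulatory-element definition above amounts to an interval query: keep DNase I hypersensitive sites that fall within 50 kbp of a candidate gene transcript. A minimal sketch follows, assuming half-open 0-based coordinates and assuming the window runs from 50 kbp upstream of the transcript start to 50 kbp downstream of its end, gene body included; the abstract does not specify either point. The names `Interval`, `regulatory_window`, and `candidate_sites` are hypothetical.

```python
from typing import List, NamedTuple

FLANK_BP = 50_000  # 50 kbp flank, as stated in the abstract


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)


def regulatory_window(transcript: Interval, flank: int = FLANK_BP) -> Interval:
    """Extend the transcript by `flank` bp on both sides, clamping at position 0."""
    return Interval(
        transcript.chrom,
        max(0, transcript.start - flank),
        transcript.end + flank,
    )


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def candidate_sites(
    transcript: Interval, dhs_sites: List[Interval], flank: int = FLANK_BP
) -> List[Interval]:
    """Return the hypersensitive sites that overlap the flanked transcript window."""
    window = regulatory_window(transcript, flank)
    return [site for site in dhs_sites if overlaps(site, window)]
```

A de novo or private variant would then count toward the enrichment test only if it disrupts one of the returned sites. A linear scan is enough for a sketch; a genome-wide analysis would more likely use an interval tree or a sorted sweep.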

Conclusions:  Overall, WGS provides additional insight into the genetic etiology of autism by significantly increasing the yield of gene-disrupting mutations and by providing access to noncoding portions of the genome that, when deleted, adversely affect the dosage of autism-risk genes during development.
