Genome Sequencing of 2,064 Genomes from 516 Simplex Families with Autism
Objectives: To understand the genetic etiology of autism cases not explained by de novo gene-disruptive events, we tested the hypothesis that these individuals carry exonic variants previously missed by other approaches and/or are enriched for variants in noncoding, regulatory DNA.
Methods: We performed deep (30-fold) Illumina whole-genome sequencing (WGS) on 2,064 genomes from 516 simplex autism families negative for de novo likely gene-disruptive (LGD) mutations and large (>100 kbp) copy number variants (CNVs). Using a hybrid cloud/local compute analysis workflow, we processed all genomes in one month. We applied two SNV/indel callers and four CNV callers to generate a sensitive variant call set of 59 million SNVs/indels and 193 thousand unique CNVs. Extensive orthogonal experimental validation was undertaken to determine the inheritance status of high-impact variants.
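As an illustration of what determining inheritance status involves at the genotype level, the sketch below classifies a single variant in a child given both parents' genotypes. It is a minimal example under stated assumptions, not the workflow used in this study: the function name, the labels, and the encoding of genotypes as alternate-allele counts (0, 1, 2, or None for missing) are all hypothetical, and a real pipeline would also apply read-depth and quality filters before experimental validation.

```python
from typing import Optional


def classify_inheritance(
    child: Optional[int], mother: Optional[int], father: Optional[int]
) -> str:
    """Classify one variant from alternate-allele counts (0, 1, or 2).

    Hypothetical helper for illustration; None marks a missing genotype.
    """
    if child is None or mother is None or father is None:
        return "unknown"  # cannot assess inheritance without all three genotypes
    if child == 0:
        return "not_carried"  # the child does not carry the alternate allele
    if mother == 0 and father == 0:
        # Neither parent carries the allele: a de novo candidate that
        # still requires orthogonal validation.
        return "de_novo_candidate"
    if mother > 0 and father > 0:
        return "inherited_either_parent"  # transmitting parent is ambiguous
    return "maternal" if mother > 0 else "paternal"


if __name__ == "__main__":
    # A heterozygous child with two homozygous-reference parents.
    print(classify_inheritance(1, 0, 0))
```

The same function can be applied to the proband and to the unaffected sibling in each family, which is what makes a proband-versus-sibling comparison of de novo events possible.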
Results: WGS analysis recovered ~25% more de novo exonic SNVs/indels and ~85% more gene-disrupting CNVs than previously discovered by whole-exome sequencing (WES) analysis of the same samples. This included de novo LGD events in GLIPR1L2, PHIP, PCM1, MED12L, and VARS and gene-breaking de novo CNVs in CHD2, DDX43, DMD, FANCA, LINC01347, LNPEP, MIR3129, MUC19, PCDHB17, PCDHB6, TAF1B, and ZNF462. We observed a significant enrichment (p=0.01) of de novo missense mutations in children with autism compared to their unaffected siblings in autism risk genes with known dosage sensitivity. This included proband-specific events in UBE3C, PTEN, SUV420H1, CREBBP, LAMC3, GABRB3, SYNGAP1, NR3C2, SRCAP, TRIP12, UNC45B, SCN2A, POGZ, and TRIO. We also identified an enrichment of de novo SNVs/indels in 5' and 3' UTRs (p=0.02) and in transcription factor binding sites (p=0.03). We report a modest enrichment (p=0.04) of de novo and private disruptive mutations in putative regulatory elements of dosage-sensitive autism genes. We define these elements as fetal central nervous system (CNS) DNase I hypersensitive sites mapping within 50 kbp of the start and end of the candidate gene transcript. Functional testing of the regions affected by these events confirms that the CNVs are enriched for CNS enhancers.
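The 50 kbp definition of a putative regulatory element can be sketched as a simple interval test. The code below is one possible reading, not the authors' implementation: it assumes the window runs from 50 kbp upstream of the transcript start to 50 kbp downstream of the transcript end (transcript body included), uses 0-based half-open coordinates, and all names (Interval, regulatory_window, is_candidate_site) are hypothetical.

```python
from typing import NamedTuple

FLANK_BP = 50_000  # 50 kbp flank, as stated in the abstract


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def regulatory_window(transcript: Interval, flank: int = FLANK_BP) -> Interval:
    """Extend a transcript by `flank` bp on each side, clamped at position 0."""
    return Interval(
        transcript.chrom,
        max(0, transcript.start - flank),
        transcript.end + flank,
    )


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def is_candidate_site(
    site: Interval, transcript: Interval, flank: int = FLANK_BP
) -> bool:
    """True if a DNase I hypersensitive site falls in the flanked window."""
    return overlaps(site, regulatory_window(transcript, flank))


if __name__ == "__main__":
    gene = Interval("chr1", 100_000, 200_000)  # made-up coordinates
    site = Interval("chr1", 60_000, 60_500)    # made-up hypersensitive site
    print(is_candidate_site(site, gene))
```

A variant would then count toward the regulatory burden of a gene only if it disrupts a site for which this test is true; whether the transcript body itself belongs in the window is a design choice the abstract does not specify.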
Conclusions: Overall, WGS provides additional insight into the genetic etiology of autism by significantly increasing the yield of gene-disrupting mutations and by providing access to noncoding portions of the genome that, when deleted, adversely affect the dosage of autism risk genes during development.