Role of De Novo Intronic Indels in Autism

Saturday, May 13, 2017: 1:51 PM
Yerba Buena 10-14 (Marriott Marquis Hotel)
A. Munoz Jimenez, B. Yamrom, Y. H. Lee, P. Andrews, S. Marks, Z. Wang, M. Wigler and I. Iossifov, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Background: Over the past several years, analysis of whole-exome sequencing and microarray hybridization data from family collections such as the Simons Simplex Collection (SSC) has increased our understanding of the genetic architecture of autism. Contributions from de novo (DN) mutations and from rare and common variants have been established, but most progress has been made in the study of DN variants. DN germline likely gene-disrupting (LGD), missense and copy number variants have been estimated to jointly contribute to approximately 30 percent of autism cases in simplex families (Iossifov et al., 2014). The DN mutations observed in the ~5,000 affected children enabled the identification of lists of a few hundred genes that, with high confidence, are involved in autism’s etiology (autism genes). For example, half of the 546 genes targeted by published DN LGD mutations (LGD targets) are expected to be true autism genes.

Objectives: Whole-genome sequencing data sets were recently generated from the SSC, and we sought to determine whether the additional types of DN variants that can be detected from whole-genome sequencing data, including noncoding variants and complex structural rearrangements, also contribute to autism. In this study, we specifically explored the contribution of DN noncoding mutations by focusing on the introns of the autism genes identified by exome sequencing.

Methods: We analyzed whole-genome data generated from the father, mother, a child with autism and an unaffected child in each of 510 SSC families, chosen to have no DN LGDs or CNVs in the exomes of the affected children. The 150bp paired-end dataset was generated at the New York Genome Center with an average depth of 30x. About 20,000 de novo intronic substitutions (DIS) and 2,000 de novo intronic indels (DII) were identified using our multinomial genotyper with stringent cutoffs. We determined that the rate of false positives in this set is <5% and that the rates of such de novo events were in agreement with published rates.

Results: Across all introns, there was no significant difference in the numbers of DIS or DII between the affected and unaffected children. In the introns of the 546 LGD target genes, however, we identified 63 DII in the 510 affected children but only 37 in the 510 unaffected children. The difference of 26 events is significantly larger than 0 (p-value of 0.01). The significance increases if we restrict to the half of the LGD targets that are more intolerant to damaging variants in the human population and are expected to be even more enriched for autism genes (p-value of 0.005), while the difference barely shrinks (from 26 to 23). There was no significant difference in the number of DIS in any of these sets, and there was no significant difference in the numbers of DII or DIS in gene targets of DN missense or DN synonymous mutations.
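The abstract does not state which test produced the reported p-values. As a rough illustration only, the comparison of 63 versus 37 events can be sketched with an exact two-sided binomial test, assuming that under the null hypothesis each DII in an LGD target intron is equally likely to fall in an affected or an unaffected child. The function name and this choice of test are assumptions for illustration, not the authors' method.

```python
from math import comb


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k successes in n trials with p = 0.5.

    Hypothetical helper for illustration; the abstract does not say
    which test the authors used.
    """
    # With p = 0.5 the distribution is symmetric, so the two-sided
    # p-value is twice the smaller tail probability, capped at 1.
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2**n
    return min(1.0, 2 * min(upper, lower))


# Counts reported in the Results for introns of the 546 LGD target genes.
affected_dii = 63
unaffected_dii = 37

total = affected_dii + unaffected_dii
delta = affected_dii - unaffected_dii
p_value = two_sided_binomial_p(affected_dii, total)

print(f"difference = {delta}, two-sided binomial p = {p_value:.3f}")
```

Under this assumed null the result is of the same order as the reported p-value of 0.01, though the authors' actual procedure may differ (for example, a permutation test over families).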

Conclusions: From the observed increase in DII rates in children with autism, we estimated that DN intronic variants might contribute to approximately 20% of autism cases in simplex families.
