17438
Integrated Analyses of Genome Wide Association and Targeted Sequencing Data Identify Loss of Function and Noncoding Regulatory Rare Variants Contributing to Autism Spectrum Disorder

Friday, May 16, 2014: 4:30 PM
Marquis A (Marriott Marquis Atlanta)
A. J. Griswold(1), N. D. Dueker(1), D. Van Booven(1), J. A. Rantus(2), J. Jaworski(1), S. H. Slifer(1), M. A. Schmidt(1), W. F. Hulme(1), I. Konidari(1), P. L. Whitehead(1), S. M. Williams(3), R. Menon(4), M. L. Cuccaro(1), E. R. Martin(1), J. L. Haines(5), J. R. Gilbert(1), J. P. Hussman(6) and M. A. Pericak-Vance(1), (1)John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, (2)Hussman Institute for Human Genomics, University of Miami, Miami, FL, (3)Center for Human Genetics Research, Vanderbilt University, Nashville, TN, (4)Rollins School of Public Health, Emory University, Atlanta, GA, (5)Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, (6)Hussman Foundation, Ellicott City, MD
Background:  Genome-wide association studies (GWAS) and exome sequencing have found no single common or rare factor that accounts for the genetic risk for autism spectrum disorders (ASD). This leads to the hypothesis that genetic variants in hundreds of genes and nongenic loci contribute to ASD.  It is therefore essential to identify such risk variants by sequencing large cohorts for rare or low-frequency variants with potential functional significance to ASD.

Objectives:  To identify rare coding and noncoding ASD risk variants by integrating GWAS association results with custom targeted massively parallel sequencing of candidate regions.

Methods:  Candidate regions were chosen from GWAS Noise Reduction (GWAS-NR) analyses of two autism datasets (Hussman et al., 2011). We sequenced 17 Mb on the Illumina HiSeq2000 using a custom Agilent SureSelect probe set targeting exons, UTRs, conserved intronic areas, and regulatory regions of 681 GWAS-NR associated genes, together with conserved regions in 694 associated intergenic loci. Our cohort consisted of 2,112 ASD cases and 834 controls of white ethnicity as determined by EIGENSTRAT. Bioinformatic processing included alignment with BWA, genotype calling with the GATK UnifiedGenotyper, and annotation of coding variants with SeattleSeq137, PolyPhen2, and SIFT and of noncoding variants with the ENCODE, VISTA, and GENCODE databases.  We first identified rare (MAF≤0.01) loss-of-function (LOF) alterations (stop gains/losses, splice changes, and frameshifts) or genes with multiple damaging variants in the same individual, as well as noncoding variants that RegulomeDB predicted as likely to affect binding.  For LOF variants in ASD candidate genes, we determined the inheritance status of the variant by Sanger sequencing. We used the Sequence Kernel Association Test (SKAT) for gene-based and noncoding-element-based association testing between sets of rare variants and ASD.
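The rare LOF and multiple-hit filters described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Variant` record, the consequence labels, and the helper names are hypothetical, and it assumes annotation has already been reduced to one record per variant per individual. Whether the MAF threshold was also applied to the multiple-hit filter is not stated, so the sketch leaves it out.

```python
from collections import Counter
from dataclasses import dataclass

# Threshold taken from the Methods text (MAF <= 0.01).
MAF_THRESHOLD = 0.01

# Hypothetical labels standing in for the LOF categories named in the text
# (stop gains/losses, splice changes, frameshifts); real annotation tools
# use their own vocabularies.
LOF_CONSEQUENCES = {"stop_gained", "stop_lost", "splice_site", "frameshift"}


@dataclass(frozen=True)
class Variant:
    """One annotated variant call in one individual (illustrative schema)."""
    sample_id: str
    gene: str
    consequence: str
    maf: float
    damaging: bool = False  # e.g. flagged by a predictor such as PolyPhen2 or SIFT


def is_rare_lof(variant: Variant) -> bool:
    """True if the variant is rare and falls in an LOF consequence class."""
    return variant.maf <= MAF_THRESHOLD and variant.consequence in LOF_CONSEQUENCES


def multi_hit_genes(variants, min_hits: int = 2):
    """Return (sample_id, gene) pairs carrying at least min_hits damaging variants.

    Assumption: no MAF filter is applied here, since the text does not say.
    """
    counts = Counter(
        (v.sample_id, v.gene) for v in variants if v.damaging
    )
    return {key for key, n in counts.items() if n >= min_hits}


if __name__ == "__main__":
    # Toy records with made-up identifiers, for illustration only.
    calls = [
        Variant("sample1", "GENE_A", "stop_gained", 0.001),
        Variant("sample1", "GENE_B", "missense", 0.004, damaging=True),
        Variant("sample1", "GENE_B", "missense", 0.008, damaging=True),
        Variant("sample2", "GENE_A", "frameshift", 0.05),
    ]
    print([v for v in calls if is_rare_lof(v)])
    print(multi_hit_genes(calls))
```

Keeping the two filters as separate functions mirrors the "or" in the description: a record can qualify through either route independently.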

Results:  We identified 547,716 single nucleotide variants (SNVs) and 94,326 indels, including 29,734 coding variations and 2,653 rare LOF alterations. No overall difference existed between cases and controls in the number of variations or multiple-hit genes, but there was significant enrichment in cases when restricting to LOF variants in ASD candidate genes (p=0.003). Among these was a confirmed de novo premature stop in the candidate gene RBFOX1.  RegulomeDB predicted 55 case-unique noncoding variants to affect transcription factor binding motifs. Two were in the same glutamate receptor gene, GRIK4, altering SP1 and CTCF sites. Association testing identified only nominally significant associations between rare exonic variants and ASD in 30 genes, including the transcription factors ZNF24 (p=0.00192) and ZNF519 (p=0.00465), and nominal significance for rare SNVs in enhancer hs658 (p=0.07), which drives expression in the midbrain of mouse embryos.  Further analyses of subsets of variations are ongoing.

Conclusions:  The enrichment of LOF variants in ASD candidate genes corroborates literature from other ASD datasets, and the identification of a de novo LOF variant in RBFOX1 further establishes its importance as an ASD gene. As the first large-scale study of noncoding variation in ASD, this investigation adds noncoding variants to the compendium of ASD risk loci and implicates potential novel biological mechanisms contributing to ASD etiology.  Overall, this study highlights the integration of association and sequencing data to find rare ASD risk alleles.

See more of: Genetics