22377
Whole Genome Sequencing and Identical By Descent Filtering of Autism Spectrum Disorder Extended Families Reveals Novel ASD Risk Variants
Objectives: Our study applies whole genome sequencing (WGS) to extended, multiplex families with at least two affected cousins likely to carry rare, partially penetrant inherited alterations. We hypothesized that identical by descent (IBD) filtering in these large, multiplex pedigrees would define genomic regions of shared ASD risk and allow identification of coding variants missed by exome sequencing, functional variants in the 98% of the genome that is noncoding, and structural variation, thereby revealing potential new ASD loci.
Methods: We performed WGS on at least two affected cousins in each of six ASD extended families (15 individuals in total). Sequencing was performed on the Illumina HiSeq 2500, and data were processed through a pipeline comprising BWA-MEM alignment, base quality recalibration with GATK, and variant calling with the GATK HaplotypeCaller. Structural variants (SVs) were called with the SWAN algorithm. Variants were annotated with ANNOVAR, including functional predictions for noncoding variants (GWAVA, CADD, FATHMM-MKL). We determined IBD regions from whole genome genotyping data using the MERLIN package and used these regions to filter variants within each family. Variants were prioritized by sharing among all affected individuals in a family, rarity in the population (< 1%), and evidence of functionality from computational predictions.
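The per-family prioritization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the region tuples are hypothetical, IBD regions are assumed to be already available as (chromosome, start, end) intervals, and the functional-prediction step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical record; field names are illustrative only.
    chrom: str
    pos: int
    ref: str
    alt: str
    pop_freq: float        # population allele frequency
    carriers: frozenset    # IDs of sequenced individuals carrying the variant


def in_ibd_region(variant, ibd_regions):
    """Return True if the variant lies inside any (chrom, start, end) interval."""
    return any(
        chrom == variant.chrom and start <= variant.pos <= end
        for chrom, start, end in ibd_regions
    )


def filter_family_variants(variants, affected_ids, ibd_regions, max_freq=0.01):
    """Keep variants that are in an IBD region, carried by every affected
    individual in the family, and rarer than max_freq in the population."""
    affected = frozenset(affected_ids)
    return [
        v for v in variants
        if in_ibd_region(v, ibd_regions)
        and affected <= v.carriers      # shared by all affected individuals
        and v.pop_freq < max_freq       # rare (< 1% by default)
    ]


# Toy data for one family with two affected cousins (all values invented).
ibd_regions = [("chr12", 1000, 2000)]
variants = [
    Variant("chr12", 1500, "G", "A", 0.001, frozenset({"A1", "A2"})),  # passes
    Variant("chr12", 1600, "C", "T", 0.200, frozenset({"A1", "A2"})),  # too common
    Variant("chr12", 1700, "T", "C", 0.001, frozenset({"A1"})),        # not shared
    Variant("chr3", 500, "A", "G", 0.001, frozenset({"A1", "A2"})),    # outside IBD
]
kept = filter_family_variants(variants, ["A1", "A2"], ibd_regions)
print([(v.chrom, v.pos) for v in kept])
```

In this sketch the three criteria are applied as hard filters; a ranking by functional prediction scores could be layered on top of the surviving variants.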
Results: We sequenced each genome to ~40X coverage and identified more than four million single nucleotide variants (SNVs) and small indels and more than 100 SVs per individual. IBD and sharing filtering within each family limited the total number of SNVs for analysis to between 289 and 655,355, depending on family structure. Among coding SNVs, ~94% concordance was found with existing whole exome sequencing (WES) data (Cukier et al., 2014); however, WGS identified ~10% more coding variant calls. These include a rare missense mutation in the neurogenesis growth factor GDF11 in one family and a frameshift in the axonal development gene SLAIN1 in another. Evaluation of the impact of noncoding variation is ongoing and has so far identified two shared, likely functional variants in the putative promoter of the ASD candidate gene CNTN4 in one family and two others upstream of the potassium channel gene KCTD1 in another. Finally, rare copy number variants disrupting the promoter of the neurodevelopmental gene WWOX and deleting an exon of the lincRNA FIRRE, which is involved in chromosomal organization, were found in single families.
Conclusions: By studying these unique pedigrees, applying cutting-edge sequencing and analysis methods, and using IBD filtering, we establish that WGS in extended families can be used to identify ASD risk alterations. Such methods extend the scope of ASD genetic risk beyond de novo protein coding variants to functional noncoding SNVs, SNVs not captured by WES, and SVs that might be conferring ASD risk. Taken together, WGS identifies new ASD candidate genes and pathways.