AUT-Priori: A Web-Based Tool for Autism Gene Prioritization

Poster Presentation
Friday, May 11, 2018: 5:30 PM-7:00 PM
Hall Grote Zaal (de Doelen ICC Rotterdam)
A. J. Griswold1, D. Van Booven1, E. R. Martin1, M. L. Cuccaro1, J. P. Hussman2 and M. A. Pericak-Vance1, (1)John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, (2)Hussman Institute for Autism, Catonsville, MD
Background: Autism is highly heritable with a complex genetic etiology. Large-scale genome and exome sequencing studies have implicated hundreds of genes across a range of genomic variation, from common variants of small effect to pathogenic de novo loss-of-function coding variants. However, as the pace of functional genomic analyses in autism increases, including large-scale transcriptomic and epigenomic studies, a new challenge arises: determining which of the many possible candidate genes are truly related to autism but have not previously been identified through genetic variants. Reducing these gene sets to a subset for critical follow-up examination is presently a time-consuming and laborious task.

Objectives: To create a publicly available web-based resource to aid in the prioritization of genes related to autism biology, utilizing tissue-specific, cell-type-specific, and developmental gene expression patterns as well as available genomic data.

Methods: The data underlying this application are drawn from existing publicly available resources, including adult tissue-specific gene expression (GTEx), the developmental and spatial time course of brain gene expression (BrainSpan and Human Brain Map), and gene expression profiles of single neuronal cell types (scRNA-seq). Furthermore, we have retrieved genomic variation-based priority metrics, including probability of intolerance from the Exome Aggregation Consortium and the Gene Scoring metrics from the SFARI Human Gene database. The expression and genomic variation data then feed into a support vector machine (SVM), which requires a background training dataset. By default this background training dataset is the SFARI VIP gene list, but a custom training set can also be defined by the user. Based on the output of the SVM, the genes are then prioritized in terms of likely genetic impact according to their similarity to previously identified genes. The tool is implemented in an R environment based on the kernlab package (v0.9-25), with a graphical component coded in R Shiny. An alpha release is currently available at https://umiami-hihg-bioinf.shinyapps.io/autism-query-tool/
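
The prioritization step described above can be illustrated with a minimal sketch. It is not the tool's implementation: the tool uses the kernlab package in R, and the abstract does not state the kernel or parameters, so a plain linear SVM trained by batch sub-gradient descent stands in here, written in Python. The gene names (TRAIN_POS_1, CAND_1, and so on), the two features (a scaled brain expression value and a scaled intolerance score), and every numeric value are synthetic placeholders chosen only to show how a model trained on known genes can rank a candidate list.

```python
# Illustrative only: a linear SVM fitted by batch sub-gradient descent,
# standing in for the kernlab SVM used by the tool. All gene names and
# feature values below are synthetic placeholders, not real data.

def train_linear_svm(X, y, lam=0.01, epochs=500, lr=0.1):
    """Minimise lam/2 * ||w||^2 + mean hinge loss; labels are +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # margin violated, so the hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision_value(w, b, x):
    """Signed score; larger means more similar to the positive class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def prioritize(w, b, candidates):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    scored = [(g, decision_value(w, b, x)) for g, x in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical training set: features are [brain_expression, intolerance],
# both scaled to the range 0..1. Positives play the role of known genes.
training = {
    "TRAIN_POS_1": ([0.90, 0.95], +1),
    "TRAIN_POS_2": ([0.80, 0.90], +1),
    "TRAIN_POS_3": ([0.85, 0.80], +1),
    "TRAIN_NEG_1": ([0.10, 0.20], -1),
    "TRAIN_NEG_2": ([0.20, 0.10], -1),
    "TRAIN_NEG_3": ([0.15, 0.30], -1),
}
X = [features for features, _ in training.values()]
y = [label for _, label in training.values()]

# Hypothetical user-supplied candidate list to be ranked.
candidates = {
    "CAND_3": [0.12, 0.15],
    "CAND_1": [0.88, 0.90],
    "CAND_2": [0.50, 0.50],
}

w, b = train_linear_svm(X, y)
ranked = prioritize(w, b, candidates)
for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

In this sketch the signed distance from the separating hyperplane serves as the prioritization score, so candidates that most resemble the positive training genes appear first. Supplying a different `training` dictionary corresponds to the user-defined training set mentioned above.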

Results: Implementation of these techniques in a user-friendly environment produces a unique resource for the autism genetics research community. While each of the underlying data repositories is publicly available, AUT-PRIORI-GENE offers a synthesis of them in a single location. Furthermore, it is highly customizable: the user can enter a list of gene symbols (manually or by file upload), select tissue, region, or gender specificity from data sources where available, and define a training dataset for prioritization. The resulting hierarchical clustering, clustered expression heatmaps, annotation tables, and prioritization scores are all downloadable from the server. While optimization continues, a run currently takes less than a minute for a set of 200 genes.

Conclusions: AUT-PRIORI-GENE provides an easily accessible tool to prioritize the most likely autism-related genes from a user-defined list. This is a critical downstream step for those engaged in high-throughput functional genomics studies, and it aids in unraveling the complex underlying genetics of autism.

See more of: Genetics