Objectives: To facilitate visualization and analysis of this large amount of data, we have created the Autism Database (AutDB), a publicly available, curated, web-based, and searchable database for genes linked to ASD. AutDB utilizes a systems biology approach by building a modular framework to integrate diverse types of evidence related to ASD risk genes. Current modules include: 1) Human Gene, which annotates all known autism-linked genes according to their genetic variation (genetic association studies, rare single gene mutations, and genes linked to syndromic autism) and relevance to autism; 2) Animal Model, which catalogs behavioral, anatomical, and physiological data corresponding to ASD-linked genes; 3) Protein-Protein Interaction, which builds interactomes based on all known direct relationships for protein products of ASD-linked genes; and 4) Copy Number Variant, which describes features of all reported copy number variants associated with ASD.
Methods: AutDB modules are integrated to allow cross-modal navigation of varied evidence related to ASD-linked genes. All data are drawn exclusively from published, peer-reviewed scientific literature. Our researchers systematically search, collect, extract, and update data within AutDB modules based on multi-level annotation models. Notably, for the Animal Model module, we have developed a new standardized vocabulary of phenotypic terms designed to reflect the wide-ranging clinical manifestations of ASD. Data are visualized in multiple formats for each module, ranging from hierarchical tables to chromosome ideograms and protein networks. We have also developed a tool called Workspace, a simulated environment in which a reference gene set from AutDB can be used along with the user's data for analysis without requiring the user to upload information into the database.
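The idea behind Workspace, comparing a reference gene set with a user's own list without sending that list anywhere, can be illustrated with a minimal Python sketch. This is not the Workspace implementation: the function name compare_gene_sets and the gene symbols below are hypothetical placeholders chosen for illustration, not an export from AutDB, and upper-casing symbols is a simplifying assumption that ignores species-specific naming conventions.

```python
def compare_gene_sets(reference, user_genes):
    """Compare a user's gene list with a reference set, entirely in local memory.

    Hypothetical helper for illustration; nothing is uploaded or stored.
    """
    # Normalize symbols: trim whitespace, ignore blanks, compare case-insensitively.
    ref = {g.strip().upper() for g in reference if g.strip()}
    user = {g.strip().upper() for g in user_genes if g.strip()}
    return {
        "shared": sorted(ref & user),          # present in both lists
        "user_only": sorted(user - ref),       # only in the user's data
        "reference_only": sorted(ref - user),  # only in the reference set
    }


# Illustrative placeholder symbols, not actual AutDB content.
reference_set = ["SHANK3", "NLGN3", "MECP2", "FMR1"]
user_list = ["shank3", "BRCA1", "Fmr1 "]

result = compare_gene_sets(reference_set, user_list)
print(result["shared"])     # ['FMR1', 'SHANK3']
print(result["user_only"])  # ['BRCA1']
```

Because the comparison runs on in-memory sets, the user's list never leaves the local session, which mirrors the no-upload property described above.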
Results: As of December 2010, data content within AutDB modules consists of the following: 1) Human Gene: 230 entries based upon 140 primary research articles; 2) Animal Model: 79 entries encompassing 202 mouse models; 3) Protein-Protein Interaction: 84 entries containing a total of 1119 direct protein interactions; and 4) Copy Number Variant: 201 entries annotated from 22 primary research articles. The Human Gene and Animal Model modules are currently available on the website (http://www.mindspec.org/autdb.html), and the newly developed Protein-Protein Interaction and Copy Number Variant modules will be released in early 2011.
Conclusions: AutDB provides a publicly available web portal for ongoing collection, manual annotation, and visualization of genes linked to ASD. Importantly, this modular ASD database provides a platform for bioinformatics analysis that can be used to develop predictive disease models for ASD. Prioritizing molecular pathways in this way may accelerate ASD research and lead to targeted drug treatments for this disorder.