Principal Investigator: Ge Zhang, MD, PhD
Genome-wide sequencing data are increasingly important in the clinical evaluation of human diseases. However, the large number of variants observed in each patient makes it difficult to translate variant calls into clinical knowledge. An individual patient’s exome can contain hundreds of rare and deleterious variants, each of which requires extensive interpretation. Many bioinformatics tools exist to annotate the functional consequences of variants. However, these tools usually rely on a relational database design, in which each type of functional data for an individual genomic element is stored as distinct records in highly structured tables. This design does not facilitate integration across data types.
Graph databases, an alternative to relational databases, are better suited to managing highly interconnected data and can collect and aggregate information across millions of nodes and relationships. They have attracted growing attention because most biological data can be described naturally by a graph-like structure. This idea falls broadly within the field of network biology, which seeks to understand complex biological systems by organizing, analyzing, and visualizing knowledge as graphs (i.e., networks). Motivated by these observations, we propose to develop an efficient computational tool that provides comprehensive and integrative functional annotation of genomic variants using a graph database approach.
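To make the graph-based idea concrete, the following is a minimal in-memory sketch in Python and not the proposed tool itself. It assumes a simple property-graph model in which variants, genes, pathways, and diseases are nodes and their associations are labeled relationships. The class name AnnotationGraph, the relationship labels, and the identifiers (variant:v1, gene:G1, and so on) are hypothetical placeholders chosen for illustration; a real implementation would sit on top of a graph database engine.

```python
from collections import defaultdict, deque


class AnnotationGraph:
    """Toy property graph: nodes carry properties, edges carry a relationship label."""

    def __init__(self):
        self.nodes = {}                 # node id -> dict of properties
        self.edges = defaultdict(list)  # node id -> list of (relationship, target id)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, source, relationship, target):
        self.edges[source].append((relationship, target))

    def annotate(self, start, max_hops=2):
        """Breadth-first traversal returning (hops, relationship, node id)
        for every node reachable from `start` within `max_hops` edges."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for relationship, target in self.edges.get(node, ()):
                if target not in seen:
                    seen.add(target)
                    found.append((depth + 1, relationship, target))
                    queue.append((target, depth + 1))
        return found


# Hypothetical example data (placeholders, not real annotations).
graph = AnnotationGraph()
graph.add_node("variant:v1", kind="Variant")
graph.add_node("gene:G1", kind="Gene")
graph.add_node("pathway:P1", kind="Pathway")
graph.add_node("disease:D1", kind="Disease")
graph.add_edge("variant:v1", "LOCATED_IN", "gene:G1")
graph.add_edge("gene:G1", "PARTICIPATES_IN", "pathway:P1")
graph.add_edge("gene:G1", "ASSOCIATED_WITH", "disease:D1")

# One traversal gathers gene-, pathway-, and disease-level context for the variant.
annotations = graph.annotate("variant:v1", max_hops=2)
for hops, relationship, node_id in annotations:
    print(hops, relationship, node_id, graph.nodes[node_id]["kind"])
```

In this sketch a single traversal from the variant node collects heterogeneous annotations that a relational design would typically retrieve by joining several tables, which is the kind of integration the graph approach is intended to simplify.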