Patient-Specific Prediction of Epigenomes Through Deep Learning
Principal Investigator: Emily Miraldi, PhD
Sequencing the DNA of patients with rare and complex diseases is a common strategy for identifying potential genetic drivers of disease and candidate therapies. Yet there is a gulf between identifying disease-associated genetic variants and developing effective therapeutic strategies. Most disease-associated genetic variants fall outside protein-coding sequences and overlap with “enhancers,” regions of chromatin that control gene expression. Importantly, enhancers are typically context-specific, contributing to gene expression in particular cell types (e.g., skin, brain) under particular conditions (e.g., in response to nutrients or a pathogen). Although experimental techniques to identify enhancers have advanced greatly (even to single-cell resolution), it is impossible to measure all cell types and conditions in a single patient, especially for difficult-to-access tissues like brain or liver. Thus, there is a need for computational approaches that predict the effects of genetic variation on enhancers and gene expression for diverse cell types under many conditions.
We will build a novel computational method for analyzing patient DNA sequence. Specifically, we will build mathematical models that predict how a patient’s DNA sequence changes gene regulation (by enhancers) in particular cell types. As a proof-of-principle first application, we will build a model for an immune cell type (T regulatory “Treg” cells) that is known to play a role in several autoimmune diseases. We will apply our Treg model to thousands of DNA sequences (genomes) from patients with autoimmune disease to predict how genetic risk variants might act in this key immune cell type to drive disease. In the long term, these predictions could aid in the design of patient-specific therapies.