Principal Investigators: Krishna Roskin, PhD, & Surya Prasath, PhD

Bioinformatic analyses of high-throughput immunoglobulin and T cell receptor sequencing data sets have provided new insights into immune-mediated diseases and are a promising source of disease biomarkers. Extension of these methods and functional interpretation of these sequencing-based biomarkers have been hampered by the lack of a comprehensive database of immunoglobulin and T cell receptor sequences annotated with the metadata needed to decode the adaptive immune response.

We are creating such a database, which will draw on data from large high-throughput immune receptor sequencing projects, single-cell data sets that include paired receptor sequences, and the richly annotated “low-throughput” data sets that have been deposited in public repositories over the last 30-plus years. The database will also incorporate bioinformatic tools for querying and mining these data for biomarkers, specificity inference, and functional testing.