Expanding and Updating the RNA Binding Protein Database CisBP-RNA
Principal Investigator: Matthew Weirauch, PhD
Gene regulation is fundamental to all forms of life. While major advances continue to be made in our understanding of transcriptional regulation, our knowledge of the mechanisms underlying post-transcriptional regulation lags far behind. RNA binding proteins (RBPs) represent an important class of post-transcriptional regulators. These proteins function by recognizing and binding to short RNA sequence and structural “motifs.” Through this binding, RBPs control a wide range of post-transcriptional processes, including splicing, mRNA decay, mRNA transport and subcellular localization, and RNA editing. The human genome encodes over 1,500 RBPs, comparable to the number of encoded transcription factors we have recently identified (>1,600), suggesting that post-transcriptional regulation is likely of similar importance (and complexity) to transcriptional regulation. Likewise, over 149,000 RBPs are encoded in the ~700 currently sequenced eukaryotic genomes.
Despite the abundance and likely importance of RBPs, we still lack much basic knowledge of RBP function, including fundamental information such as RBP binding specificities and transcriptome-wide RBP binding sites. To address this need, we released the CisBP-RNA database and website in 2013. CisBP-RNA is a comprehensive collection of RNA binding “motifs” across all eukaryotic RBPs, along with web-based tools for advanced RBP analysis. Our work focuses on a major upgrade and overhaul of the CisBP-RNA database and web server, including development and evaluation of improved methods for RBP binding prediction, updates to the data contained in the database, and creation of new and improved web-based tools for advanced RBP analysis.