Cincinnati Rheumatic Diseases Core Center
Bioinformatics and Modeling Core

Bioinformatics and modeling are essential and integral components of modern biomedical research. The overall goal of the Bioinformatics and Modeling Core is to drive the quality control, analysis, integration, modeling, and dissemination of large datasets to understand inflammatory and rheumatic diseases. The Bioinformatics and Modeling Core is led by Matthew Weirauch, PhD, a computational expert in genomics and gene regulation, with substantial contributions from Sandra Andorf, PhD, a computational immunologist and flow cytometry expert.

To elucidate the molecular mechanisms underlying disease, robust statistical and bioinformatics pipelines are needed to quantify differential gene expression, epigenetic and chromatin-state signatures, and protein levels. We will implement innovative genomics and immunophenotyping data management and analytic strategies to interrogate the datasets generated by users of the Cincinnati Rheumatic Disease Resource Center. Publicly available software packages and newly developed, innovative tools will be integrated, automated, applied and made available to the larger research community.

The mission of the Bioinformatics and Modeling Core is to (1) guide the strategic design and quality control of computational tools, (2) identify and address gaps in analytic approaches for emerging technologies, and (3) integrate relevant newly generated and publicly available systems-level datasets into analysis pipelines and release them to the community.

We provide analytic expertise and informatic support for functional genomic experiments. We support quality control and analysis of functional genomics data through streamlined, semi-automated pipelines that follow best practices. Our current RNA-seq, ATAC-seq, ChIP-seq, and CLIP-seq analysis pipelines follow the standards outlined by the ENCODE consortium. Using our established methodologies, we also analyze whole genome sequencing, HiChIP, single cell RNA-seq, and spatial transcriptomics data, and perform polygenic risk score analysis. For both quality assessment and analysis, we will integrate data using a novel deep learning approach, among other methods, by comparing newly generated functional genomics data to our large, unique, uniformly processed repository of public datasets (>30,000 functional genomics experiments).

We support genotype-dependent analysis of functional genomic experiments. Genotype-dependent molecular mechanisms are key contributors to virtually all complex diseases. We will leverage the innovative computational techniques developed in the Weirauch lab to identify genotype-dependent transcription factor (TF) binding, chromatin accessibility, chromatin looping interactions, and gene expression in functional genomics datasets generated by investigators. We also incorporate pre-computed genotype-dependent data already obtained from 2,265 (and growing) publicly available functional genomics datasets. We make genotype-dependent TF and RNA binding protein predictions based on the DNA/RNA sequence surrounding genetic variants of interest, using nucleic acid binding motifs obtained from our CisBP and CisBP-RNA databases.
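
As a rough illustration of motif-based, genotype-dependent binding prediction, the sketch below scores the reference and alternate alleles of a variant against a position frequency matrix and reports the difference. It is a minimal example under stated assumptions, not the Core's actual method: the matrix `TOY_PFM`, the uniform background, and the function names are hypothetical, and a real analysis would use motifs from CisBP or CisBP-RNA and would typically also scan the reverse strand.

```python
import math

# Hypothetical 4-bp position frequency matrix (one dict per motif position).
# The values are illustrative only and are NOT taken from CisBP.
TOY_PFM = [
    {"A": 0.80, "C": 0.05, "G": 0.10, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def log_odds(pfm, site):
    """Log2-odds score of a site that has the same length as the motif."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pfm, site))


def best_window_score(pfm, seq, var_idx):
    """Best forward-strand score among motif windows overlapping var_idx.

    Assumes seq is at least as long as the motif.
    """
    width = len(pfm)
    first = max(0, var_idx - width + 1)
    last = min(var_idx, len(seq) - width)
    return max(log_odds(pfm, seq[s:s + width]) for s in range(first, last + 1))


def allele_score_difference(pfm, flank5, ref, alt, flank3):
    """Alt-minus-ref score for a single-nucleotide variant.

    A negative value suggests the alternate allele weakens the best
    motif match; a positive value suggests it strengthens it.
    """
    idx = len(flank5)
    ref_score = best_window_score(pfm, flank5 + ref + flank3, idx)
    alt_score = best_window_score(pfm, flank5 + alt + flank3, idx)
    return alt_score - ref_score


# Example: a G>T change inside a site that matches the toy motif (AGCT).
delta = allele_score_difference(TOY_PFM, "TTA", "G", "T", "CTAA")
print(f"alt - ref motif score: {delta:.2f}")
```

Taking the best-scoring window that overlaps the variant, rather than a single fixed window, lets the comparison capture alleles that shift the strongest match to a neighboring position.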

We analyze high-dimensional flow cytometry data. Rheumatic diseases have both autoimmune and inflammatory components. Immunophenotyping studies can thus provide valuable insights into rheumatic disease pathophysiology. High-dimensional flow cytometry enables detailed phenotyping of cellular systems in blood and other tissue samples. We provide expertise in experimental design, including technical controls that encompass the full range of expression markers in all samples, and batch adjustment strategies for longitudinal studies. Our expertise in normalization and analysis of these high-dimensional datasets provides the rigor and reproducibility necessary for sound experimental design and data analysis. Because flow cytometry studies can measure 40+ parameters at once, unsupervised and (semi-)supervised methods become integral to data analysis.

We facilitate the organization and deposition of genetic, genomic, and proteomic data. We centralize metadata collection using the Weirauch/Kottyan NGS database, a robust MySQL database with an advanced interface that currently contains records describing 3,085 functional genomics experiments generated by our lab and our collaborators. We continue to employ metadata standards for samples, assays, analyses, and experimental results. We coordinate data storage in the Weirauch lab data center and facilitate public deposition in appropriate data repositories such as NCBI GEO and NIH’s ImmPort database. All newly developed code will continue to be released for open access on the Weirauch lab GitHub page. Data and analysis results are shared with project members through an interactive Wiki interface.

Contact Us

Contact Matthew Weirauch, PhD, for more information or to use our services.