Pestian Lab
I2B2 Data Warehouse & Natural Language Processing

Medical researchers have "...the need to analyze large amounts of data in an interactive manner quickly with no opportunity to rely on the existence of canned queries...."* A star schema design provides structures that enable faster interactive data analysis: "The star data model [is] a design that makes slicing and dicing one specific subject area easy and fast."* An i2b2 data mart is best suited to tasks such as cohort identification, hypothesis generation, and retrospective data analysis. It is NOT well suited to tasks such as clinical trials, sample tracking, study administration, or real-time alerting.

The i2b2 data mart is a data warehouse modeled on the star schema structure. It contains fact tables, which house facts about patients, and dimension tables, which provide additional information about fields in the fact tables. Facts are defined by concept codes. The hierarchical structure of these codes, together with their descriptive terms and some other information, forms the i2b2 ontology (also known as the metadata).
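
To make the structure described above concrete, here is a minimal sketch of a star schema and a cohort-identification query, written in Python with the standard sqlite3 module. It is an illustration under stated assumptions, not the actual i2b2 schema: the table and column names only loosely follow i2b2 naming, the sketch uses a single fact table, and the concept codes, hierarchy paths, and patient rows are all made up for the example.

```python
import sqlite3


def build_demo_mart():
    """Create a tiny in-memory star schema filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension table: one row per patient (hypothetical columns).
        CREATE TABLE patient_dimension (
            patient_num INTEGER PRIMARY KEY,
            sex_cd      TEXT
        );
        -- Dimension table: concept codes plus their place in the hierarchy.
        CREATE TABLE concept_dimension (
            concept_path TEXT PRIMARY KEY,
            concept_cd   TEXT,
            name_char    TEXT
        );
        -- Fact table: one row per observed fact about a patient.
        CREATE TABLE observation_fact (
            patient_num INTEGER,
            concept_cd  TEXT,
            start_date  TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO patient_dimension VALUES (?, ?)",
        [(1, "F"), (2, "M"), (3, "F")],
    )
    # Illustrative hierarchy only; these paths and codes are assumptions.
    conn.executemany(
        "INSERT INTO concept_dimension VALUES (?, ?, ?)",
        [
            ("/Diagnoses/Respiratory/Asthma/", "DX:ASTHMA", "Asthma"),
            ("/Diagnoses/Respiratory/Pneumonia/", "DX:PNEUMONIA", "Pneumonia"),
            ("/Diagnoses/Endocrine/Diabetes/", "DX:DIABETES", "Diabetes"),
        ],
    )
    conn.executemany(
        "INSERT INTO observation_fact VALUES (?, ?, ?)",
        [
            (1, "DX:ASTHMA", "2020-01-05"),
            (1, "DX:DIABETES", "2020-03-11"),
            (2, "DX:PNEUMONIA", "2021-07-19"),
            (3, "DX:DIABETES", "2019-11-02"),
        ],
    )
    return conn


def find_cohort(conn, path_prefix):
    """Return sorted patient numbers having any fact under path_prefix.

    The prefix match walks the concept hierarchy: every concept whose
    path starts with path_prefix counts as a descendant. The prefix is
    used as a LIKE pattern, so it should not contain '%' or '_'.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT f.patient_num
        FROM observation_fact AS f
        JOIN concept_dimension AS c ON c.concept_cd = f.concept_cd
        WHERE c.concept_path LIKE ? || '%'
        ORDER BY f.patient_num
        """,
        (path_prefix,),
    ).fetchall()
    return [row[0] for row in rows]


def cohort_by_sex(conn, path_prefix):
    """Slice a cohort by a patient dimension attribute (sex_cd)."""
    counts = {}
    for patient_num in find_cohort(conn, path_prefix):
        (sex_cd,) = conn.execute(
            "SELECT sex_cd FROM patient_dimension WHERE patient_num = ?",
            (patient_num,),
        ).fetchone()
        counts[sex_cd] = counts.get(sex_cd, 0) + 1
    return counts


if __name__ == "__main__":
    mart = build_demo_mart()
    print(find_cohort(mart, "/Diagnoses/Respiratory/"))
    print(cohort_by_sex(mart, "/Diagnoses/"))
```

In this sketch, querying a higher-level path such as /Diagnoses/Respiratory/ picks up facts recorded under any of its child concepts, which is how a hierarchical ontology can support cohort identification without a canned query for each individual code.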

*"A Layman's Understanding of Star Schemas" by Kevin Meade, 2009-09-26.