Epic Cosmos
National-level electronic health record data linked with unique phenotype and genotype information are available through All of Us and N3C.
What is Epic Cosmos?
Epic Cosmos is an aggregated electronic health record database made up of data from organizations that use Epic as their electronic health record system. As of 2025, about 200 institutions, including Baylor Medicine and Texas Children’s Hospital, have shared longitudinal data for 300 million individuals. Records for the same individual are linked across health systems.
The underlying data model for Cosmos is a streamlined version of Epic Caboodle, a dimensional data model organized in a “star” or “snowflake” schema. In this structure, a table containing clinical information (called a fact table) is connected to metadata tables (called dimension tables).
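The fact-and-dimension structure described above can be sketched with a small in-memory example. This is an illustration only: the table names (`diagnosis_event_fact`, `diagnosis_dim`) and all column names are hypothetical and are not the actual Cosmos or Caboodle schema, and SQLite stands in for the SQL Server environment used for line-level access.

```python
import sqlite3

# Hypothetical table and column names for illustration only;
# they are NOT the actual Cosmos or Caboodle schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive metadata about each diagnosis.
    CREATE TABLE diagnosis_dim (
        diagnosis_key INTEGER PRIMARY KEY,
        code TEXT,
        name TEXT
    );
    -- Fact table: one row per clinical event, pointing at the dimension by key.
    CREATE TABLE diagnosis_event_fact (
        event_id INTEGER PRIMARY KEY,
        patient_key INTEGER,
        diagnosis_key INTEGER REFERENCES diagnosis_dim(diagnosis_key)
    );
    INSERT INTO diagnosis_dim VALUES
        (1, 'J45', 'Asthma'),
        (2, 'E11', 'Type 2 diabetes mellitus');
    INSERT INTO diagnosis_event_fact VALUES
        (10, 100, 1),
        (11, 100, 2),
        (12, 101, 1);
""")


def count_patients_by_diagnosis(connection):
    """Join the fact table to its dimension and count distinct patients."""
    query = """
        SELECT d.name, COUNT(DISTINCT f.patient_key) AS n_patients
        FROM diagnosis_event_fact AS f
        JOIN diagnosis_dim AS d ON d.diagnosis_key = f.diagnosis_key
        GROUP BY d.name
        ORDER BY d.name
    """
    return connection.execute(query).fetchall()


rows = count_patients_by_diagnosis(conn)
print(rows)
```

The fact table stores only keys and event-level values, so descriptive attributes such as the diagnosis name live in one place and are retrieved with a join.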
An encyclopedia of available data (Baylor Epic credentials required) includes:
Demographics
Encounters
Diagnoses
Procedure occurrences
ICU events
Vital signs
Medication orders, fills, and administrations
Social drivers of health (SDOH)
Surveys and patient-reported outcomes
Limitations
Data in Epic Cosmos are de-identified, and each individual’s dates are shifted by a random number of days. De-identification and date shifting create important limitations that may affect the feasibility of some studies. Please refer to the data dictionary to confirm whether your data elements are represented. If you have any questions, please submit a general consultation request.
Institutions that use Epic for only part of their health system (such as inpatient only) will only have that portion represented in Cosmos.
Investigators should be aware of potential non-detection bias, which may be especially important for diseases that require frequent inpatient care.
Information commonly stored in flowsheets other than vital signs, such as respiratory care events or pain scores, is unavailable.
Results from specialized testing, including pulmonary function tests, EKGs and echocardiograms, are unavailable.
The day of the week (such as weekday or weekend) cannot be identified.
Specific health systems cannot be identified.
Geocoding information is limited to the patient’s state of residence.
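The effect of date shifting on study design can be sketched as follows. This is a minimal illustration, not Epic's actual procedure: it assumes a single constant offset is applied to all of one individual's dates, and the 180-day window (`MAX_SHIFT_DAYS`), the function name `shift_patient_dates`, and the example dates are hypothetical.

```python
import random
from datetime import date, timedelta

# Assumed window for illustration; the real shift range is not stated here.
MAX_SHIFT_DAYS = 180


def shift_patient_dates(dates, offset_days):
    """Apply one constant offset to every date belonging to one individual."""
    return [d + timedelta(days=offset_days) for d in dates]


# Hypothetical admission and discharge dates for one individual.
admit, discharge = date(2024, 3, 4), date(2024, 3, 9)

# One random offset per individual (seeded so the sketch is repeatable).
rng = random.Random(0)
offset = rng.randint(-MAX_SHIFT_DAYS, MAX_SHIFT_DAYS)
shifted_admit, shifted_discharge = shift_patient_dates([admit, discharge], offset)

# Intervals within an individual survive the shift...
length_of_stay_true = (discharge - admit).days
length_of_stay_shifted = (shifted_discharge - shifted_admit).days

# ...but the day of the week does not, unless the offset is a multiple of 7.
# A fixed offset of 17 days is used here to make the point concrete.
example_admit = shift_patient_dates([admit], 17)[0]

print(length_of_stay_true, length_of_stay_shifted)
print(admit.weekday(), example_admit.weekday())
```

Under these assumptions, analyses based on time between events within one individual (length of stay, time to readmission) remain feasible, while analyses that depend on calendar dates or weekday versus weekend do not.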
Tier 1: SlicerDicer
Epic’s graphical interface for generating descriptive statistics
Ideal for summarizing and exploring data
Requires completion of a free self-study module
Tier 2: Line-Level Access via Citrix Cloud
Secure environment with tools like SQL Server Management Studio and RStudio
Enables data extraction, processing, and statistical analysis
Requires a two-day virtual training class (Epic-organized, fee applies)
Access can be requested through the ServiceNow form (Baylor credentials required).
Epic requires users to be affiliated with an Epic customer institution. Faculty from the University of Houston and other collaborating institutions must partner with Baylor faculty to participate in Cosmos research projects.