NIH Cloud Platform Interoperability Effort

NCI Cancer Research Data Commons (CRDC)


Giving researchers a place where they can work together to access diverse data types for integrative analysis, furthering the goals of precision medicine and biomedical discoveries.


Provide access to standardized and harmonized cancer data in an expandable cloud-based infrastructure, enhancing the way data are shared to empower researchers to work in real time and with more connectivity.


To provide interoperable resources through federation, data harmonization, standards, and tools and services that can be reused across the research community and to enable enhanced data sharing.


The CRDC is funded by NCI Moonshot.


Anand Basu, Andrey Fedorov, Bill Longabaugh, Bob Grossman, Brandi Davis-Dusenbery, Brian O’Connor, David Pot, Melissa Haendel, Ron Kikinis, Sam Volchenboum, Chris Chute, Clare Bernard.


Brigham and Women’s Hospital, Enterprise Science and Computing (ESAC), Frederick National Labs, General Dynamics Information Technology, Institute for Systems Biology, Oregon State University, Seven Bridges, The Broad Institute, University of Chicago, Johns Hopkins.


Data Repositories

New genomic, proteomic, imaging, canine, and clinical trial data are added continually through both existing and new data nodes.




Cloud Resources

  • Seven Bridges - 400+ publicly available tools and workflows in Common Workflow Language, plus Dockstore, RStudio, Jupyter notebooks, and a collaborative genome browser
  • Broad - 700+ publicly available workflows and tools in Workflow Description Language, Integrative Genomics Viewer, Dockstore, Jupyter notebooks, BigQuery, ML, pipelines
  • ISB-CGC - Google: VMs, BigQuery, AI, ML, Pipelines, Cohorts, Image Viewers, Notebooks, Plotting, Dockstore
  • Bring your own tools; integrative analysis is available.

Repository Resources

  • GDC: Data Analysis Visualization Exploration (DAVE) tools
  • PDC: PepQuery, Morpheus, Genome Browser, DDA & DIA common data analysis pipelines
  • Infrastructure: Cancer Data Aggregator (CDA), Center for Cancer Data Harmonization (CCDH), Data Commons Framework (DCF)

Analytical Tools



  • NCI Data Commons Framework Services (DCFS) by Gen3
  • Researcher Authentication Service (RAS)
  • eRA Commons IDs (controlled data)
  • Individual, OIDC platform authentication


  • Permanent globally unique IDs (GUIDs) for data in Google & Amazon locations
  • GUIDs are cloud agnostic, promoting access and providing a mechanism for versioning data
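As a rough illustration of the two points above, the sketch below models a GUID index in memory: one identifier maps to copies of the same object in more than one cloud, and a new version receives its own GUID that points back to the previous one. It is not the DCF or Gen3 implementation. The `DataRecord` and `GuidIndex` names, the use of UUIDs as GUIDs, and the bucket URLs are all hypothetical, chosen only for the example.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataRecord:
    """One version of a data object (hypothetical structure)."""
    guid: str
    version: int
    urls: Dict[str, str]                 # cloud name -> storage URL
    previous_guid: Optional[str] = None  # link to the earlier version, if any


class GuidIndex:
    """Toy in-memory index; a real service would persist its records."""

    def __init__(self) -> None:
        self._records: Dict[str, DataRecord] = {}

    def register(self, urls: Dict[str, str],
                 previous_guid: Optional[str] = None) -> DataRecord:
        # A new version gets a fresh GUID; older GUIDs stay resolvable.
        version = 1
        if previous_guid is not None:
            version = self._records[previous_guid].version + 1
        record = DataRecord(str(uuid.uuid4()), version, dict(urls), previous_guid)
        self._records[record.guid] = record
        return record

    def resolve(self, guid: str, cloud: str) -> str:
        # The same GUID resolves to a location in whichever cloud is requested.
        return self._records[guid].urls[cloud]

    def history(self, guid: str) -> List[str]:
        # Walk back through earlier versions, newest first.
        chain: List[str] = []
        current: Optional[str] = guid
        while current is not None:
            chain.append(current)
            current = self._records[current].previous_guid
        return chain


# Example with made-up bucket paths.
index = GuidIndex()
v1 = index.register({
    "google": "gs://example-bucket/sample.v1.bam",
    "amazon": "s3://example-bucket/sample.v1.bam",
})
v2 = index.register({
    "google": "gs://example-bucket/sample.v2.bam",
    "amazon": "s3://example-bucket/sample.v2.bam",
}, previous_guid=v1.guid)

print(index.resolve(v1.guid, "amazon"))
print(len(index.history(v2.guid)))
```

Because an analysis refers to the GUID rather than a bucket path, it can run in either cloud, and it keeps pointing at the exact version it was written against.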


  • dbGaP access
  • DCFS by Gen3
  • Authorization enabled by Trusted Partnerships with NIH

Data Models

  • There are many data models across the CRDC, including ICDC, CTDC, PDC, and GDC
  • The Center for Cancer Data Harmonization (CCDH) develops an overarching model and mappings between these models
  • CRDC also participates in GA4GH efforts


User Perspective

[Figure: CRDC Architecture, user perspective]

System Perspective

[Figure: CRDC Architecture, system perspective]
