NHGRI Analysis Visualization and Informatics Lab-space



Tools for Applied Data Science Using Cloud-Based Platforms

Virtual Training
Thursday, April 22, 2021 11:00 AM to 1:30 PM EDT
Friday, April 23, 2021 11:00 AM to 1:30 PM EDT

VADSTI 2021 Module 7

The AnVIL team will present an overview of the AnVIL platform in Module 7 of VADSTI 2021.

The AnVIL is a cloud-based platform that supports the management, analysis, and sharing of biomedical data for the NHGRI research community.

AnVIL aims to advance our basic understanding of the genetic basis of complex traits and to accelerate the discovery and development of therapies, diagnostic tests, and other technologies for diseases such as cardiovascular disease and autism spectrum disorders.

The platform currently hosts more than 75,000 human whole-genome datasets and offers a variety of analysis capabilities, including:

  • Terra for large-scale batch computing and interactive computing;
  • Gen3 for managing, analyzing, harmonizing, and sharing large datasets;
  • Dockstore for sharing Docker-based analysis workflows;
  • Jupyter notebooks for organizing live code, equations, visualizations, and narrative text into a single document;
  • RStudio for interactive machine learning, statistical computing, and visualizations;
  • Bioconductor for community-driven interactive genomics with R;
  • Galaxy for accessible, reproducible, and transparent genomic science.

In this module, you will be introduced to the platform, tools, and functionality for data science projects.


With recent advances in technology and computational tools, healthcare services and the clinical and genomic sciences can now store large volumes of data. This has increased the demand for researchers who can use data analytics to examine recent trends, predict outcomes, and make better clinical and health policy decisions.

Skill sets in data science are critical for advancing the science of minority health and health disparities.

The Howard University Research Centers in Minority Institutions (RCMI) Program, with funding from NIH, created VADSTI to meet the growing demand for data science skills and their application to problems of minority health and health disparities.

More Information / Registration

For more information and to register, see Virtual Applied Data Science Training Institute (VADSTI).

Workshop Videos

Tools for Applied Data Science Using Cloud-Based Platform, Module 7, Day 1

Tools for Applied Data Science Using Cloud-Based Platform - Module 7, Day 2
