NHGRI Analysis Visualization and Informatics Lab-space


ISMB 2020

Finding and Analyzing Data in the Cloud with Gen3, Dockstore, Terra, and Galaxy

Virtual Workshop
Thursday, July 9, 2020 9:00 AM to 1:00 PM EDT


The era of big data for biomedical research is here. Massive data sets and cloud-based platforms will enable breakthrough discoveries while overcoming challenges of cost, accessibility, and security. A key strength of this new research landscape is the availability of interoperable, community-driven components that enable robust analyses for a variety of research needs.

One challenge in fully realizing this vision for your research is learning not only how several new products and platforms work, but also how they work together. In this full-day tutorial, we will guide you through a research journey that highlights the capabilities and components of the NHGRI Genomic Data Science Analysis, Visualization and Informatics Lab-space (AnVIL) resource. You will integrate a suite of interoperable platforms to complete a sample project, gaining working knowledge of how the components work together to perform an end-to-end genetic analysis.

Specifically, you will learn how to:

  • Find and access data in Gen3
  • Locate analysis tools in the Dockstore repository
  • Bring these data and tools together into a computational workspace in Terra
  • Process data with automated, reproducible analysis pipelines
  • Leverage Hail and Bioconductor in Jupyter Notebooks to do interactive analysis
  • Perform genome-wide association studies with Galaxy workflows


While we will work in the context of AnVIL, you will be able to apply your new skills to many other genomics-related data sets and tools. Attendees must bring a WiFi-enabled laptop with the Chrome browser installed. Prior coding experience (R and/or Python) is required.

Schedule Overview

  1. Section I: Introduction
  2. Section II: Finding and analyzing data in the cloud with Gen3, Dockstore and Terra
    • Find and access data in Gen3
    • Locate analysis tools in the Dockstore repository
    • Export both data and tools to Terra and run an analysis
  3. Section III: Interactive analysis
    • Find data
    • Hail with Jupyter Notebooks in Terra
    • Bioconductor with Jupyter Notebooks in Terra
  4. Section IV: Genome-wide association study workflows
    • Galaxy workflows and complementary components

Workshop Videos

Videos of this workshop can be found below. Videos for the entire ISMB 2020 event can be found on the event's YouTube playlist.

TT02- I - Intro to Terra Overview - Tiffany Miller - Tutorials - ISMB 2020

TT02-II - Get set up in Terra - Allie Hajian - Tutorials - ISMB 2020

TT02-III - Data and documentation in a Workspace - Tiffany Miller - Tutorials - ISMB 2

TT02-IV - Find and import workflows in Dockstore - Tiffany Miller - Tutorials - ISMB 2020

TT02-V - Set up and run your workflow - Tiffany Miller - Tutorials - ISMB 2020

TT02-VI - Workflows outputs and troubleshooting - Jason Cerrato - Tutorials - ISMB 2020

TT02-VII - Interactive analysis plus Hail intro - Allie Hajian - Tutorials - ISMB 2020

TT02-VIII - Bioconductor for RNA seq analysis - Liz Kiernan - Tutorials - ISMB 2020
