ISMB 2020 - Finding and Analyzing Data in the Cloud with Gen3, Dockstore, Terra, and Galaxy
Overview
The era of big data for biomedical research is here. Massive data sets and cloud-based platforms will enable breakthrough discoveries while overcoming challenges of cost, accessibility, and security. A key strength of this new research landscape is the availability of interoperable, community-driven components that enable robust analyses for a variety of research needs.
One challenge to fully realizing this vision in your own research is learning not only how several new products and platforms work, but also how they work together. In this full-day tutorial, we will guide you through a research journey that highlights the capabilities and components of the NHGRI Genomic Data Science Analysis, Visualization and Informatics Lab-space (AnVIL) resource. You will integrate a suite of interoperable platforms to complete a sample project, gaining working knowledge of how the components combine to perform an end-to-end genetic analysis.
Specifically, you will learn how to:
- Find and access data in Gen3
- Locate analysis tools in the Dockstore repository
- Bring these data and tools together into a computational workspace in Terra
- Process data with automated, reproducible analysis pipelines
- Leverage Hail and Bioconductor in Jupyter Notebooks to do interactive analysis
- Perform genome-wide association studies with Galaxy workflows
Audience
While we will work in the context of AnVIL, you will be able to apply your new skills to many other genomics-related data sets and tools. Attendees must bring a WiFi-enabled laptop with the Chrome browser installed. Prior coding experience (R and/or Python) is required.
Schedule Overview
- Section I: Introduction
- Section II: Finding and analyzing data in the cloud with Gen3, Dockstore, and Terra
  - Find and access data in Gen3
  - Locate analysis tools in the Dockstore repository
  - Export both data and tools to Terra and run an analysis
- Section III: Interactive analysis
  - Find data
  - Hail with Jupyter Notebooks in Terra
  - Bioconductor with Jupyter Notebooks in Terra
- Section IV: Genome-wide association study workflows
  - Galaxy workflows and complementary components
More Info
https://www.iscb.org/ismb2020-program/tutorials#tut2
Registration
https://www.iscb.org/ismb2020-registration
Workshop Videos
Videos of this workshop can be found below. Videos for the entire ISMB 2020 event can be found on the event's YouTube playlist.