Step 3 - Prepare for Submission

In this step, you will prepare for data ingestion by organizing all the data and metadata required by your data model (from Step 2 - Set up Data Model) into a format compatible with AnVIL.

AnVIL accepts two types of data:

Object files (large, unstructured data files such as CRAMs, BAMs, and VCFs): Object files include genomic and other omics data as well as image files. Object files require minimal metadata, some of which is generated by AnVIL (such as the full paths to the files in AnVIL cloud storage).
Phenotypes and metadata (tabular data): Clinical and phenotypic data, as well as object file metadata, are submitted in TSV/TXT format (see the requirements in the linked step-by-step instructions below). You will generate a TSV/TXT file for each table in your data model (from Step 2 - Set up Data Model).

Most studies submit both.

Formatting requirements for submitted data

To prepare data for submission, you will:

Make sure all object files conform to AnVIL’s naming requirements
Generate a TSV file conforming to AnVIL requirements for each table in the data model
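
As a rough illustration of the second task, the sketch below writes one tab-separated file per table using only the Python standard library. It is not AnVIL tooling. The function names (table_to_tsv, write_tables), the table name, and the column names are hypothetical, and the sketch does not check any AnVIL-specific requirement. Confirm the required columns and file naming against the step-by-step instructions linked below.

```python
import csv
import io
from pathlib import Path


def table_to_tsv(columns, rows):
    """Return TSV text with a header row for one table.

    `columns` is an ordered list of column names; `rows` is a list of dicts.
    Missing values are written as empty strings (an assumption; check the
    AnVIL requirements for how missing values should be represented).
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Keep only the declared columns, in the declared order.
        writer.writerow({col: row.get(col, "") for col in columns})
    return buffer.getvalue()


def write_tables(tables, out_dir):
    """Write each table to <out_dir>/<table_name>.tsv and return the paths.

    `tables` maps a table name to a (columns, rows) pair.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for name, (columns, rows) in tables.items():
        target = out_path / f"{name}.tsv"
        target.write_text(table_to_tsv(columns, rows), encoding="utf-8")
        written.append(target)
    return written


# Hypothetical table and columns, for illustration only.
example_tables = {
    "subject": (
        ["subject_id", "sex"],
        [{"subject_id": "S1", "sex": "F"}, {"subject_id": "S2"}],
    ),
}
```

Calling write_tables(example_tables, "submission") would create submission/subject.tsv with a header line followed by one line per subject.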

Step-by-step Instructions

For details, see How to prepare data for submission to AnVIL (estimated read time 15 minutes).

Help us make these docs great!

All AnVIL docs are open source. See something that’s wrong or unclear? Submit a pull request.