NHGRI Analysis Visualization and Informatics Lab-space


Data Submission

How do I submit data to AnVIL for sharing with the scientific community?

AnVIL aims to host a variety of datasets useful to the genomics community. Submission requires the following steps:

  1. If you have questions about submitting your data to AnVIL, please contact the AnVIL team (anvil-data@broadinstitute.org).
  2. To apply for data submission, complete the AnVIL Dataset Onboarding Application form.
  3. The AnVIL program will assess whether the dataset is a good fit for AnVIL. Aspects of the dataset that are evaluated include the amount of genomic data, phenotypic and clinical data, and relevant metadata; whether data collection had appropriate ethical oversight; and the nature of the informed consent for broad data sharing (i.e., the restrictiveness of data use limitations).
  4. If the study is approved, contact the NHGRI Genomic Program Administrator (GPA) to register the approved study in dbGaP.
  5. AnVIL staff will contact you promptly to initiate Data Ingestion. For more information about data transfer and QC, see Ingest Data in the Data Submission Guide.

Who pays for storage costs in AnVIL?

AnVIL will cover the storage costs of data from NHGRI-funded studies that has been or will be released publicly to the research community (including data registered in dbGaP or DUOS and released through controlled access). Otherwise, storage costs are incurred by the billing account associated with the workspace. For more on understanding and controlling cloud costs in AnVIL, see Understanding Cloud Costs.

For more on preparing a budget justification for cloud costs in AnVIL, see Budget Templates.

Does NHGRI plan to move data from dbGaP to AnVIL?

Yes. NHGRI plans to transfer data from selected NHGRI-funded studies to the AnVIL platform. Before this happens, institutions and study PIs will be notified of the plan to transfer data and will have an opportunity to consult with their IRBs and notify NHGRI if there are substantive concerns.

What do I need to do if I have data from a study that was explicitly consented for dbGaP but wish to deposit data in AnVIL?

Please discuss this situation with your IRB, the NIH program director for your study, and the AnVIL staff. In most cases, the submitting institution will determine whether data may be submitted to AnVIL.

Rather than naming a particular data repository (e.g., dbGaP or AnVIL) as the data repository for your study, consider indicating that data will be deposited in an ‘NIH-designated data repository’ (see the Informed Consent Resource, Special Considerations for Genome Research, for more detailed sample language). This will provide the flexibility to submit to dbGaP, AnVIL, or any other new and relevant NIH-designated repository.

AnVIL facilitates the removal of individual-level data from studies stored and managed by the resource, honoring the right of research participants to change their preferences with regard to future data sharing.

AnVIL’s practices also reflect the practical limits of participant withdrawal and aim to balance participant autonomy with reproducibility and transparency.

See Withdrawing Data from AnVIL for a description of AnVIL's data withdrawal procedures.
