NHGRI Analysis Visualization and Informatics Lab-space


Data Consortia

AnVIL hosts data from the following programs and also allows users to submit their own data to the platform.

Current Consortia


The Centers for Common Disease Genomics are a collaborative large-scale genome sequencing effort to comprehensively identify rare risk and protective variants contributing to multiple common disease phenotypes.


The Centers for Mendelian Genomics is a multi-center collaboration aimed at identifying the genes responsible for Mendelian phenotypes by whole-exome and whole-genome sequencing.


The Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a comprehensive public resource to study tissue-specific gene expression and regulation. Samples were collected from 54 non-diseased tissue sites across nearly 1000 individuals, primarily for molecular assays including WGS, WES, and RNA-Seq.

1000 Genomes

The 1000 Genomes Project, launched in January 2008, is an international research effort to establish variation profiles across the human population. This open access data set continues to be a valuable resource to geneticists.


The Electronic Medical Records and Genomics (eMERGE) project is a national network organized and funded by the NHGRI that combines DNA biorepositories with electronic medical record (EMR) systems for large-scale, high-throughput genetic research in support of implementing genomic medicine.


The Population Architecture Using Genomics and Epidemiology Consortium investigates ancestrally diverse populations to gain a better understanding of how genetic factors influence susceptibility to disease.


The Human Pangenome Reference Consortium aims to modernize the human reference to include a collection of diverse and highly accurate, haplotype-phased genome assemblies. This initiative will generate new technical standards in genome sequencing, scalable and reproducible assembly methods, and pangenomic tool development to ensure comprehensive variant discovery.

Planned Consortia

The following consortia are planned for data ingestion. Additional consortia are under consideration and will be listed as they are approved.

  • Covid19hg - The COVID-19 Host Genetics Initiative
  • CSER - Clinical Sequencing Evidence-Generating Research
  • GTEx v9 - Genotype-Tissue Expression Project
  • NIA - National Institute on Aging
  • NIMH - National Institute of Mental Health
  • UDN - Undiagnosed Diseases Network