AnVIL Quarterly Release Communications: August 2025 Release
August 28, 2025
The AnVIL team is excited to announce the release of the studies below on the AnVIL platform. If you are interested in these datasets, please submit data access requests through dbGaP. The datasets are now findable on the AnVIL Data Explorer for cohort building and on the AnVIL Data Library for dataset-level search.
For additional resources on finding datasets on AnVIL, please refer to the AnVIL Data Explorer Guide or the DUOS Data Library user guide.
Study Name | phsID / DULs | Release Notes | Submitter blog post | Where to apply for access | Link to dataset on AnVIL |
---|---|---|---|---|---|
Common Fund (CF) Genotype-Tissue Expression Project (GTEx) | phs000424.v10.p2 GRU | Sample and subject annotation files were added. | Link | dbGaP | Data Workspace |
NHGRI GREGoR Consortium: Genomics Research to Elucidate the Genetics of Rare Disease | phs003047.v3.p2 GRU, HMB | The new data release consists of 8,840 participants and more than 3,000 families. Included in the release are short-read whole exomes and genomes, long-read whole genomes, and RNA-seq files. | Link | dbGaP | Explorer Data Library |
Impact of Genomic Variation on Function (IGVF) Consortium | phs003472.v1.p1 HMB-MDS | This first release contains data from seven participants and includes assay data such as single-cell ATAC-seq, single-cell RNA sequencing, and SHARE-seq data. | N/A | dbGaP | Explorer Data Library |
Genomic Answers for Kids | phs002206.v5.p1 DS-PEDD-IRB | The new data release includes over 2,000 long-read genome sequences and 12,000 short-read genome and exome analyses, nearly 400 snapshots of patient transcriptomes and epigenomes in individual cells using single-cell RNA (scRNA) and sc open chromatin (scATAC), over 3,000 bulk whole genome bisulphite genome sequences for methylome interpretation, and over 200 functional assessments in available patient tissues using full length cDNA sequences by IsoSeq (PacBio) methodology. This release also consolidates data from releases 4 and 5 into a single dataset for exporting. | Link | dbGaP | Explorer Data Library |
Center for Common Disease Genomics [CCDG] - Neuropsychiatric: Epilepsy: Epi25 Consortium | phs001489.v4.p2 32 consent codes. For a full list, please see the dbGaP study page. | This new data release includes whole genome genotype data on over 30,000 Epi25 participants, generated using Illumina's Infinium GSA-MD v1 platform. Additionally, detailed clinical phenotypes related to epilepsy diagnosis are now available for both the GSA data and the whole exome sequencing (WES) data previously released in v3. | Link | dbGaP | Explorer Data Library |
CARD Consortium: North American Brain Expression Consortium | phs001300.v5.p1 (parent) phs003181.v2.p1 (child) GRU | This new release includes 206 samples with haplotype-resolved assemblies, structural and small variant calls, as well as methylation calls for neurologically 'normal' prefrontal cortex ( and cortex ) brain tissue samples. | Link | dbGaP | Explorer Data Library |
CARD Consortium: Gene Expression in Postmortem DLPFC and Hippocampus from Schizophrenia and Mood Disorders | phs000979.v4.p2 GRU | This new release includes 155 samples with haplotype-resolved assemblies, structural and small variant calls, as well as methylation calls for neurologically 'normal' prefrontal cortex ( and cortex ) brain tissue samples. | Link | dbGaP | Explorer Data Library |
PAGE: The Charles Bronfman Institute for Personalized Medicine (IPM) BioMe Biobank | phs000925.v1.p1 GRU | Please see dbGaP for more information. | N/A | dbGaP | Explorer Data Library |
PAGE: Multi-Ethnic Cohort Study | phs000220.v2.p2 GRU | Please see dbGaP for more information. | N/A | dbGaP | Explorer Data Library |
PAGE: Global Reference Panel | phs001033.v1.p1 GRU | Please see dbGaP for more information. | N/A | dbGaP | Explorer Data Library |
In addition to the data released above, the following enhancements were made to AnVIL data:
- Inconsistencies in snapshot naming conventions that were causing issues with indexing for the AnVIL Data Explorer have been resolved.
- MD5 checksums in file metadata are now consistently Base64-encoded, matching what is provided in the GCS metadata.
- Values that were causing issues importing data from DUOS or the AnVIL Data Explorer into Workspaces have been corrected.
- The presence of double-pipes that was causing issues with indexing certain datasets for the AnVIL Data Explorer has been resolved.
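For users who compare checksums locally, the Base64 MD5 change noted above means that a hexadecimal MD5 (as printed by tools such as `md5sum`) must be converted before it can be matched against a Base64 value. The sketch below shows one way to do that conversion in Python. It is an illustration only, not AnVIL's own code: the function names `md5_hex_to_base64` and `md5_base64_to_hex` are hypothetical, and it assumes the metadata value is the standard Base64 encoding of the raw 16-byte MD5 digest, which is how GCS reports MD5 hashes.

```python
import base64
import hashlib


def md5_hex_to_base64(hex_digest: str) -> str:
    """Convert a hexadecimal MD5 digest to its Base64 form (hypothetical helper)."""
    raw = bytes.fromhex(hex_digest.strip())
    if len(raw) != 16:
        # An MD5 digest is always 16 bytes (32 hex characters).
        raise ValueError("expected a 32-character hexadecimal MD5 digest")
    return base64.b64encode(raw).decode("ascii")


def md5_base64_to_hex(b64_digest: str) -> str:
    """Convert a Base64-encoded MD5 digest back to hexadecimal (hypothetical helper)."""
    raw = base64.b64decode(b64_digest.strip(), validate=True)
    if len(raw) != 16:
        raise ValueError("expected a Base64 string that decodes to 16 bytes")
    return raw.hex()


if __name__ == "__main__":
    # Hash some example bytes locally and print both representations.
    local_hex = hashlib.md5(b"example file contents").hexdigest()
    print("hex:   ", local_hex)
    print("base64:", md5_hex_to_base64(local_hex))
```

To verify a downloaded file, compute its MD5 locally, convert it with `md5_hex_to_base64`, and compare the result with the Base64 value shown in the file metadata.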