Data Release - GREGoR Consortium
The fourth data release from the GREGoR Consortium (Genomics Research to Elucidate the Genetics of Rare diseases) is now available on AnVIL for controlled access by the broader scientific community. Additionally, the GREGoR Consortium has developed a publicly available, open-access variant browser that supports searches for variants of interest across the Consortium's joint short-read whole genome callset.
Access to these controlled datasets can be requested through dbGaP under accession phs003047, and the data are available at https://explore.anvilproject.org/datasets or https://duos.org/datalibrary/anvil.
The GREGoR Consortium has assembled a dataset of broad utility. All participants are broadly consented for General Research Use or Health, Medical, or Biomedical research. The Consortium data aligns with the GREGoR Data Model, designed to provide context and information to support analysis and secondary use of the data. Additional documentation, including how to apply for access, is available on the GREGoR Consortium’s Data webpage.
This fourth data release contains data from 10,683 participants in 4,366 families. For these participants, the GREGoR Consortium Dataset includes family, pedigree, and phenotype information. Genomic data, such as short-read DNA and RNA sequencing data, are available for the majority of GREGoR participants (see table below). The GREGoR Dataset also includes candidate genetic findings identified by participating GREGoR Research Centers. In this release, a subset of the short-read whole genome sequencing data has been uniformly processed by the GREGoR Data Coordinating Center and used to generate a Consortium joint callset for small variants (SNVs and indels).
Available Data
| | Release R01 | Release R02 | Release R03 | Release R04 |
|---|---|---|---|---|
| Release Date | September 2023 | November 2024 | July 2025 | October 2025 |
| Participants | 2,512 (1,130 affected) | 7,394 (3,555 affected) | 8,840 (4,127 affected) | 10,683 (4,933 affected) |
| Families | 990 | 3,059 | 3,610 | 4,366 |
| Short-read whole exomes | 997 | 2,242 | 2,284 | 2,629 |
| Short-read whole genomes | 1,441 | 5,180 | 6,535 | 8,161 |
| Long-read whole genomes | 0 | 214 | 1,772 | 2,648 |
| RNA-seq files | 192 | 539 | 860 | 1,100 |
| Genome Build | GRCh38 | GRCh38 | GRCh38 | GRCh38 |
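The release-over-release figures in the table lend themselves to simple arithmetic, such as the share of affected participants or the average family size. The following is a minimal sketch in Python that hard-codes the participant and family counts from the table; the names `RELEASES`, `affected_fraction`, `participants_per_family`, and `new_participants` are hypothetical and are not part of any GREGoR or AnVIL tooling.

```python
# Counts copied from the "Available Data" table; nothing here is read
# from AnVIL or dbGaP. All names are illustrative only.
RELEASES = {
    "R01": {"participants": 2512, "affected": 1130, "families": 990},
    "R02": {"participants": 7394, "affected": 3555, "families": 3059},
    "R03": {"participants": 8840, "affected": 4127, "families": 3610},
    "R04": {"participants": 10683, "affected": 4933, "families": 4366},
}


def affected_fraction(release: str) -> float:
    """Share of participants in a release who are marked as affected."""
    row = RELEASES[release]
    return row["affected"] / row["participants"]


def participants_per_family(release: str) -> float:
    """Average number of participants per family in a release."""
    row = RELEASES[release]
    return row["participants"] / row["families"]


def new_participants(previous: str, current: str) -> int:
    """Difference in participant counts between two releases."""
    return RELEASES[current]["participants"] - RELEASES[previous]["participants"]


if __name__ == "__main__":
    for name in RELEASES:
        print(
            f"{name}: {affected_fraction(name):.1%} affected, "
            f"{participants_per_family(name):.2f} participants per family"
        )
    print("Added between R03 and R04:", new_participants("R03", "R04"))
```

These summaries describe only the published totals. Any per-participant analysis would require the controlled-access data itself.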
Acknowledgements and attribution
The GREGoR acknowledgements and attribution statements are available on the dbGaP study page. You can learn more about GREGoR at https://gregorconsortium.org/.