AnVIL Dataset Catalog

T2T

The Telomere-to-Telomere (T2T) consortium is an open, community-based effort to generate accurate, gap-free assemblies of the human genome and the genomes of other species. Its initial focus was the de novo assembly of the first complete human reference genome, known as CHM13.

The CHM13v1 reference genome was assembled from PacBio HiFi and Oxford Nanopore ultra-long reads. It has an estimated sequence accuracy exceeding QV70 (fewer than one error per 10 million bases), corrects structural errors in the GRCh38 reference genome, and adds over 100 Mbp of novel sequence compared to GRCh38.

CHM13v1 opens complex regions of the genome to clinical and functional study. The T2T-CHRY Workspace uses T2T-CHM13v2.0, which adds the first complete sequence of a human Y chromosome, assembled from a separate donor (HG002).

T2T-CHM13v2.0 was also used as a reference genome for investigating short-read variant calling, incorporating data from the 1000 Genomes Project and the Simons Genome Diversity Project. Another T2T consortium effort, the T2T-GreatApes Project, uses PacBio HiFi and Oxford Nanopore ultra-long reads to provide reference genomes for various ape species. It evaluates the impact of T2T-chrXY assemblies on read alignment and variant calling across 129 individuals from 11 great ape subspecies.

Consent Codes

NRES

Diseases

None

Study Design

Parent-Offspring Trios, Population sampling

Data Types

Whole Genome

Subjects

3,202