Complex Chromosomal Changes Involving Chromosome 9 Data Release on the AnVIL Platform

November 04, 2025

Laboratory of Tychele N. Turner, Ph.D.
Washington University in St. Louis
Department of Genetics

We are excited to announce that our latest dataset is now available through the AnVIL platform (phs004000.v1.p1). This release of the study "Assessment of Complex Chromosomal Changes in De-Identified Cell Lines" focuses on 16 individuals with complex variation involving chromosome 9.

Access to these controlled-access data can be requested through dbGaP under accession phs004000.v1.p1, and the data are available at https://explore.anvilproject.org/datasets or https://duos.org/datalibrary/anvil.

Sample Selection

The starting point for this project was the Coriell Cell Repositories “Chromosomal Abnormalities” Collection. From this resource, we identified a set of de-identified cell lines with structural variants on chromosome 9. These samples provided a valuable opportunity to study complex rearrangements using modern sequencing technologies.

DNA Preparation

To ensure data quality, we requested that Coriell generate new DNA extractions directly from the original cell lines, which are preserved as either lymphoblastoid cell lines (LCL) or fibroblasts. From each line, Coriell produced:

High molecular weight DNA
Standard DNA

Both were derived from the same cell passage, providing consistency and reducing the possibility of variation due to cell culture. This approach gave us a solid foundation for downstream sequencing and analysis.

Sequencing Strategy

We generated two complementary sequencing datasets:

Illumina short-read whole-genome sequencing on all 16 individuals
PacBio long-read whole-genome sequencing on a subset of 8 individuals

Short-read sequencing allowed us to capture high-quality data across the entire cohort, while long-read sequencing provided a deeper view into the structural complexities of chromosome 9 in selected cases. The combination of both approaches provides a dataset that can support a broad range of analyses, from fine-scale variant discovery to genome architecture reconstruction.

Why We Are Sharing This Data

Complex chromosomal rearrangements are difficult to study but are highly relevant to understanding genome biology and human disease. Chromosome 9 is particularly notable for its structural variation, and this dataset represents a step toward characterizing those complexities with greater resolution. By making these data available through AnVIL, we aim to create a resource that can be used by researchers developing new methods, validating discoveries, or investigating the biological consequences of these structural variants.

The Dataset

The dataset includes 16 individuals, with samples derived from LCLs or fibroblasts. As described above, all 16 have Illumina WGS data, and 8 also have PacBio WGS data. Each individual represents a unique case of structural complexity, and together they form a resource that we hope will be useful to the broader genomics community.

Publication and dbGaP link

Please also read our publication: https://link.springer.com/article/10.1186/s13073-025-01563-0
dbGaP link: https://dbgap.ncbi.nlm.nih.gov/study/phs004000.v1.p1/
