Discovering the Genetic Basis of Human Neuroblastoma: A Kids First Project
The Gabriella Miller Kids First Pediatric Research Program (Kids First) is a trans-NIH effort initiated in response to the 2014 Gabriella Miller Kids First Research Act and supported by the NIH Common Fund. This program focuses on gene discovery in pediatric cancers and structural birth defects and the development of the Gabriella Miller Kids First Pediatric Data Resource (Kids First Data Resource).
All whole genome sequencing (WGS) and phenotypic data from this study are accessible through dbGaP and kidsfirstdrc.org, where other Kids First datasets can also be accessed.
Children with disseminated neuroblastoma have a very high risk of treatment failure and death despite receiving intensified chemotherapy, radiation therapy and immunotherapy. The long-term goal of our research program is to improve neuroblastoma cure rates by first comprehensively defining the genetic basis of the disease. The central hypothesis to be tested here is that neuroblastoma arises largely due to the epistatic interaction of common and rare heritable DNA variation. Here we will perform comprehensive whole genome sequencing of 563 quartets of neuroblastoma patient germline and diagnostic tumor DNAs and germline DNAs from both parents. The case series was recently collected through a Children's Oncology Group epidemiology clinical trial and is robustly annotated with complete demographic (age, sex, race, ethnicity), clinical (e.g., age at diagnosis, stage, risk group), epidemiologic (parental dietary and exposure questionnaire) and biological (e.g., tumor MYCN status and multiple other tumor genomic measures) covariates. Subjects were consented for genetic research and DNA is immediately available for shipment for sequencing. We propose Illumina-based whole genome sequencing at 30x depth of the 593 "trio" germline sample sets (Aim 1; because some parents are missing, these comprise 487 full neuroblastoma triads and 106 child-single parent dyads, for 1673 whole genome sequences) and of matched diagnostic tumor DNA (Aim 2; N=366), for a total of 2039 whole genome sequences. Also in Aim 2, we will perform whole exome sequencing (100x) on the 366 tumor DNA samples and RNA sequencing on the 228 tumor RNA samples from this cohort. Finally, we propose a pilot study of structural variation using long-range sequencing in 10 non-overlapping tumor samples chosen based on potentially relevant chromosomal alterations discovered with conventional NGS. Thus, a total of 2277 individual samples and 2655 sequences will be generated.
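The sample counts above can be cross-checked with a short tally. The sketch below is illustrative only: the constant and function names are hypothetical, and the grouping of the 2277 individual samples (germline, tumor DNA, tumor RNA and long-range tumor samples) is an assumed reading of the text rather than a breakdown stated by the investigators. It does not attempt to reproduce the 2655-sequence total, which the text does not itemize.

```python
# Cohort sizes as stated in the abstract (names here are hypothetical).
FULL_TRIADS = 487        # child plus both parents
DYADS = 106              # child plus a single parent
TUMOR_DNA = 366          # matched diagnostic tumor DNA (Aim 2)
TUMOR_RNA = 228          # tumor RNA samples (Aim 2)
LONG_RANGE_TUMORS = 10   # non-overlapping tumors for the long-range pilot


def tally_samples():
    """Recompute the headline counts from the per-group numbers."""
    families = FULL_TRIADS + DYADS
    # Each triad contributes 3 germline genomes, each dyad contributes 2.
    germline_wgs = FULL_TRIADS * 3 + DYADS * 2
    # All germline genomes plus the matched tumors are sequenced at 30x.
    wgs_30x = germline_wgs + TUMOR_DNA
    # Assumed reading: every DNA or RNA aliquot counts as one sample.
    individual_samples = wgs_30x + TUMOR_RNA + LONG_RANGE_TUMORS
    return {
        "families": families,
        "germline_wgs": germline_wgs,
        "wgs_30x": wgs_30x,
        "individual_samples": individual_samples,
    }


if __name__ == "__main__":
    for name, count in tally_samples().items():
        print(f"{name}: {count}")
```

Under these assumptions the tally gives 593 families, 1673 germline genomes, 2039 genomes at 30x and 2277 individual samples, matching the figures quoted in the text.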
We will use our established analytic pipeline, which is currently being used to study the germline genomes of all cases sequenced through the NCI-supported Therapeutically Applicable Research to Generate Effective Treatments (TARGET) program. We plan a three-stage analytic approach. First, we will focus on classic de novo and inherited Mendelian damaging alterations. Next, we will integrate our extensive epigenomic data from human neuroblastoma cell lines and genome-wide association study data (N=5,703 neuroblastoma cases to date) to guide a comprehensive assessment of noncoding variants that influence tumor initiation, using a recently established analytic pipeline. Finally, we will use the tumor DNA analyses to inform relevance via somatic gain- or loss-of-function effects at the sequence and/or copy number levels. All data generated in this project will be immediately placed into the Genomic Data Commons (GDC), and we will compute within this environment by importing our analytic pipelines into the GDC. These data will be fully integrated into the Kids First Data Resource and freely shared with all academically qualified petitioners. This comprehensive data set, derived from a large and richly phenotyped series of neuroblastoma DNA quartets, will be integrated with existing germline and/or tumor genomic data from over 6,000 neuroblastoma subjects (none of whom have matched patient-parent germline sequencing data) to provide an unparalleled opportunity to discover the genetic basis of neuroblastoma.