NHGRI Analysis Visualization and Informatics Lab-space

News

Data Release - ALS Compute Project

Posted: September 5, 2023

The ALS/FTD (Amyotrophic Lateral Sclerosis/Frontotemporal Dementia) research landscape in the U.S. comprises six major whole-genome sequencing (WGS) efforts: Answer ALS, CReATe Consortium, GTAC, New York Genome Center ALS Consortium, National Institutes of Health, and Project MinE USA. Because these efforts operate separately and their data are not harmonized, the result is inefficiency, duplicated costs, and hindered collaboration. This fragmentation also inhibits independent investigations into ALS/FTD genetics by other researchers.

In response to these challenges, the ALS Compute Project was initiated to overcome these obstacles and streamline research efforts. The project's goals include consolidating raw WGS data from all U.S.-based ALS/FTD initiatives into a single cloud-based platform, obtaining raw WGS data for a large cohort of controls with minimal restrictions, harmonizing the resulting dataset through realignment and joint genotyping, standardizing the phenotypic dictionary of ALS/FTD case samples, and simplifying access to the harmonized cohort for the scientific community. Importantly, a major goal of ALS Compute is to increase access to ALS genetic data, especially in countries or institutions that lack the infrastructure to handle such large datasets, thereby increasing the diversity of the scientific community and ultimately leading to novel discoveries.

The ALS Compute Project has received strong support from the ALS/FTD WGS groups, all of which have agreed to contribute their raw data. To date, over 7,000 WGS samples have been released on the AnVIL platform, and approximately 35,000 control samples have been identified for inclusion in the harmonization efforts (dbGaP:phs003184). Interested researchers will need to apply through dbGaP for authorization to access these protected datasets; once authorized, they can access both the raw data and the metadata through the AnVIL platform.

The discovery of ALS genes has been transformative for understanding and treating this neurodegenerative disease. The ALS Compute platform, by centralizing WGS data and promoting collaboration, seeks to maintain momentum in gene discovery for ALS and FTD. The project's use of AnVIL as an open central repository for data consolidation fosters efficiency, accessibility, and inclusiveness in research efforts.

By embracing the ALS Compute platform and utilizing cloud-based technology, researchers can harness the potential of harmonized data, advancing the understanding and treatment of these debilitating neurological conditions. The project sets the stage for a new era of gene discovery in ALS and FTD research, emphasizing collaboration and cost-effective approaches to benefit the scientific community and patients alike.

ALS Compute was founded by John Landers, Ph.D., professor of neurology at the University of Massachusetts Chan School of Medicine; Bryan Traynor, M.D., Ph.D., senior investigator in the Laboratory of Neurogenetics at the National Institute on Aging; and Jonathan Glass, M.D., professor of neurology and pathology at Emory University School of Medicine.

This project was substantially supported by a grant from The ALS Association and was recently awarded an NIH grant (R61NS13060) to further this work. More information can be found in an ALS Association news article.
