Consortium Guidelines for AnVIL Data Access
NHGRI Expectations for Consortia Access to Data in AnVIL
The NIH Genomic Data Sharing (GDS) Policy requires that NIH-funded researchers generating ‘large-scale’ genomic data release such data in a timely fashion (see the Data Release Timeline in the Supplemental Information to the NIH GDS Policy).
All data produced with NHGRI support should be shared with the community rapidly, completely, and in NIH-designated data repositories.
This expectation goes beyond genomic data alone. NHGRI also expects as much phenotypic data (stripped of HIPAA identifiers) as possible to be shared, beyond the variables used for the first study publication. All supporting metadata should be well documented (e.g., data element dictionaries, data collection protocols, and study inclusion and exclusion criteria).
AnVIL will facilitate NHGRI consortia access to data in a cloud environment. Access to consortium data by consortium members is not governed by the NIH GDS Policy (i.e., Data Access Committee approval is not required). However, consortia must still have appropriate IRB oversight, Memorandums of Understanding (MOUs), and Data Use Agreements (DUAs), as necessary.
Each consortium determines who its members are and thereby manages access to its data. NHGRI trusts that this streamlined access to consortium data via AnVIL will shorten the time it takes for data to be shared with the broader community, through either unrestricted or controlled-access data repositories.
Consortium Member Responsibilities
Consortium members are responsible for ensuring data remain securely within the consortium. They will have the ability to create Terra Authorization Domains for providing access to personnel under their supervision (or for delegating this responsibility to a ‘designated access approver’). Each consortium member:
- Must establish Two-Factor Authentication on their Google Account;
- Must not provide access to users without the appropriate permissions and supervision;
- Must ensure all personnel under their supervision are aware of and adhere to all data use limitations and all terms of consortium agreements (MOUs, DUAs, etc.); and,
- Must report any potential data security incidents to the Consortium contact person and AnVIL staff within 24 hours and follow any consortium-specific protocols as necessary.
Consortium Contact Person Responsibilities
The contact person is responsible for ensuring that the list of consortium members approved for access is current and accurate, and for helping to ensure that the consortium members handle pre-release genomic and associated data responsibly. This person’s responsibilities include:
- Ensuring all consortium members confirm they have Two-Factor Authentication active on their Google Account before granting them access;
- Ensuring that the consortium's Terra Authorization Domain (i.e., the list of consortium members, which is distinct from any lab-specific Authorization Domains) is current;
- Updating the list of approved consortium members in a timely manner when a member leaves the consortium;
- Ensuring that consortium members only have access to appropriate datasets (i.e., if there are multiple datasets, and a consortium member is only approved for access to a subset of those datasets, these limited access permissions should be organized via distinct Terra Authorization Domains);
- Ensuring that all consortium members are aware of all data use limitations and the terms of the consortium agreements; and,
- Ensuring that the consortium has a policy and protocol for data security and management incidents, and working with consortium members, AnVIL staff, and NIH staff to implement those protocols as necessary.
General Do’s and Don’ts for Consortia: Managing Access to Consortium Data
- DO have clear documentation of data use limitations, including any additional approvals that may be needed, particularly if different subsets of the consortium data have different data use limitations;
- DO ensure that individual investigators are aware of and obtain any necessary approvals beyond consortium membership for doing secondary research on the data (e.g., additional IRB approval or additional collaborator approval);
- DO clearly define what it means to be a consortium member;
- DO NOT use this process to provide streamlined access to any researchers who are not part of the consortium;
- DO have a plan with clear timelines and milestones for sharing data with the community rapidly, completely, and in NIH-designated data repositories; and,
- DO have a timeline for closing out the consortium and thus the consortium-managed, streamlined access for consortium members. Communicate this timeline with relevant NHGRI program staff and provide AnVIL with 6 months’ notice of when the consortium’s data access is expected to end.