Monthly introduction sessions
The Genomics England Research Environment provides access to Genomics England data, including genomes, variants and phenotypic data from rare disease and cancer patients from the 100,000 Genomes Project and the NHS Genomic Medicine Service. Due to the sensitive nature of the data, all analyses must be carried out within the Research Environment and only non-identifiable aggregate data can be exported. To enable this, a variety of tools are available within the Research Environment to segment and analyse the data.
This training session is aimed at newcomers to the Genomics England Research Environment and will introduce what is in the Research Environment, both in terms of data and tools. The basic functionality of the tools will be covered, along with how you can export data and the restrictions on doing this.
Timetable
13.30 Welcome and introduction
13.35 Sources and type of data in the Research Environment
13.50 Tools in the Research Environment
14.10 Programmatic access to Genomics England data
14.20 Running command line tools and pipelines using our HPC cluster
14.30 The Airlock, restricted import and export of data
14.45 Getting help and questions
Learning objectives
After this training you will know:
- what data can be accessed in the Genomics England Research Environment
- the functions of the Participant Explorer, LabKey, IVA and IGV
- what APIs are available for exploring the data
- the kinds of jobs you can run on the HPC cluster and when you might use it
- how to import data into and export data from the Genomics England Research Environment using the Airlock
- how to use the documentation to learn more
Target audience
This training is aimed at researchers new to the Genomics England Research Environment.
Dates
These sessions are held on the third Tuesday of every month. You can sign up for future sessions:
Date | Details and registration |
---|---|
18th February | register |
18th March | register |
15th April | register |
20th May | register |
22nd July | register |
19th August | register |
16th September | register |
21st October | register |
18th November | register |
16th December | register |
Materials
You can access the redacted slides and video below. All sensitive data has been censored.
Slides
Video
Give us feedback on this tutorial
Q&A
21/01/2025
I have a technical question. I am trying to gain access to Genomics England but have encountered two issues. First, after completing the Governance Training and receiving my certificate, the website has not updated to reflect my completion. Despite refreshing the page and retaking the quiz multiple times, the issue persists. Additionally, I am unable to log a support ticket as my credentials are not being accepted, and I keep receiving a "wrong password" message even though I am entering the correct password.
https://research.genomicsengland.co.uk/SignIn?returnUrl=%2Fresearch-registry%2Fbrowse%2F
research-network@genomicsengland.co.uk
I have a question regarding genomic files. I have established my cohort using Participant Explorer and downloaded the table with the file paths to the participants' genomic data and their corresponding IDs. I would like to gather all the VCF files into one folder in my RE. Is there a way to do this? I have around 300 participants in my study, so I would like to use the merge command in bcftools and run my analysis on the HPC. Hope that makes sense!
You may also want to consider using the aggregates for this type of work, as the aggregation, allele frequency calculations and functional annotations have already been generated. Filtering aggV2 for your samples of interest will most likely save you a significant amount of time.
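If you do decide to merge per-participant VCFs yourself, the preparation step can be sketched in Python as below. This is a minimal sketch under stated assumptions, not an official Genomics England workflow: the table file name, the `path` column name and the tab delimiter are hypothetical and should be matched to the table you actually exported from Participant Explorer. It only writes a file list and builds a `bcftools merge` command; how you submit that command to the HPC is not covered here.

```python
import csv
from pathlib import Path


def collect_vcf_paths(table_path, path_column="path", delimiter="\t"):
    """Return the unique .vcf.gz paths listed in an exported cohort table.

    `path_column` and `delimiter` are assumptions about the export format.
    """
    paths = []
    seen = set()
    with open(table_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            value = (row.get(path_column) or "").strip()
            # Keep only compressed VCFs and skip duplicates, preserving order.
            if value.endswith(".vcf.gz") and value not in seen:
                seen.add(value)
                paths.append(value)
    return paths


def build_merge_command(vcf_paths, list_file, output_vcf, threads=4):
    """Write one VCF path per line to `list_file` and return the command."""
    Path(list_file).write_text("\n".join(vcf_paths) + "\n")
    return [
        "bcftools", "merge",
        "--file-list", str(list_file),   # read input VCFs from the list
        "--output-type", "z",            # write bgzip-compressed VCF
        "--output", str(output_vcf),
        "--threads", str(threads),
    ]
```

For example, `build_merge_command(collect_vcf_paths("cohort_files.tsv"), "vcf_list.txt", "cohort_merged.vcf.gz")` returns a list you could pass to `subprocess.run` inside a cluster job. Note that `bcftools merge` expects bgzip-compressed, indexed input files, and that the listed paths must be valid on the machine where the command runs.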
Question regarding Participant Explorer: when I copy and paste the file path of the genome data from Participant Explorer into the file system, it cannot locate the file. Am I doing something wrong?
The paths may be slightly different between the desktop and the HPC.
HPC paths are “absolute”, whereas desktop paths are relative to ${HOME}.
Please raise a service desk ticket if you are still experiencing issues.
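One way to check which form of a path works on the machine you are using is sketched below. This is an illustrative helper only: it assumes that a path copied from a table is either valid as written or valid once re-rooted under your home directory, which is a simplification of the answer above and not a documented mapping. The function names are hypothetical.

```python
from pathlib import Path


def candidate_locations(listed_path, home):
    """Return the places a listed path might live, most literal first."""
    listed = Path(listed_path)
    # Drop the leading root (if any) so the path can be re-rooted under home.
    relative = listed.relative_to(listed.anchor) if listed.is_absolute() else listed
    return [listed, Path(home) / relative]


def first_existing(listed_path, home=None):
    """Return the first candidate that exists on this machine, or None."""
    home = Path.home() if home is None else home
    for candidate in candidate_locations(listed_path, home):
        if candidate.exists():
            return candidate
    return None
```

If `first_existing` returns `None` for a path you expect to be readable, that is a reasonable point at which to raise a service desk ticket.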
Does either LabKey or Participant Explorer allow you to search by genetic variant, e.g. SNP ID?
live answered
Does the Participant Explorer help in identifying new genes in patients with a specific disease?
Some tables will have variant information, but the majority of tables will provide you with secondary data on the participant and paths to raw data files (VCFs).
So the data is pseudonymised, but anonymised to GEL?
live answered
Who sits on the Airlock committee?
The Airlock committee is made up of GEL staff with a broad range of experience and knowledge, including both bioinformaticians and policy experts.
Should there be NHS representation if NHS information is being fed in?
The information provided to the Research Environment by the NHS has been both consented and sanitised for research use prior to inclusion.
Contact with the NHS is possible if needed via the Clinical Research Interface (CRI) team.
Since I'm focusing on a specific area, such as a rare disease cohort for my research project, would it be possible to access help sessions from previous years? I am unable to wait until 10/6, for example.
https://re-docs.genomicsengland.co.uk/upcoming/ https://re-docs.genomicsengland.co.uk/rd_cohorts/
Please could you post the link for the past training sessions? Thank you!
https://re-docs.genomicsengland.co.uk/upcoming/#past-training-sessions