Welcome to the Genomics England Research Environment (RE) documentation!
The RE is a virtual desktop, accessed through Amazon WorkSpaces, an Amazon Web Services (AWS) service, where you can access and analyse Genomics England data. This getting started section of the documentation has a few guidelines and suggestions to help you get to grips with our environment. Have a look through the menu on the left to learn more before you get started.
What is the Genomics England Research Environment?
The RE is a virtual computer that you access through AWS. It contains all the data for the projects managed by Genomics England and all the tools you can use to access and analyse the data. Data cannot be exported from the RE; you must carry out all your analyses inside the RE and export only the results.
You can view your domain’s shared folders in the Home directory, which acts as a file explorer within the Research Environment.
Starting your research project
You will need to register your project on the Research Registry. You will only be allowed to export data that is associated with a registered project, and that project must have been registered for at least 90 days before you can export anything. Therefore, we recommend registering your project as early as possible.
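As a rough illustration of the 90-day rule, the sketch below works out the earliest date on which a project could export. It assumes the wait is counted in calendar days from the registration date; the function names and the example date are hypothetical and are not part of any Genomics England tool.

```python
from datetime import date, timedelta

# Assumption: the 90-day wait is counted in calendar days from registration.
MIN_REGISTRATION_DAYS = 90


def earliest_export_date(registered_on: date) -> date:
    """Return the first date on which the project has been registered for 90 days."""
    return registered_on + timedelta(days=MIN_REGISTRATION_DAYS)


def can_export(registered_on: date, today: date) -> bool:
    """Return True once the project has been registered for at least 90 days."""
    return today >= earliest_export_date(registered_on)


# Hypothetical project registered on 10 January 2024.
print(earliest_export_date(date(2024, 1, 10)))  # 2024-04-09
```

A project registered at the start of an analysis is therefore about three months away from its first possible export, which is why early registration matters.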
We have a section of task-based documentation, which gives you links to the documentation you might want to read in order to complete the most common tasks.
Data within the RE
The RE contains all the data from Genomics England. The data files are held on a read-only file system that is mounted on both the virtual desktop and the HPC.
You can only export summary data for inclusion within dissertations, theses, reports or publications. All exports must be approved by Genomics England through the Airlock application.
You can only use data from consented participants for your research. As participants have the right to withdraw consent at any time, it is your responsibility to ensure that you are using the most up-to-date participant list. You must always start new work on the most recent data release; you will then be authorised to continue using that release for the length of your project. Your Airlock export requests may be denied if you use data from participants who have withdrawn their consent.
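One way to check a cohort against the current participant list is sketched below. It assumes you already have both your cohort and the participant list from the most recent data release as collections of participant IDs; how you obtain those lists is not shown, and the function name and IDs are made up for illustration.

```python
def split_by_consent(cohort_ids, current_release_ids):
    """Split a cohort into IDs present in the current release and IDs that are not.

    Both arguments are assumed to be iterables of participant IDs.
    """
    current = set(current_release_ids)  # set gives fast membership checks
    kept = [pid for pid in cohort_ids if pid in current]
    dropped = [pid for pid in cohort_ids if pid not in current]
    return kept, dropped


# Hypothetical IDs: "P003" is missing from the current release list.
cohort = ["P001", "P002", "P003"]
current_release = ["P001", "P002", "P004"]

kept, dropped = split_by_consent(cohort, current_release)
print(kept)     # ['P001', 'P002']
print(dropped)  # ['P003']
```

Any IDs in the second list would need to be removed from your analysis before you request an export.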
Structure of the RE
The RE consists of three broad sections: the mounted file system, which contains the genomic data; the virtual desktop; and the high-performance computing cluster (HPC).
We encourage you to carry out exploratory and development work using the applications available on the virtual desktop, and then run the full analysis on the HPC, which is accessed via the terminal emulator.