R, RStudio, and R libraries

R and RStudio are available within the Research Environment. You can use the latest version of R or select an earlier version. You can also install R packages from both CRAN and Bioconductor using the internal mirror.

Follow the steps below to configure your environment to install R packages.

Licensing considerations

Please note if you install libraries yourself, you will be solely and fully responsible for acquiring any licences required for the use of and access to the relevant software package. GEL expect all software to be correctly licensed by you where the self-installation route is employed. In no event shall GEL be liable to you or any third parties for any claim, damages or other liability, whether such liability arises in contract, tort (including negligence), breach of statutory duty, misrepresentation, restitution and on an indemnity basis or otherwise, arising from, out of or in connection with software self-installed by the researcher or the use or other dealings by the researcher in the software.

Any links to third party software available on this User Guide are provided “as is” without warranty of any kind, either expressed or implied, and such software is to be used at your own risk. No advice or information, whether oral or written, obtained by you from us or from this User Guide shall create any warranty in relation to the software.

Selecting a version of R to use

Default version of R

The default installation of R, including the R used by RStudio on the Desktop, is version 4.0.2. You are free to use this default, but it does not have all packages available. If you wish to use the pre-installed packages, use the approach below to manually load your preferred version of R.

Specifying another version of R

To use a specific version of R in RStudio, open the terminal app on the Desktop and enter the following commands:

module avail R/
module load R/4.0.2 #select your version here
rstudio

The first command lists all available versions of R, the second loads R 4.0.2, and the third launches RStudio using that version.

This is important, as there are different libraries available for the different versions of R. For more information on loading and installing R packages, see Installing R packages from CRAN on this page.
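As a rough sketch of how the version listing could be used in a script, the helper below picks the highest R version from a module listing. The function name `latest_r_module` is hypothetical, and it assumes module names contain `R/<numeric version>` (as in `R/4.0.2`); check the real output of `module avail R/` before relying on it.

```shell
# Hypothetical helper: read a module listing on stdin and print the
# highest "R/<version>" entry found in it.
# Assumption: versions are purely numeric, e.g. R/4.0.2.
latest_r_module() {
    grep -oE 'R/[0-9]+(\.[0-9]+)*' | sort -u -V | tail -n 1
}

# Possible usage (many module systems print "module avail" to stderr,
# hence the 2>&1):
#   module load "$(module avail R/ 2>&1 | latest_r_module)"
```

The helper prints only the `R/<version>` part, so any prefix in front of it (such as `lang/`) would need to be added back by hand.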

Configuration of R

Because the Research Environment and the HPC are closed environments, you need to perform a small number of steps to configure your R instances correctly. This configuration is required to reach resources such as our internal CRAN and Bioconductor mirrors, and to route other network traffic correctly.

Please follow the steps below to configure your R:

  1. Open the terminal application from the Desktop in the Research Environment.
  2. Type (or copy and paste) the following command into the terminal:

    cp -rf ~/gel_data_resources/example_config_files/Inuvika/. ./
    
  3. Done!

The command produces a warning message. This is expected: it means only that the timestamps of the original files were copied, because they come from a mounted file system. You will not see this warning when configuring the HPC.

Contents of the added files

For the configuration of R on the Research Environment, three files are added. The contents are displayed here for reference:

.Renviron:

no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com,.gel.zone"

.Rprofile:

myrepo = getOption("repos")
myrepo["CRAN"] = "https://artifactory.aws.gel.ac/artifactory/cran"
options(repos = myrepo, BioC_mirror = "https://artifactory.aws.gel.ac:443/artifactory/bioconductor.org")
rm(myrepo)

.netrc:

machine labkey-embassy.gel.zone
login yourusername
password yourPasswordHere
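To confirm that the copy worked, a small check along these lines may help. It is only a sketch: the function name `check_r_config` is hypothetical, and it assumes the three files are .Renviron, .Rprofile and .netrc (the last two contents shown above match R's standard startup file and a netrc file, but the file names are an assumption here).

```shell
# Hypothetical helper: report which of the expected R configuration
# files are present in a directory (default: your home directory).
# Assumption: the copied files are .Renviron, .Rprofile and .netrc.
check_r_config() {
    local dir="${1:-$HOME}"
    local missing=0
    local f
    for f in .Renviron .Rprofile .netrc; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 only if all three files were found
    return "$missing"
}

# Possible usage:
#   check_r_config            # checks $HOME
#   check_r_config /some/dir  # checks another directory
```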

If you wish to set up your R instances on the HPC, please follow the steps below.

  1. Open the terminal application from the Desktop in the Research Environment.
  2. Log in to the HPC.
  3. Type (or copy and paste) the following command into the terminal:

    cp -rf /gel_data_resources/example_config_files/Helix/. ./
    
  4. Done!

The command used for the HPC is slightly different: the path has no leading ~, and it points to a different folder (Helix rather than Inuvika). This is due to how the file systems are mounted on the HPC compared with Research Environment sessions. The .netrc file is the same in both cases, but the .Renviron file differs.

Contents of the added files

For the configuration of R on the HPC, three files are added. The contents are displayed here for reference:

.Renviron:

http_proxy=http://pfsense.int.corp.gel.ac:3128
ftp_proxy=http://pfsense.int.corp.gel.ac:3128
rsync_proxy=http://pfsense.int.corp.gel.ac:3128
https_proxy=http://pfsense.int.corp.gel.ac:3128
no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com,.gel.zone,.cluster

.Rprofile:

myrepo = getOption("repos")
myrepo["CRAN"] = "https://artifactory.aws.gel.ac/artifactory/cran"
options(repos = myrepo, BioC_mirror = "https://artifactory.aws.gel.ac:443/artifactory/bioconductor.org")
rm(myrepo)

.netrc:

machine labkey-embassy.gel.zone
login yourusername
password yourPasswordHere
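The login and password lines shown above look like template values that presumably need replacing with your own credentials. As a minimal sketch, the hypothetical helper `netrc_has_placeholders` below reports whether a netrc file still contains them. It assumes the file is named .netrc and that the template strings are exactly `yourusername` and `yourPasswordHere`.

```shell
# Hypothetical helper: succeed (exit 0) if the given netrc file still
# contains the template credentials shown in this guide.
# Assumption: the file defaults to ~/.netrc.
netrc_has_placeholders() {
    local file="${1:-$HOME/.netrc}"
    grep -qE '^(login[[:space:]]+yourusername|password[[:space:]]+yourPasswordHere)[[:space:]]*$' "$file"
}

# Possible usage:
#   if netrc_has_placeholders; then
#       echo "Edit ~/.netrc and replace the template login and password."
#   fi
```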

Installing R packages from CRAN

You can install R packages yourself within the Research Environment from CRAN as we have an internal mirror.

You can only install R packages from the Desktop environment. You cannot install R packages directly on the HPC. However, we do already have various packages pre-installed. Please see the "Loading R packages" section below.

Loading R packages

We have provided a range of R packages, which you can load with library(library_name) or by selecting them in the "Packages" tab in RStudio.

Please check these relevant FAQs for problems with packages:

If the package you are interested in is not available under https://artifactory.gel.zone/artifactory/cran/src/contrib, please raise a ticket at the Genomics England Service Desk so it can be installed.

It is your responsibility to resolve any dependencies by installing other relevant packages.

Installation from the desktop environment

If you want to install packages yourself, follow the steps outlined above in Configuration of R.

We recommend installing your packages in your /re_gecip/yourDomain/ or /re_df/yourDomain/ folders so that they will be accessible on both the Desktop and the HPC.

Afterwards, you will be able to install packages accordingly:

  1. Make a folder where you want to store your R packages, for example: ~/re_gecip/yourDomain/R_packages

  2. Install the package, specifying the installation path with lib: install.packages("<package_name>", lib="~/re_gecip/yourDomain/R_packages")

  3. Load the library, specifying the same path with lib.loc: library(<package_name>, lib.loc="~/re_gecip/yourDomain/R_packages")
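The folder creation and install steps above could be scripted from the terminal roughly as follows. This is a sketch only: `r_install_expr` is a hypothetical helper that creates the library folder and prints the matching R expression, which you would then pass to `Rscript -e`. The package name in the usage comment is just an example.

```shell
# Hypothetical helper: make sure the library folder exists, then print
# the R expression that installs a package into it.
r_install_expr() {
    local pkg="$1" libdir="$2"
    mkdir -p "$libdir" || return 1
    printf 'install.packages("%s", lib = "%s")\n' "$pkg" "$libdir"
}

# Possible usage on the Desktop (requires R to be loaded and configured):
#   Rscript -e "$(r_install_expr data.table "$HOME/re_gecip/yourDomain/R_packages")"
```

Printing the expression rather than running it keeps the helper easy to inspect before anything is installed.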

All R packages that are located on GitHub require Genomics England admins to install them. Please submit a service desk ticket if you require this.

Loading from the HPC environment

To load a pre-installed R package from the HPC environment, you can use the following command: library(<package_name>, lib.loc="/re_gecip/yourDomain/R_packages"). Notice the leading / in the HPC environment compared to ~/ in the Desktop environment.
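If you keep paths in scripts that run in both places, the ~/ versus / difference can be handled with a small translation step. The sketch below uses a hypothetical function, `desktop_to_hpc_path`, and assumes the difference applies only to the shared folders named in this guide (re_gecip, re_df and gel_data_resources); other paths are returned unchanged.

```shell
# Hypothetical helper: convert a Desktop-style path such as
# ~/re_gecip/yourDomain/R_packages to its HPC form /re_gecip/...
# Assumption: only re_gecip, re_df and gel_data_resources are remapped.
desktop_to_hpc_path() {
    local path="$1" rest
    case "$path" in
        "~/"*)     rest=${path#'~/'} ;;       # literal, unexpanded tilde
        "$HOME/"*) rest=${path#"$HOME"/} ;;   # tilde already expanded
        *)         printf '%s\n' "$path"; return 0 ;;
    esac
    case "$rest" in
        re_gecip/*|re_df/*|gel_data_resources/*) printf '/%s\n' "$rest" ;;
        *) printf '%s\n' "$path" ;;
    esac
}

# Possible usage:
#   desktop_to_hpc_path '~/re_gecip/yourDomain/R_packages'
```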

Installing and configuring packages from BioConductor

You can also install Bioconductor packages from within the Research Environment after a one-off configuration, as shown in Configuration of R. Follow the same setup as for CRAN packages by installing them to a shared folder that is also available on the HPC (such as re_gecip).

For R 3.5 or later, which uses BiocManager, open R (or RStudio) and run the following:

library("BiocManager")
BiocManager::install("<package_name>")
library(<package_name>)

For R versions earlier than 3.5, which use the legacy biocLite installer, open R (or RStudio) and run the following:

source("https://bioconductor.org/biocLite.R")
biocLite("<package_name>")
library(<package_name>)

Plotting in R on the HPC

When R is run on the HPC as a module, it will not be able to output plots due to the absence of a Graphical User Interface within the HPC. You might see errors such as: Unable to start device PNG or Unable to open connection to X11 display.

This can be solved by running R inside an X Virtual Frame Buffer. The following is an example of how to do this:

module load lang/R/<version>
xvfb-run -a R

If you do not want to type xvfb-run -a R each time, you can set up an alias in your .bashrc file that will do this for you. Add the following line to your .bashrc file:

alias R='xvfb-run -a R'
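If you prefer to add the alias from the terminal rather than in an editor, something like the following could be used. It is a sketch: `add_line_once` is a hypothetical helper that appends a line to a file only if that exact line is not already present, so running it repeatedly does not create duplicates.

```shell
# Hypothetical helper: append a line to a file unless an identical
# line is already there.
add_line_once() {
    local file="$1" line="$2"
    touch "$file" || return 1
    # -x: match whole lines, -F: treat the line as fixed text
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Possible usage:
#   add_line_once "$HOME/.bashrc" "alias R='xvfb-run -a R'"
```

The alias takes effect in new terminal sessions, or after running `source ~/.bashrc` in the current one.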

Within your R script there are a number of possible ways of saving plots. The following example uses base R:

png("plot.png")
plot(1)
dev.off()

This will create a .png file called plot.png in the current working directory with your data plotted.
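After a headless run it can be useful to confirm that the plot file really was written, for instance in a batch job. The sketch below uses a hypothetical helper, `is_png`, which checks that a file is non-empty and starts with the standard 8-byte PNG signature. It does not validate the rest of the image.

```shell
# Hypothetical helper: succeed if the file is non-empty and begins with
# the PNG signature (hex 89 50 4e 47 0d 0a 1a 0a).
is_png() {
    local file="$1" sig
    [ -s "$file" ] || return 1
    sig="$(head -c 8 "$file" | od -An -tx1 | tr -d ' \n')"
    [ "$sig" = "89504e470d0a1a0a" ]
}

# Possible usage after running your R script:
#   is_png plot.png && echo "plot.png looks like a PNG file"
```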

Alternatively, you can use the tidyverse ggplot2 package to store a plot in a variable and then write it to a file with the built-in ggsave() function. This saves the plot without needing to display it, as in the following example for use within your R script:

library(ggplot2)
fig_1 <- ggplot(cars, aes(x = speed, y = dist)) + geom_point()
ggsave(filename = "fig_1.<extension>", plot = fig_1, type = "cairo")

Note that tidyverse and base R plotting can at times interfere with each other, so we recommend selecting one strategy for your plotting needs.

If you always load the same version of R, you can create a bash function that gives you this functionality with a single command:

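# Run an R script headlessly inside an X Virtual Frame Buffer
function r_headless(){
    module purge                  # start from a clean module environment
    module load lang/R/<version>
    xvfb-run -a Rscript "$@"      # pass the script and any arguments through
}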

The function can then be called on an R script as follows:

r_headless script.R

We would recommend that you save the function to your .bashrc file so that the function is globally available.
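If you switch between R versions, a variant that takes the version as an argument may be more convenient. The following is a sketch rather than a tested recipe for this environment: `r_headless_version` and the `DRY_RUN` variable are hypothetical names, and the `lang/R/<version>` module naming is taken from the examples above.

```shell
# Hypothetical variant of r_headless: the R version is the first
# argument, the script the second, and anything else is passed on.
# Set DRY_RUN=1 to print the commands instead of running them.
r_headless_version() {
    local version="$1" script="$2"
    if [ -z "$version" ] || [ -z "$script" ]; then
        echo "usage: r_headless_version <version> <script.R> [args...]" >&2
        return 2
    fi
    shift 2
    local run=""
    if [ -n "${DRY_RUN:-}" ]; then run="echo"; fi
    $run module purge
    $run module load "lang/R/$version"
    $run xvfb-run -a Rscript "$script" "$@"
}

# Possible usage:
#   r_headless_version 4.0.2 script.R
#   DRY_RUN=1 r_headless_version 4.0.2 script.R   # show what would run
```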


Last update: November 17, 2023