Oxford Nanopore long-read sequencing pilot

This page describes an early pilot of long-read Oxford Nanopore data made available in the Research Environment (RE). Genomics England, in partnership with the Sanger Institute, has been assessing the advantages of long-read sequencing technologies over short-read whole genome sequencing. The dataset consists of human genomes from a subset of 100,000 Genomes Project participants, sequenced with Oxford Nanopore Technologies (ONT) long reads.

Sequencing protocol

Germline DNA from a subset of 100,000 Genomes Project participants was depleted of low molecular weight DNA (<10 kb) before library preparation. Libraries for ONT sequencing were prepared with the protocol indicated in the library_prep field of the lrs_laboratory_sample table in LabKey. Data were acquired on the PromethION Beta for 42-60 hours. Full details of the protocol can be found in v1_protocol_ONT_LSK109.pdf, and additional information about the samples and the sequencing is available in the lrs_laboratory_sample LabKey table.

Data processing

Base calling was performed on the sequencer using Guppy (versions 3.05-3.2.10) in high accuracy mode. FASTQ files were aligned to GCA_000001405.15_GRCh38_no_alt_analysis_set.fa (available in the RE as /public_data_resources/reference/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna) with minimap2 (version 2.17). FAST5 (raw output from the ONT sequencer, in HDF5 format), FASTQ and BAM files are provided, and their locations can be found in the lrs_sequencing_data LabKey table.
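
For users who want to realign the FASTQ files themselves, the alignment step can be sketched as below. This is a minimal illustration, not the exact command used in the pilot: the source does not state which minimap2 options were used, so the map-ont preset, SAM output (-a) and the thread count are assumptions, and the function name and input file name are hypothetical. Only the reference path is taken from the text above.

```python
import shlex

# Reference path as given in the Data processing section.
REFERENCE = (
    "/public_data_resources/reference/GRCh38/"
    "GCA_000001405.15_GRCh38_no_alt_analysis_set.fna"
)


def build_minimap2_command(fastq, reference=REFERENCE, threads=8):
    """Return an argument list for aligning ONT reads with minimap2.

    Assumptions (not stated in the source): the map-ont preset,
    SAM output (-a) and the number of threads.
    """
    return [
        "minimap2",
        "-a",               # write alignments in SAM format to stdout
        "-x", "map-ont",    # preset for Oxford Nanopore reads (assumed)
        "-t", str(threads), # number of threads (assumed)
        str(reference),
        str(fastq),
    ]


if __name__ == "__main__":
    # "sample.fastq.gz" is a hypothetical file name for illustration.
    cmd = build_minimap2_command("sample.fastq.gz")
    print(shlex.join(cmd))
```

The function only builds the command, so it can be inspected before anything is run. minimap2 writes SAM to standard output, so in practice the output would typically be redirected to a file or piped to a tool such as samtools to sort and convert it to BAM.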