Getting started with the Data discovery portal

Log on to the Research Environment Desktop with your existing user ID and password, then:

  1. Click the Data Discovery icon.

  2. The Kibana Welcome and Login screen appears.

  3. Enter your login credentials. The Data Discovery Dashboard menu appears.

  4. Select the dashboard of your choice.

100kGP Cohort Overview dashboard (example screenshots)

100kGP Cohort Rare disease dashboard (example screenshots)

100kGP Cohort Cancer dashboard (example screenshots)

100kGP Cohort All participants (example screenshots)

Dashboard screen overview

Dashboard screens: collapsing the side-bar

Introduction to drop-down filter controls

The drop-down filters described below sit on the left-hand side of the 100kGP cohort dashboards (except the 100kGP Overview dashboard). For each filter, the tables give a description, its data source, and when you are likely to use it.

Note that all the visualisations are data driven, so the options you select in a drop-down filter change the corresponding charts and graphs.
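
To illustrate how a filter selection feeds through to the charts, here is a minimal Python sketch. It is not how the portal is implemented: the participant records, field names (`genome_build`, `stated_gender`) and helper functions are hypothetical and only mimic the behaviour described above.

```python
from collections import Counter

# Hypothetical participant records; field names and values are illustrative only.
participants = [
    {"genome_build": "GRCh38", "stated_gender": "Female"},
    {"genome_build": "GRCh37", "stated_gender": "Male"},
    {"genome_build": "GRCh38", "stated_gender": "Male"},
    {"genome_build": "GRCh38", "stated_gender": "Female"},
]


def apply_filters(records, selections):
    """Keep records whose value for every filtered field is among the selected options."""
    return [
        record
        for record in records
        if all(record[field] in options for field, options in selections.items())
    ]


def chart_counts(records, field):
    """Count records per category, as a pie or bar chart would display them."""
    return Counter(record[field] for record in records)


# No filter selected: the chart reflects every participant.
unfiltered = chart_counts(participants, "stated_gender")

# Selecting GRCh38 in a (hypothetical) Genome build filter narrows the cohort,
# so the same chart is redrawn from fewer records.
selected = apply_filters(participants, {"genome_build": {"GRCh38"}})
filtered = chart_counts(selected, "stated_gender")

print(unfiltered)  # Counter({'Female': 2, 'Male': 2})
print(filtered)    # Counter({'Female': 2, 'Male': 1})
```

Selecting options in several filters at once corresponds to passing more than one field in `selections`; a record must then satisfy every filter to stay in the cohort.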

Partial list of drop-down filters available

Description of drop-down filter controls common to cancer and rare disease - 100kGP Cohort

| Filter | Description | Source | When to use |
| --- | --- | --- | --- |
| Genome build | There are currently two genome reference builds in this drop-down filter: GRCh37 and GRCh38. Note that some participants have genome sequences in both builds. | Bioinformatics Pipeline. | You want to find out which reference build your participants of interest have had their genomes aligned against. |
| Stated Gender | This drop-down filter has several options, for example Male, Female, Not Known and Not Specified. | GMC Recruited - Primary Data. | You want to find out the gender of participants. |
| Stated Ethnic Category | This drop-down filter allows you to select the ethnicity of participants. Note that ethnicity is recorded as stated by the participant. | GMC Recruited - Primary Data. | For example, you want to find out whether a rare disease is confined to specific ethnicities. |
| Life Status | This drop-down filter has two options: deceased and not reported. | Combination of recruited GMC data and secondary data from NHSE. | You want to find out treatment outcomes for participants. |
| GMC Trust | This drop-down filter allows you to select from the 13 Genomic Medicine Centres across England. | GMC Recruited - Primary Data. | You want a high-level view of the geographic distribution of participants (includes Scotland, NI and Wales). |
| Current age range | This drop-down filter lists the age ranges you can select. | Derived from GMC Recruited date of birth. | You want to find the current age range of participants in your cohort of interest. |
| Current age | This drop-down filter lists participants by age. | GMC Recruited - Primary Data (derived from date of birth and date of Main Programme Data Release). | You want to find the current age of participants in your cohort of interest. |
| Diagnosis (type to search) | Select participants with a specific diagnosis. | Secondary data from NHSE. | You want to build a cohort of participants who have a specific diagnosis. Note that the search field is case sensitive. See the example in section 3.4.7 below. |
| Procedure (type to search) | Select participants who have undergone a specific procedure. | Secondary data from NHSE. | You want to build a cohort of participants who have undergone a specific procedure. Note that the search field is case sensitive. See the example below. |
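
The Current age filter is described as derived from date of birth and the date of the Main Programme Data Release. The sketch below shows one way such a derivation could work. The completed-years rule, the ten-year bands in `age_range` and the example dates are all assumptions made for illustration; the portal's actual calculation and age bands are not documented here.

```python
from datetime import date


def age_in_years(date_of_birth: date, reference_date: date) -> int:
    """Completed years between date_of_birth and reference_date (assumed rule)."""
    years = reference_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def age_range(age: int, width: int = 10) -> str:
    """Place an age in a fixed-width band, e.g. 42 -> '40-49' (hypothetical banding)."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Hypothetical values: neither date comes from the portal.
release_date = date(2023, 1, 1)
date_of_birth = date(1980, 6, 15)

current_age = age_in_years(date_of_birth, release_date)
print(current_age)             # 42
print(age_range(current_age))  # 40-49
```

Because the reference point is a data release date rather than today's date, an age computed this way stays fixed until the next release.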

Rare disease drop-down filter controls - 100kGP Cohort

| Filter | Description | Source | When to use |
| --- | --- | --- | --- |
| Disease Group | Top-level disease grouping for participants recruited to the rare disease arm of the project. | GMC Recruited - Primary Data. | You want to find out the number of participants in a top-level disease grouping. |
| Disease Sub Group | Breakdown of the top-level disease grouping for participants recruited to the rare disease arm of the project. | GMC Recruited - Primary Data. | You need a more granular breakdown of the top-level disease grouping. |
| Specific Disease | This drop-down filter contains the lowest-level breakdown of diseases from the disease sub group for participants recruited to the rare disease arm of the project. | GMC Recruited - Primary Data. | You want to find out the number of participants affected by one or more specific diseases. |
| Proband/Relative | This drop-down filter selects participants recruited to the rare disease arm of the project who are probands or relatives of probands. | GMC Recruited - Primary Data. | You want to identify participants who are probands or relatives. |
| Affected Status | There are two options available for this drop-down filter: Affected and Unaffected. | GMC Recruited - Primary Data. | You want to identify members of a family who are either affected or not affected by the rare disease for which the participant was recruited. |
| Family Group Type | This drop-down filter describes the family setup, e.g. Trio with mother and father, Duo with mother or father, Singleton. | GMC Recruited - Primary Data. | You want to gain some insight into family history and relationships with the affected individual. |
| HPO (type to search) | Select HPO code(s) describing phenotypic abnormalities encountered in human disease. | GMC Recruited - Primary Data. | You are interested in participants with selected HPO code(s). Note that the search field is case sensitive. See the HPO search example below the table. |
| Gene (type to search) | Select participants who have a tiered variant in a gene of interest. | Genomics England Bioinformatics Pipeline. | You want to build a cohort of participants who have a tiered variant in your gene of interest. Note that the search field is case sensitive. See the HPO search example below the table. |

HPO search example: you only need to type the first few characters of either the disease or the HPO code, and a drop-down list of results appears. Note that the search field is case sensitive. If your search term starts with a lower-case character, the term appears within the search results, e.g. type ‘diabetes’ in the search field HPO (type to search). If your search term starts with an upper-case character, the term appears at the start of the line within the search results, e.g. type ‘Diabetes’ in the search field HPO (type to search).
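
One way to read the case-sensitivity note for the type-to-search filters: if matching is a plain case-sensitive substring test, then ‘diabetes’ only matches labels where the word appears mid-line, while ‘Diabetes’ only matches labels that begin with the capitalised word. The sketch below demonstrates that reading. The matching rule and the `search` helper are assumptions, and the term labels are a small illustrative list, not the portal's data.

```python
def search(terms, query):
    """Return terms containing query, using a case-sensitive substring match (assumed rule)."""
    return [term for term in terms if query in term]


# Illustrative phenotype labels; the portal's real list is much longer.
hpo_terms = [
    "Diabetes mellitus",
    "Diabetes insipidus",
    "Type II diabetes mellitus",
    "Maternal diabetes",
]

lower_case_hits = search(hpo_terms, "diabetes")
upper_case_hits = search(hpo_terms, "Diabetes")

print(lower_case_hits)  # ['Type II diabetes mellitus', 'Maternal diabetes']
print(upper_case_hits)  # ['Diabetes mellitus', 'Diabetes insipidus']
```

Under this reading, running a search twice, once with each capitalisation, is a simple way to avoid missing relevant terms.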

Cancer drop-down filter controls - 100kGP Cohort

| Filter | Description | Source | When to use |
| --- | --- | --- | --- |
| Genome Quality | Individuals with quality-passed interpreted genomes. Selecting this filter on the Cancer dashboard selects only participants who have a genome sequenced on build GRCh38 that has been through the Genomics England Bioinformatics Interpretation pipeline and has passed quality checks. Note that some of these participants may also have genomes sequenced on build GRCh37; these genome builds also show in the Genome Build pie chart on the dashboard. | Bioinformatics Pipeline. | You want to select participants whose genomes have been interpreted and have passed quality checks. |
| Cancer Primary Site | This drop-down filter allows you to select one or more primary cancer sites. | GMC Recruited - Primary Data. | You want to select participants affected by cancer at one or more primary sites. |
| Cancer Sub-type | This drop-down filter allows you to select one or more cancer sub-types. The Cancer sub-type value is a concatenation of the cancer primary site and the cancer sub-type, e.g. Lung_Adenocarcinoma. | GMC Recruited - Primary Data. | You want to select participants affected by one or more cancer sub-types. |
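
The Cancer sub-type value is described as a concatenation of the cancer primary site and the cancer sub-type, with Lung_Adenocarcinoma as the example. A minimal sketch of building and splitting such a value follows. The underscore separator and the primary-site-first order are inferred from that single example, and both helper functions are hypothetical.

```python
def cancer_sub_type_value(primary_site: str, sub_type: str) -> str:
    """Join primary site and sub-type with an underscore (separator inferred from the example)."""
    return f"{primary_site}_{sub_type}"


def split_cancer_sub_type_value(value: str) -> tuple[str, str]:
    """Split at the first underscore; assumes the primary site itself contains no underscore."""
    primary_site, _, sub_type = value.partition("_")
    return primary_site, sub_type


value = cancer_sub_type_value("Lung", "Adenocarcinoma")
print(value)                               # Lung_Adenocarcinoma
print(split_cancer_sub_type_value(value))  # ('Lung', 'Adenocarcinoma')
```

Because the primary site is embedded in the value, the same sub-type name can appear under different primary sites as distinct filter options.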

Last update: November 20, 2023