Data in Participant Explorer

Data sources

Participant data

Participant clinical data comes from the most recent 100kGP data release, version 18 (21st December 2023), which can be viewed in LabKey. Some elements in the Participant Explorer UI provide deep links into the source data in LabKey.

The source data are imported into a Postgres database, and partially mapped to a standard data model (HL7 FHIR) using SQL. The Participant Explorer UI operates on top of the FHIR model, but hides the technical details of FHIR (such as element names and extension URLs) to create a user-friendly, intuitive interface. A detailed overview of the mapping from source tables and columns to elements in the UI is given below.

Reference data

Terminology reference data is provided by a FHIR terminology server, developed by the AEHRC. For details of the terminology server instance we are using, see the Terminology Server page.

Conceptual data model

The diagram below depicts the clinical data model used for representing participant data in the Participant Explorer. This model is based on the HL7 FHIR resource model.

| FHIR Resource Type | Definition (from FHIR) | Participant Explorer Label | Meaning in terms of the 100kGP dataset |
| --- | --- | --- | --- |
| Patient | Demographics and other administrative information about an individual receiving care or other health-related services. | Participant | All consenting participants including probands and their relatives. |
| Condition | A clinical condition, problem, diagnosis, or other event, situation, issue, or clinical concept that has risen to a level of concern. | Condition | Diagnoses from primary clinical data (including the recruited disease) and secondary data. |
| Observation | Measurements and simple assertions made about a patient. | Observation | Phenotypic observations (HPO terms), tumour morphology and stage observations. |
| Procedure | An action that is or was performed on or for a patient. This can be a physical intervention like an operation, or less invasive like long term services, counselling, or hypnotherapy. | Procedure | Procedure and operation codes from primary and secondary data. |
| Encounter | An interaction between a patient and healthcare provider(s) for the purpose of providing healthcare service(s) or assessing the health status of a patient. | Encounter | Grouping of conditions, observations and procedures by visit/event. |
| FamilyMemberHistory | Significant health conditions for a person related to the patient relevant in the context of care for the patient. | Family Member | Key details and affected-status of family members of rare disease probands. (Note: family members who are also participants will also have a patient/participant record.) |
| DiagnosticReport | The findings and interpretation of diagnostic tests performed on patients, groups of patients, devices, and locations, and/or specimens derived from these. | Genome Sequence Report<br>Family Case Report | Sequencing report meta information<br>GMC exit questionnaire (case status and additional comments) |
| MedicationAdministration | Describes the event of a patient consuming or otherwise being administered a medication. | Drugs, Drug Group | SACT (chemotherapy) drug administrations. |

Code systems overview - using the right codes

The following table can help you select the code systems to use when searching by clinical concept, depending on your area of interest and scope. It also indicates whether the "include mapped concepts" feature may be of use. Below the table are examples involving each code system.

| Code System Short Name | Description | Primary Clinical Data (cancer/rare diseases programme specific) | Secondary Data (longitudinal data for all participants) | Concept Maps Available? |
| --- | --- | --- | --- | --- |
| Genomics England Rare Disease | Rare disease groups, subgroups and specific diseases for which participants were recruited in the Genomics England 100,000 Genomes project | Rare disease groups, subgroups and specific diseases | N/A | No |
| Genomics England Cancer Type | Cancer disease types for which participants were recruited in the Genomics England 100,000 Genomes project | Cancer disease types | N/A | No |
| Genomics England Cancer Subtype | Cancer disease subtypes for which participants were recruited in the Genomics England 100,000 Genomes project | Cancer disease subtypes | N/A | No |
| ICD10 | ICD-10, WHO International Classification of Diseases | Cancer diagnoses | NHS inpatient/outpatient hospital diagnoses<br>NHS mental health services diagnoses<br>ONS causes of death<br>NCRAS radiotherapy diagnoses<br>NCRAS chemotherapy diagnoses | Yes, SNOMED to ICD10 |
| HPO | Human Phenotype Ontology | Observed phenotypes (rare disease programme) | N/A | Yes, SNOMED to HPO |
| ICDO | ICD-O-3, WHO International Classification of Diseases for Oncology | Tumour morphology and topography | NCRAS chemotherapy tumour morphology | No |
| OPCS | OPCS-4 Classification of Interventions and Procedures | N/A | Cancer imaging (body site)<br>Cancer surgery procedures<br>NHS inpatient/outpatient hospital operations<br>NCRAS radiotherapy procedures and body site<br>NCRAS chemotherapy procedures | Yes, SNOMED to OPCS |
| SNOMED | SNOMED CT (UK Edition) | Cancer tumour morphology and topography<br>Imaging procedures (rare disease and cancer) | NHS emergency care<br>NHS imaging procedures and body site | Yes, SNOMED to ICD10, OPCS and HPO |

Examples

Rare Diseases | Intellectual Disability: selects rare disease participants who were recruited for intellectual disability (including relatives).

Cancer Type | Lung: selects cancer participants who were recruited for lung cancer.

HPO | HP:0012622: Chronic kidney disease: selects rare disease participants with an observed phenotype of chronic kidney disease (including relatives).

ICD10 | C50: Malignant neoplasm of breast: selects any participant with a diagnosis of breast cancer in their medical history (including rare disease probands and relatives, as well as cancer participants recruited for other cancer types).

OPCS | J01: Transplantation of liver: selects any participant with a liver transplantation record in their medical history.

SCT | 241620005: Cardiac MRI: selects any participant with a record of a cardiac MRI in their GEL data or general medical history.

SCT | 38341003: Hypertensive disorder (with mapped concepts DISABLED): selects no participants, because SNOMED codes are only available in our data sets for cancer diseases and imaging procedures.

SCT | 38341003: Hypertensive disorder (with mapped concepts ENABLED): adds equivalent ICD10 and HPO codes for hypertensive disorder to the search criteria, and consequently selects any participant with a record of hypertension in their medical history plus rare disease participants with a hypertension phenotype. Disclaimer: the concept maps underlying this feature are incomplete and can be inaccurate. Please review the included mapped concepts carefully when using this feature.

ICDO | 80109: Carcinomatosis plus SCT | 307593001: Carcinomatosis: selects cancer participants with this tumour morphology plus any participant with a record of chemotherapy for this tumour morphology in their medical history (including rare disease probands and relatives, as well as cancer participants recruited for other cancer types). Because tumour morphology/topography may be coded with ICD-O or SNOMED, it is advised to include both code systems when searching for morphology or topography.

Mapping of main programme data

For detailed information on 100kGP source tables and columns, please refer to the data dictionary of the 100kGP Data Release.

Participant

| Source Table | Source Column | Participant Explorer | Notes |
| --- | --- | --- | --- |
| participant | participant_id | Participant ID | |
| participant | year_of_birth | Year of Birth | |
| participant | participant_phenotypic_sex | Phenotypic Sex | |
| participant | participant_type | Proband/Relative | |
| participant | programme | Programme | |
| participant | normalised_consent_form | Consent Form | |
| participant | participant_ethnic_category | Ethnic Category | |
| participant | rare_diseases_family_id | Family ID | |
| mortality | event_date | Life Status | |
| death_details | death_date | Life Status | If different from mortality, the value from mortality is used |
| cancer_participant_disease | cancer_disease_type<br>cancer_disease_sub_type | Recruited Disease | |
| rare_diseases_participant_disease | normalised_specific_disease | Recruited Disease | |
| rare_diseases_family | family_group_type | Family Group Type | |
| sequencing_report | genome_build | Genome Build | |

Genome sequence report

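| Source Table | Source Column | Participant Explorer | Notes |
| --- | --- | --- | --- |
| sequencing_report | delivery_id | Delivery ID | |
| sequencing_report | plate_key | Plate Key | |
| sequencing_report | type | Type | |
| sequencing_report | delivery_version | Delivery Version | |
| sequencing_report | genome_build | Genome Build | |
| sequencing_report | delivery_date | Delivery Date | |
| sequencing_report | path | Path | |
| clinic_sample | clinic_sample_datetime | Sample Date | sequencing_report is linked to clinic_sample via the lab_sample_id and clinic_sample_sk in the laboratory_sample table |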

Rare disease family case report

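| Source Table | Source Column | Participant Explorer | Notes |
| --- | --- | --- | --- |
| gmc_exit_questionnaire | interpretation_request_id | Interpretation Request ID | |
| gmc_exit_questionnaire | case_solved_family | Rare Disease Family Case Solved | |
| gmc_exit_questionnaire | additional_comments | Additional Comments for Family | |
| gmc_exit_questionnaire | event_date | Position of "Family case report" on the timeline | |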

Family member

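| Source Table | Source Column | Participant Explorer | Notes |
| --- | --- | --- | --- |
| rare_diseases_pedigree_member | father_id, mother_id | Relationship to Proband | First-degree relationships are derived from father_id and mother_id. Others are displayed as "Family Member". |
| rare_diseases_pedigree_member | phenotypic_sex, father_id, mother_id | Sex | |
| rare_diseases_pedigree_member | affection_status | Affection Status | |
| rare_diseases_pedigree_member | rare_diseases_pedigree_member_id | Pedigree Member ID | |
| rare_diseases_pedigree_member | family_medical_review_date | Medical Review Date | |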
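
As a rough illustration of how first-degree relationships could be derived from father_id and mother_id, the Python sketch below classifies one pedigree member relative to the proband. It is not the Participant Explorer implementation: the `id` key, the function name and the labels "Father", "Mother", "Child" and "Full Sibling" are assumptions made for this example. Only the "Family Member" fallback label comes from the table above.

```python
from typing import Dict, Optional

# Hypothetical record shape: {"id": ..., "father_id": ..., "mother_id": ...}
Member = Dict[str, Optional[str]]


def relationship_to_proband(member: Member, proband: Member) -> str:
    """Classify a pedigree member relative to the proband (illustrative labels)."""
    member_id = member["id"]
    if member_id == proband["id"]:
        return "Proband"
    # The member is one of the proband's parents.
    if member_id == proband.get("father_id"):
        return "Father"
    if member_id == proband.get("mother_id"):
        return "Mother"
    # The proband is one of the member's parents.
    if proband["id"] in (member.get("father_id"), member.get("mother_id")):
        return "Child"
    # Both parents are known and shared: full sibling.
    same_father = (
        member.get("father_id") is not None
        and member.get("father_id") == proband.get("father_id")
    )
    same_mother = (
        member.get("mother_id") is not None
        and member.get("mother_id") == proband.get("mother_id")
    )
    if same_father and same_mother:
        return "Full Sibling"
    # Anything else is not first-degree.
    return "Family Member"
```

In this sketch, a grandparent or an aunt falls through every first-degree check and is reported as "Family Member".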

Condition

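| Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
| --- | --- | --- | --- | --- |
| av_tumour | site_icd10_o2 | Code | ICD-10 | Normalised¹ |
| av_tumour | site_icd10_o2 | Source Code | | |
| av_tumour | stage_best | Stage Best | | |
| av_tumour | stage_best_system | Stage Best System | | |
| av_tumour | figo | FIGO | | |
| av_tumour | dukes | Dukes | | |
| av_tumour | t_best | T Stage | | |
| av_tumour | n_best | N Stage | | |
| av_tumour | m_best | M Stage | | |
| cancer_invest_sample_pathology | primary_diagnosis_icd_code | Code | ICD-10 | Normalised¹ |
| cancer_invest_sample_pathology | primary_diagnosis_icd_code | Source Code | | |
| cancer_invest_sample_pathology | topography_snomed_ct_code | Body Site Code | SNOMED CT | |
| cancer_participant_tumour | diagnosis_icd_code | Code | ICD-10 | Normalised¹ |
| cancer_participant_tumour | diagnosis_icd_code | Source Code | | |
| cancer_participant_tumour | integrated_tnm_stage_grouping | TNM Stage Group | | |
| cancer_participant_tumour | ajcc_stage | TNM Stage Group | | |
| cancer_participant_tumour | final_figo | FIGO | | |
| cancer_participant_tumour | modified_dukes_stage | Dukes | | |
| cancer_participant_tumour | component_tnm_t | T Stage | | |
| cancer_participant_tumour | component_tnm_n | N Stage | | |
| cancer_participant_tumour | component_tnm_m | M Stage | | |
| cancer_participant_disease | cancer_disease_type<br>cancer_disease_sub_type | Code | Genomics England | |
| cancer_register_nhsd | cancer_site | Code | ICD-10 | Normalised¹ |
| cancer_register_nhsd | cancer_site | Source Code | | |
| ecds | diagnosis_code_1 - diagnosis_code_12 | Code | SNOMED CT | if diagnosis_qualifier_n != '415684004' ("suspected") |
| hes_apc | diag_01 - diag_20 | Code | ICD-10 | Normalised¹ |
| hes_apc | diag_01 - diag_20 | Source Code | | |
| hes_op | diag_01 - diag_12 | Code | ICD-10 | Normalised¹ |
| hes_op | diag_01 - diag_12 | Source Code | | |
| mhmd_v4_event<br>mhldds_event | ic_eve_primarydiagnosis<br>ic_eve_secondarydiagnosis | Code | ICD-10<br>ICD-10 | Normalised¹ |
| mhmd_v4_event<br>mhldds_event | mhd_primarydiagnosis<br>mhd_secondarydiagnosis | Source Code | | |
| mhsds_medical_history_previous_diagnosis | prevdiag<br>diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹<br>If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
| mhsds_medical_history_previous_diagnosis | prevdiag<br>diagschemeinuse | Source Code | | |
| mhsds_provisional_diagnosis | provdiag<br>diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹<br>If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
| mhsds_provisional_diagnosis | provdiag<br>diagschemeinuse | Source Code | | |
| mhsds_primary_diagnosis | primdiag<br>diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹<br>If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
| mhsds_primary_diagnosis | primdiag<br>diagschemeinuse | Source Code | | |
| mhsds_secondary_diagnosis | secdiag<br>diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹<br>If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
| mhsds_secondary_diagnosis | secdiag<br>diagschemeinuse | Source Code | | |
| mhsds_care_activity<br>mhsds_indirect_activity | codefind<br>findschemeinuse | Code | ICD-10 / SNOMED | If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
| mhsds_care_activity<br>mhsds_indirect_activity | codefind<br>findschemeinuse | Source Code | | |
| rare_diseases_participant_disease | normalised_disease_group<br>normalised_disease_sub_group<br>normalised_specific_disease | Code | Genomics England | |
| mortality | icd10_underlying_cause | Code | ICD-10 | Normalised¹ |
| mortality | icd10_underlying_cause | Source Code | | |
| mortality | icd10_multiple_cause_code_1 ... 15 | Code | ICD-10 | Normalised¹ |
| mortality | icd10_multiple_cause_code_1 ... 15 | Source Code | | |
| rtds | radiotherapydiagnosisicd | Code | ICD-10 | Normalised¹ |
| rtds | radiotherapydiagnosisicd | Source Code | | |
| sact | primary_diagnosis | Code | ICD-10 | Normalised¹ |
| sact | primary_diagnosis | Source Code | | |
| sact | sact_stage_at_start | TNM Stage Group | | |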
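
The ecds rule in the table (only keep a diagnosis code when its qualifier is not '415684004', "suspected") can be sketched in Python as follows. This is an illustration rather than the actual SQL mapping. It assumes the qualifier columns are named diagnosis_qualifier_1 to diagnosis_qualifier_12, matching the diagnosis_code_n columns, and that a missing qualifier does not exclude a code. Neither assumption is confirmed by the source.

```python
from typing import Dict, List, Optional

SUSPECTED = "415684004"  # SNOMED CT qualifier quoted in the table as "suspected"


def confirmed_ecds_diagnoses(row: Dict[str, Optional[str]]) -> List[str]:
    """Return the diagnosis codes of one ecds row that are not flagged as suspected."""
    codes = []
    for n in range(1, 13):  # diagnosis_code_1 .. diagnosis_code_12
        code = row.get(f"diagnosis_code_{n}")
        # Assumed column name for the matching qualifier.
        qualifier = row.get(f"diagnosis_qualifier_{n}")
        if code and qualifier != SUSPECTED:
            codes.append(code)
    return codes
```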

Observation

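| Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
| --- | --- | --- | --- | --- |
| av_tumour | histology_coded | Code | ICD-O-3 | |
| av_tumour | histology_coded_desc | Description | ICD-O-3 | |
| av_tumour | site_coded | Body Site Code | ICD-O-3 | if coding_system_desc starts with "ICD-O-3" |
| av_tumour | site_coded_desc | Body Site Description | ICD-O-3 | |
| av_tumour | stage_best | Code | STAGE | |
| av_tumour | figo | Code | STAGE | |
| av_tumour | dukes | Code | STAGE | |
| av_tumour | t_best<br>n_best<br>m_best | Code | TNM STAGE | Concatenated |
| cancer_analysis | histology_coded | Code | ICD-O-3 | Removing the / character for technical reasons |
| cancer_analysis | histology_coded | Source Code | | |
| cancer_participant_tumour | morphology_snomed_ct_code<br>morphology_icd_code | Code | SNOMED CT<br>ICD-O-3 | |
| cancer_participant_tumour | topography_snomed_ct_code<br>topography_snomed_code, topography_snomed_version<br>topography_icd_code | Body Site Code | SNOMED CT<br>SNOMED CT<br>ICD-O-3 | |
| cancer_participant_tumour | integrated_tnm_stage_grouping | Code | STAGE | |
| cancer_participant_tumour | ajcc_stage | Code | STAGE | |
| cancer_participant_tumour | final_figo | Code | STAGE | |
| cancer_participant_tumour | modified_dukes_stage | Code | STAGE | |
| cancer_participant_tumour | component_tnm_t<br>component_tnm_n<br>component_tnm_m | Code | TNM STAGE | Concatenated |
| cancer_register_nhsd | cancer_type<br>cancer_behaviour | Code | ICD-O-3 | Concatenated |
| rare_diseases_participant_phenotype | hpo_id | Code | HPO | filter hpo_present = true |
| mhsds_care_activity | codeobs<br>obsschemeinuse | Code | SNOMED | if obsschemeinuse = 3 |
| sact | morphology_clean | Code | ICD-O-3 | |
| sact | sact_stage_at_start | Code | STAGE | |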
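
The note on cancer_analysis.histology_coded says the "/" character is removed. A minimal Python sketch of that step is shown below. The function name is hypothetical, and trimming surrounding whitespace is an added assumption.

```python
def strip_morphology_slash(histology_coded: str) -> str:
    """Drop the '/' between the histology and behaviour parts of an ICD-O-3 morphology code."""
    # Whitespace trimming is an assumption, not something the mapping table states.
    return histology_coded.strip().replace("/", "")


print(strip_morphology_slash("8010/9"))  # prints 80109
```

This is why the ICDO search example earlier on this page uses 80109 rather than 8010/9 for carcinomatosis.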

Procedure

| Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
| --- | --- | --- | --- | --- |
| av_treatment | eventcode | Code | NCRAS | |
| av_treatment | eventdesc | Description | | |
| av_treatment | opcs4_code | Code | OPCS-4 | |
| av_treatment | radiocode | Code | NCRAS | |
| av_treatment | radiodesc | Description | | |
| av_treatment | imagingcode | Code | NCRAS | |
| av_treatment | imagingdesc | Description | | |
| av_treatment | imagingsite | Code | OPCS-4 | |
| cancer_invest_imaging | imaging_code_snomed_ct_code | Code | SNOMED CT | |
| cancer_invest_imaging | anatomical_site | Body Site Code | OPCS-4 | Split comma-separated values into multiple codes for the same procedure |
| cancer_surgery | primary_procedure | Code | OPCS-4 | Ignore '.' |
| cancer_surgery | primary_procedure | Source Code | | |
| rare_diseases_imaging | procedure_other_snomed_ct | Code | SNOMED CT | |
| ecds | treatment_code_1 - treatment_code_12 | Code | SNOMED CT | |
| hes_apc | opertn_01-24 | Code<br>Body Site Code | OPCS-4 | Z-chapter codes mapped to body site and grouped with preceding non-Z-chapter code |
| hes_op | opertn_01-24 | Code<br>Body Site Code | OPCS-4 | Z-chapter codes mapped to body site and grouped with preceding non-Z-chapter code |
| rtds | primaryprocedureopcs | Code | OPCS-4 | |
| rtds | rttreatmentanatomicalsite | Body Site Code | OPCS-4 | |
| sact | opcs_procurement_code<br>opcs_delivery_code | Code | OPCS-4 | If 3 digits: prefix with "X"<br>Uppercase<br>Ignore "N/A" |
| sact | opcs_procurement_code<br>opcs_delivery_code | Source Code | | |
| did | did_snomedct_code | Code | SNOMED CT | |
| did | ic_sub_syscomp_id<br>ic_sub_sys_id<br>ic_system_id<br>ic_sub_region_id<br>ic_region_id | Body Site Code | SNOMED CT | Only using the most specific system code and the most specific region code. I.e., when both a region_id and sub_region_id are present, only include the sub_region_id in the body site coding. |
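
Two of the notes above describe small transformations that may be easier to follow as code: the clean-up of the sact OPCS codes, and the grouping of OPCS Z-chapter codes in hes_apc and hes_op with the preceding procedure. The Python sketch below illustrates both under stated assumptions and is not the actual mapping. The function names and the output shape are hypothetical, the order in which the sact rules are applied is a guess, and a Z-chapter code with no preceding procedure is simply skipped.

```python
from typing import Dict, Iterable, List, Optional


def clean_sact_opcs(source_code: Optional[str]) -> Optional[str]:
    """Apply the sact notes: uppercase, ignore "N/A", prefix 3-digit codes with "X"."""
    if source_code is None:
        return None
    code = source_code.strip().upper()
    if code in ("", "N/A"):
        return None  # treated as "no code"
    if len(code) == 3 and code.isdigit():
        return "X" + code
    return code


def group_z_codes(opertn_codes: Iterable[Optional[str]]) -> List[Dict[str, object]]:
    """Attach each Z-chapter code as a body site to the preceding non-Z procedure code."""
    procedures: List[Dict[str, object]] = []
    for code in opertn_codes:  # opertn_01 .. opertn_24, in column order
        if not code:
            continue
        if code.upper().startswith("Z"):
            if procedures:  # assumption: a leading Z code has nothing to attach to
                procedures[-1]["body_sites"].append(code)
        else:
            procedures.append({"code": code, "body_sites": []})
    return procedures
```

For example, `group_z_codes(["W371", "Z843", "J011"])` yields two procedures, the first carrying "Z843" as its body site and the second with no body site.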

Medication administration

| Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
| --- | --- | --- | --- | --- |
| sact | drug_group | Code | SACT Drug Group | Convert to title-case |
| sact | drug_group | Source Code | | |

Encounter

| Source Tables | Source Column | Participant Explorer | Notes |
| --- | --- | --- | --- |
| av_treatment | eventdate | Encounter Date | |
| av_treatment | "National Cancer Registration" | Encounter Type | |
| av_tumour | diagnosisdatebest | Encounter Date | |
| av_tumour | "National Cancer Registration" | Encounter Type | |
| participant | registration_date<br>date_of_consent | Encounter Date | registration_date if available; otherwise date_of_consent<br>Also used for recruited diseases |
| participant | "Genomics England" | Encounter Type | |
| cancer_participant_tumour | diagnosis_date | Encounter Date | |
| cancer_participant_tumour | "Genomics England" | Encounter Type | |
| cancer_invest_sample_pathology | event_date | Encounter Date | |
| cancer_invest_sample_pathology | "Genomics England" | Encounter Type | |
| cancer_invest_imaging | imaging_date | Encounter Date | |
| cancer_invest_imaging | "Genomics England" | Encounter Type | |
| cancer_surgery | procedure_date | Encounter Date | |
| cancer_surgery | "Genomics England" | Encounter Type | |
| cancer_analysis | tumour_clinical_sample_time | Encounter Date | |
| cancer_analysis | "Genomics England" | Encounter Type | |
| cancer_register_nhsd | event_date | Encounter Date | |
| cancer_register_nhsd | "National Cancer Registration" | Encounter Type | |
| rare_diseases_participant_phenotype | phenotype_report_date | Encounter Date | |
| rare_diseases_participant_phenotype | "Genomics England" | Encounter Type | |
| rare_diseases_imaging | date | Encounter Date | |
| rare_diseases_imaging | "Genomics England" | Encounter Type | |
| ecds | arrival_date<br>arrival_time | Encounter Date | |
| ecds | departure_date<br>departure_time | Encounter Date | |
| ecds | "Emergency" | Encounter Type | |
| hes_apc | admidate | Encounter Date | |
| hes_apc | "Inpatient" | Encounter Type | |
| hes_op | "Outpatient" | Encounter Type | |
| hes_op | apptdate | Encounter Date | |
| did | "Diagnostic Imaging" | Encounter Type | |
| did | did_date3 | Encounter Date | |
| mhmd_v4_event<br>mhldds_event | mhd_eventdate | Encounter Date | |
| mhmd_v4_event<br>mhldds_event | "Mental Health Services" | Encounter Type | |
| mhsds_medical_history_previous_diagnosis<br>mhsds_primary_diagnosis<br>mhsds_secondary_diagnosis | diagdate | Encounter Date | |
| mhsds_medical_history_previous_diagnosis<br>mhsds_primary_diagnosis<br>mhsds_secondary_diagnosis | "Mental Health Services" | Encounter Type | |
| mhsds_provisional_diagnosis | provdiagdate | Encounter Date | |
| mhsds_provisional_diagnosis | "Mental Health Services" | Encounter Type | |
| mhsds_indirect_activity | indirectactdate | Encounter Date | |
| mhsds_indirect_activity | "Mental Health Services" | Encounter Type | |
| mhsds_care_activity | carecontdate | Encounter Date | |
| mhsds_care_activity | "Mental Health Services" | Encounter Type | |
| mortality | event_date | Encounter Date | |
| mortality | "Office of National Statistics" | Encounter Type | |
| rtds | "Radiotherapy" | Encounter Type | |
| rtds | decisiontotreatdate | Encounter Date | For diagnosis codes |
| rtds | "Radiotherapy" | Encounter Type | |
| rtds | proceduredate | Encounter Date | |
| sact | date_decision_to_treat | Encounter Date | For diagnosis, morphology and staging codes |
| sact | "Chemotherapy" | Encounter Type | |
| sact | administration_date | Encounter Date | |
| sact | "Chemotherapy" | Encounter Type | |

  1. ICD-10 codes are normalised so that they match the ICD-10 reference data. Normalisation removes any non-numeric characters other than the first character and inserts a dot at the fourth position (for example, a coded value of "E149D" is normalised to "E14.9"). The original values can be obtained via the "Source Code" column.
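
The normalisation rule in note 1 can be sketched in a few lines of Python. This is an illustration of the rule as described, not the SQL used in the actual mapping. The function name is hypothetical, and trimming whitespace and leaving three-character codes without a dot are assumptions.

```python
import re


def normalise_icd10(source_code: str) -> str:
    """Keep the first character, drop non-digits from the rest, put a dot at position four."""
    code = source_code.strip()  # whitespace trimming is an assumption
    if not code:
        return code
    # First character is kept as-is; all later non-digit characters are removed.
    compact = code[0] + re.sub(r"\D", "", code[1:])
    # Only codes longer than three characters get a dot (assumption for short codes).
    if len(compact) > 3:
        return compact[:3] + "." + compact[3:]
    return compact


print(normalise_icd10("E149D"))  # prints E14.9
```

A code that already contains a dot, such as "C50.9", passes through unchanged, because the dot is removed and then re-inserted at the same position.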