Data in Participant Explorer¶
Data sources¶
Participant data¶
Participant clinical data is obtained from the most recent version of the 100kGP data release source data, version 19 (31st October 2024), which can be viewed in LabKey. Some elements in the Participant Explorer UI provide deep links into the source data in LabKey.
The source data are imported into a Postgres database, and partially mapped to a standard data model (HL7 FHIR) using SQL. The Participant Explorer UI operates on top of the FHIR model, but hides the technical details of FHIR (such as element names and extension URLs) to create a user-friendly, intuitive interface. A detailed overview of the mapping from source tables and columns to elements in the UI is given below.
Reference data¶
Terminology reference data is provided by a FHIR terminology server, developed by the AEHRC. For details of the terminology server instance we are using, see the Terminology Server page.
Conceptual data model¶
The diagram below depicts the clinical data model used for representing participant data in the Participant Explorer. This model is based on the HL7 FHIR resource model.
FHIR Resource Type | Definition (from FHIR) | Participant Explorer Label | Meaning in terms of the 100kGP dataset |
---|---|---|---|
Patient | Demographics and other administrative information about an individual receiving care or other health-related services. | Participant | All consenting participants including probands and their relatives. |
Condition | A clinical condition, problem, diagnosis, or other event, situation, issue, or clinical concept that has risen to a level of concern. | Condition | Diagnoses from primary clinical data (including the recruited disease) and secondary data. |
Observation | Measurements and simple assertions made about a patient | Observation | Phenotypic observations (HPO terms), Tumour morphology and stage observations |
Procedure | An action that is or was performed on or for a patient. This can be a physical intervention like an operation, or less invasive like long term services, counselling, or hypnotherapy. | Procedure | Procedure and operation codes from primary and secondary data |
Encounter | An interaction between a patient and healthcare provider(s) for the purpose of providing healthcare service(s) or assessing the health status of a patient. | Encounter | Grouping of conditions, observations and procedures by visit/event |
FamilyMemberHistory | Significant health conditions for a person related to the patient relevant in the context of care for the patient. | Family Member | Key details and affected-status of family members of rare disease probands (Note: family members who are also participants will also have a patient/participant record) |
DiagnosticReport | The findings and interpretation of diagnostic tests performed on patients, groups of patients, devices, and locations, and/or specimens derived from these. | Genome Sequence Report, Family Case Report | Sequencing report meta information; GMC exit questionnaire (case status and additional comments) |
MedicationAdministration | Describes the event of a patient consuming or otherwise being administered a medication. | Drugs, Drug Group | SACT (chemotherapy) drug administrations |
Code systems overview - using the right codes¶
The following table can help you select the code systems to use when searching by clinical concept, depending on your area of interest and scope. It also indicates whether the "include mapped concepts" feature may be of use. Below the table are examples involving each code system.
Code System Short Name | Description | Primary Clinical Data (cancer/rare diseases programme specific) | Secondary Data (longitudinal data for all participants) | Concept Maps Available? |
---|---|---|---|---|
Genomics England Rare Disease | Rare disease groups, subgroups and specific diseases for which participants were recruited in the Genomics England 100,000 Genomes project | Rare diseases groups, subgroups and specific diseases | N/A | No |
Genomics England Cancer Type | Cancer disease types for which participants were recruited in the Genomics England 100,000 Genomes project | Cancer disease types | N/A | No |
Genomics England Cancer Subtype | Cancer disease subtypes for which participants were recruited in the Genomics England 100,000 Genomes project | Cancer disease subtypes | N/A | No |
ICD10 | ICD-10, WHO International Classification of Diseases | Cancer diagnoses | NHS inpatient/outpatient hospital diagnoses; NHS mental health services diagnoses; ONS causes of death; NCRAS radiotherapy diagnoses; NCRAS chemotherapy diagnoses | Yes, SNOMED to ICD10 |
HPO | Human Phenotype Ontology | Observed phenotypes (rare disease programme) | N/A | Yes, SNOMED to HPO |
ICDO | ICD-O-3, WHO International Classification of Diseases for Oncology | Tumour morphology and topography | NCRAS chemotherapy tumour morphology | No |
OPCS | OPCS-4 Classification of Interventions and Procedures | N/A | Cancer imaging (body site); Cancer surgery procedures; NHS inpatient/outpatient hospital operations; NCRAS radiotherapy procedures and body site; NCRAS chemotherapy procedures | Yes, SNOMED to OPCS |
SNOMED | SNOMED CT (UK Edition) | Cancer tumour morphology and topography; Imaging procedures (rare disease and cancer) | NHS emergency care; NHS imaging procedures and body site | Yes, SNOMED to ICD10, OPCS and HPO |
Examples¶
Rare Diseases | Intellectual Disability: selects rare disease participants who were recruited for intellectual disability (including relatives).
Cancer Type | Lung: selects cancer participants who were recruited for lung cancer.
HPO | HP:0012622: Chronic kidney disease: selects rare disease participants with an observed phenotype of chronic kidney disease (including relatives).
ICD10 | C50: Malignant neoplasm of breast: selects any participant with a diagnosis of breast cancer in their medical history (including rare disease probands and relatives, as well as cancer participants recruited for other cancer types).
OPCS | J01: Transplantation of liver: selects any participant with a liver transplantation record in their medical history.
SCT | 241620005: Cardiac MRI: selects any participant with a record of a cardiac MRI in their GEL data or general medical history.
SCT | 38341003: Hypertensive disorder (with mapped concepts DISABLED): selects no participants, because SNOMED codes are only available in our data sets for cancer diseases and imaging procedures.
SCT | 38341003: Hypertensive disorder (with mapped concepts ENABLED): adds equivalent ICD10 and HPO codes for hypertensive disorder to the search criteria, and consequently selects any participant with a record of hypertension in their medical history plus rare disease participants with a hypertension phenotype. Disclaimer: the concept maps underlying this feature are incomplete and can be inaccurate. Please review the included mapped concepts carefully when using this feature.
ICDO | 80109: Carcinomatosis plus SCT | 307593001: Carcinomatosis: selects cancer participants with this tumour morphology plus any participant with a record of chemotherapy for this tumour morphology in their medical history (including rare disease probands and relatives, as well as cancer participants recruited for other cancer types). Because tumour morphology/topography may be coded with ICD-O or SNOMED, it is advised to include both code systems when searching for morphology or topography.
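The effect of the "include mapped concepts" option in the two hypertensive disorder examples can be sketched in a few lines of Python. This is only an illustration of the idea, not the Participant Explorer implementation: the function names, the participant IDs and the concept map entries (`I10`, `HP:0000822`) are assumptions made for the example, and the sketch matches codes exactly rather than following code hierarchies.

```python
from typing import Dict, List, Set, Tuple

# A coding is a (code system short name, code) pair, e.g. ("SCT", "38341003").
Coding = Tuple[str, str]

# Hypothetical concept map entries, for illustration only; the real maps are
# provided by the terminology server and may differ.
CONCEPT_MAPS: Dict[Coding, List[Coding]] = {
    ("SCT", "38341003"): [("ICD10", "I10"), ("HPO", "HP:0000822")],
}


def expand_search(coding: Coding, include_mapped: bool) -> Set[Coding]:
    """Return the codings to search for, optionally adding mapped concepts."""
    criteria = {coding}
    if include_mapped:
        criteria.update(CONCEPT_MAPS.get(coding, []))
    return criteria


def select_participants(
    records: Dict[str, Set[Coding]], criteria: Set[Coding]
) -> List[str]:
    """Return IDs of participants with at least one record matching the criteria."""
    return sorted(pid for pid, codings in records.items() if codings & criteria)


# Made-up participant records, keyed by a made-up participant ID.
records = {
    "p1": {("ICD10", "I10")},
    "p2": {("HPO", "HP:0000822")},
    "p3": {("ICD10", "C50")},
}
hypertension = ("SCT", "38341003")

print(select_participants(records, expand_search(hypertension, include_mapped=False)))
print(select_participants(records, expand_search(hypertension, include_mapped=True)))
```

With mapped concepts disabled, no participant is selected because none of the made-up records carries the SNOMED code itself; with the option enabled, the ICD10 and HPO equivalents bring in `p1` and `p2`.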
Mapping of main programme data¶
For detailed information on 100kGP source tables and columns, please refer to the data dictionary of the 100kGP Data Release.
Participant¶
Source Table | Source Column | Participant Explorer | Notes |
---|---|---|---|
participant | participant_id | Participant ID | |
participant | year_of_birth | Year of Birth | |
participant | participant_phenotypic_sex | Phenotypic Sex | |
participant | participant_type | Proband/Relative | |
participant | programme | Programme | |
participant | normalised_consent_form | Consent Form | |
participant | participant_ethnic_category | Ethnic Category | |
participant | rare_diseases_family_id | Family ID | |
mortality | event_date | Life Status | |
death_details | death_date | Life Status | If different from mortality, the value from mortality is used |
cancer_participant_disease | cancer_disease_type, cancer_disease_sub_type | Recruited Disease | |
rare_diseases_participant_disease | normalised_specific_disease | Recruited Disease | |
rare_diseases_family | family_group_type | Family Group Type | |
sequencing_report | genome_build | Genome Build | |
Genome sequence report¶
Source Table | Source Column | Participant Explorer | Notes |
---|---|---|---|
sequencing_report | delivery_id | Delivery ID | |
sequencing_report | plate_key | Plate Key | |
sequencing_report | type | Type | |
sequencing_report | delivery_version | Delivery Version | |
sequencing_report | genome_build | Genome Build | |
sequencing_report | delivery_date | Delivery Date | |
sequencing_report | path | Path | |
clinic_sample | clinic_sample_datetime | Sample Date | sequencing_report is linked to clinic_sample via the lab_sample_id and clinic_sample_sk in the laboratory_sample table |
Rare disease family case report¶
Source Table | Source Column | Participant Explorer | Notes |
---|---|---|---|
gmc_exit_questionnaire | interpretation_request_id | Interpretation Request ID | |
gmc_exit_questionnaire | case_solved_family | Rare Disease Family Case Solved | |
gmc_exit_questionnaire | additional_comments | Additional Comments for Family | |
gmc_exit_questionnaire | event_date | Position of "Family case report" on the timeline | |
Family member¶
Source Table | Source Column | Participant Explorer | Notes |
---|---|---|---|
rare_diseases_pedigree_member | father_id, mother_id | Relationship to Proband | First-degree relationships are derived from father_id and mother_id. Others are displayed as "Family Member". |
rare_diseases_pedigree_member | phenotypic_sex, father_id, mother_id | Sex | |
rare_diseases_pedigree_member | affection_status | Affection Status | |
rare_diseases_pedigree_member | rare_diseases_pedigree_member_id | Pedigree Member ID | |
rare_diseases_pedigree_member | family_medical_review_date | Medical Review Date | |
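The note on "Relationship to Proband" says that first-degree relationships are derived from father_id and mother_id. One way such a derivation could work is sketched below. It is a guess at the logic, not the actual mapping: the `PedigreeMember` class, the assumption that father_id and mother_id hold pedigree member IDs, and every label other than "Family Member" are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PedigreeMember:
    """Hypothetical, simplified view of a rare_diseases_pedigree_member row."""

    member_id: str
    father_id: Optional[str] = None
    mother_id: Optional[str] = None


def relationship_to_proband(member: PedigreeMember, proband: PedigreeMember) -> str:
    """Derive a first-degree relationship label, else fall back to 'Family Member'."""
    if member.member_id == proband.member_id:
        return "Proband"
    if member.member_id == proband.father_id:
        return "Father"
    if member.member_id == proband.mother_id:
        return "Mother"
    # The member is a child if the proband is recorded as one of their parents.
    if proband.member_id in (member.father_id, member.mother_id):
        return "Child"
    # Full siblings share both recorded parents with the proband.
    same_father = member.father_id is not None and member.father_id == proband.father_id
    same_mother = member.mother_id is not None and member.mother_id == proband.mother_id
    if same_father and same_mother:
        return "Full Sibling"
    # Anything that is not a first-degree relationship gets the generic label.
    return "Family Member"


proband = PedigreeMember("P1", father_id="F1", mother_id="M1")
print(relationship_to_proband(PedigreeMember("F1"), proband))
print(relationship_to_proband(PedigreeMember("S1", "F1", "M1"), proband))
print(relationship_to_proband(PedigreeMember("U1", "G1", "G2"), proband))
```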
Condition¶
Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
---|---|---|---|---|
av_tumour | site_icd10_o2 | Code | ICD-10 | Normalised¹ |
av_tumour | site_icd10_o2 | Source Code | | |
av_tumour | stage_best | Stage Best | | |
av_tumour | stage_best_system | Stage Best System | | |
av_tumour | figo | FIGO | | |
av_tumour | dukes | Dukes | | |
av_tumour | t_best | T Stage | | |
av_tumour | n_best | N Stage | | |
av_tumour | m_best | M Stage | | |
cancer_invest_sample_pathology | primary_diagnosis_icd_code | Code | ICD-10 | Normalised¹ |
cancer_invest_sample_pathology | primary_diagnosis_icd_code | Source Code | | |
cancer_invest_sample_pathology | topography_snomed_ct_code | Body Site Code | SNOMED CT | |
cancer_participant_tumour | diagnosis_icd_code | Code | ICD-10 | Normalised¹ |
cancer_participant_tumour | diagnosis_icd_code | Source Code | | |
cancer_participant_tumour | integrated_tnm_stage_grouping | TNM Stage Group | | |
cancer_participant_tumour | ajcc_stage | TNM Stage Group | | |
cancer_participant_tumour | final_figo | FIGO | | |
cancer_participant_tumour | modified_dukes_stage | Dukes | | |
cancer_participant_tumour | component_tnm_t | T Stage | | |
cancer_participant_tumour | component_tnm_n | N Stage | | |
cancer_participant_tumour | component_tnm_m | M Stage | | |
cancer_participant_disease | cancer_disease_type, cancer_disease_sub_type | Code | Genomics England | |
cancer_register_nhsd | cancer_site | Code | ICD-10 | Normalised¹ |
cancer_register_nhsd | cancer_site | Source Code | | |
ecds | diagnosis_code_1 - diagnosis_code_12 | Code | SNOMED CT | if diagnosis_qualifier_n != '415684004' ("suspected") |
hes_apc | diag_01 - diag_20 | Code | ICD-10 | Normalised¹ |
hes_apc | diag_01 - diag_20 | Source Code | | |
hes_op | diag_01 - diag_12 | Code | ICD-10 | Normalised¹ |
hes_op | diag_01 - diag_12 | Source Code | | |
mhmd_v4_event, mhldds_event | ic_eve_primarydiagnosis, ic_eve_secondarydiagnosis, mhd_primarydiagnosis, mhd_secondarydiagnosis | Code | ICD-10 | Normalised¹ |
mhmd_v4_event, mhldds_event | ic_eve_primarydiagnosis, ic_eve_secondarydiagnosis, mhd_primarydiagnosis, mhd_secondarydiagnosis | Source Code | | |
mhsds_medical_history_previous_diagnosis | prevdiag, diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹. If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
mhsds_medical_history_previous_diagnosis | prevdiag, diagschemeinuse | Source Code | | |
mhsds_provisional_diagnosis | provdiag, diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹. If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
mhsds_provisional_diagnosis | provdiag, diagschemeinuse | Source Code | | |
mhsds_primary_diagnosis | primdiag, diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹. If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
mhsds_primary_diagnosis | primdiag, diagschemeinuse | Source Code | | |
mhsds_secondary_diagnosis | secdiag, diagschemeinuse | Code | ICD-10 / SNOMED | Normalised¹. If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
mhsds_secondary_diagnosis | secdiag, diagschemeinuse | Source Code | | |
mhsds_care_activity, mhsds_indirect_activity | codefind, findschemeinuse | Code | ICD-10 / SNOMED | If diagschemeinuse is 1, 2, 02, ID, 4, 6 or 06 |
mhsds_care_activity, mhsds_indirect_activity | codefind, findschemeinuse | Source Code | | |
rare_diseases_participant_disease | normalised_disease_group, normalised_disease_sub_group, normalised_specific_disease | Code | Genomics England | |
mortality | icd10_underlying_cause | Code | ICD-10 | Normalised¹ |
mortality | icd10_underlying_cause | Source Code | | |
mortality | icd10_multiple_cause_code_1 ... 15 | Code | ICD-10 | Normalised¹ |
mortality | icd10_multiple_cause_code_1 ... 15 | Source Code | | |
rtds | radiotherapydiagnosisicd | Code | ICD-10 | Normalised¹ |
rtds | radiotherapydiagnosisicd | Source Code | | |
sact | primary_diagnosis | Code | ICD-10 | Normalised¹ |
sact | primary_diagnosis | Source Code | | |
sact | sact_stage_at_start | TNM Stage Group | | |
Observation¶
Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
---|---|---|---|---|
av_tumour | histology_coded | Code | ICD-O-3 | |
av_tumour | histology_coded_desc | Description | ICD-O-3 | |
av_tumour | site_coded | Body Site Code | ICD-O-3 | if coding_system_desc starts with "ICD-O-3" |
av_tumour | site_coded_desc | Body Site Description | ICD-O-3 | |
av_tumour | stage_best | Code | STAGE | |
av_tumour | figo | Code | STAGE | |
av_tumour | dukes | Code | STAGE | |
av_tumour | t_best, n_best, m_best | Code | TNM STAGE | Concatenated |
cancer_analysis | histology_coded | Code | ICD-O-3 | Removing the / character for technical reasons |
cancer_analysis | histology_coded | Source Code | | |
cancer_participant_tumour | morphology_snomed_ct_code | Code | SNOMED CT | |
cancer_participant_tumour | morphology_icd_code | Code | ICD-O-3 | |
cancer_participant_tumour | topography_snomed_ct_code | Body Site Code | SNOMED CT | |
cancer_participant_tumour | topography_snomed_code, topography_snomed_version | Body Site Code | SNOMED CT | |
cancer_participant_tumour | topography_icd_code | Body Site Code | ICD-O-3 | |
cancer_participant_tumour | integrated_tnm_stage_grouping | Code | STAGE | |
cancer_participant_tumour | ajcc_stage | Code | STAGE | |
cancer_participant_tumour | final_figo | Code | STAGE | |
cancer_participant_tumour | modified_dukes_stage | Code | STAGE | |
cancer_participant_tumour | component_tnm_t, component_tnm_n, component_tnm_m | Code | TNM STAGE | Concatenated |
cancer_register_nhsd | cancer_type, cancer_behaviour | Code | ICD-O-3 | Concatenated |
rare_diseases_participant_phenotype | hpo_id | Code | HPO | filter hpo_present = true |
mhsds_care_activity | codeobs, obsschemeinuse | Code | SNOMED | if obsschemeinuse = 3 |
sact | morphology_clean | Code | ICD-O-3 | |
sact | sact_stage_at_start | Code | STAGE | |
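Two of the notes in the Observation table describe simple string transformations on ICD-O-3 morphology codes: removing the `/` character (cancer_analysis) and concatenating two columns (cancer_register_nhsd). The sketch below illustrates both. The function names are hypothetical, and joining cancer_type and cancer_behaviour without a separator is an assumption; the table only states that the values are concatenated.

```python
def strip_morphology_slash(histology_coded: str) -> str:
    """Remove the '/' separator from an ICD-O-3 morphology code."""
    return histology_coded.strip().replace("/", "")


def concatenate_morphology(cancer_type: str, cancer_behaviour: str) -> str:
    """Join a type and a behaviour value into one code (assumed: no separator)."""
    return f"{cancer_type.strip()}{cancer_behaviour.strip()}"


# "8010/9" becomes "80109", the form used in the ICDO search example above.
print(strip_morphology_slash("8010/9"))
print(concatenate_morphology("8010", "9"))
```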
Procedure¶
Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
---|---|---|---|---|
av_treatment | eventcode | Code | NCRAS | |
av_treatment | eventdesc | Description | | |
av_treatment | opcs4_code | Code | OPCS-4 | |
av_treatment | radiocode | Code | NCRAS | |
av_treatment | radiodesc | Description | | |
av_treatment | imagingcode | Code | NCRAS | |
av_treatment | imagingdesc | Description | | |
av_treatment | imagingsite | Code | OPCS-4 | |
cancer_invest_imaging | imaging_code_snomed_ct_code | Code | SNOMED CT | |
cancer_invest_imaging | anatomical_site | Body Site Code | OPCS-4 | Split comma-separated values into multiple codes for the same procedure |
cancer_surgery | primary_procedure | Code | OPCS-4 | Ignore '.' |
cancer_surgery | primary_procedure | Source Code | | |
rare_diseases_imaging | procedure_other_snomed_ct | Code | SNOMED CT | |
ecds | treatment_code_1 - treatment_code_12 | Code | SNOMED CT | |
hes_apc | opertn_01-24 | Code, Body Site Code | OPCS-4 | Z-chapter codes mapped to body site and grouped with preceding non-Z-chapter code |
hes_op | opertn_01-24 | Code, Body Site Code | OPCS-4 | Z-chapter codes mapped to body site and grouped with preceding non-Z-chapter code |
rtds | primaryprocedureopcs | Code | OPCS-4 | |
rtds | rttreatmentanatomicalsite | Body Site Code | OPCS-4 | |
sact | opcs_procurement_code, opcs_delivery_code | Code | OPCS-4 | If 3 digits: prefix with "X"; uppercase; ignore "N/A" |
sact | opcs_procurement_code, opcs_delivery_code | Source Code | | |
did | did_snomedct_code | Code | SNOMED CT | |
did | ic_sub_syscomp_id, ic_sub_sys_id, ic_system_id, ic_sub_region_id, ic_region_id | Body Site Code | SNOMED CT | Only using the most specific system code and the most specific region code. I.e., when both a region_id and sub_region_id are present, only include the sub_region_id in the body site coding. |
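Two of the OPCS-4 rules in the Procedure table are easier to follow in code: the clean-up of SACT procedure codes, and the grouping of Z-chapter codes with the preceding procedure in hes_apc and hes_op. The following is a minimal sketch under stated assumptions, not the production SQL: both function names are hypothetical, a Z-chapter code with no preceding procedure is simply dropped, and the sample codes are illustrative strings only.

```python
import re
from typing import Dict, List, Optional


def clean_sact_opcs(raw: Optional[str]) -> Optional[str]:
    """Apply the SACT rules: uppercase, ignore 'N/A', prefix 3-digit codes with 'X'."""
    if raw is None:
        return None
    code = raw.strip().upper()
    if code in ("", "N/A"):
        return None
    if re.fullmatch(r"[0-9]{3}", code):
        code = "X" + code
    return code


def group_operations(opertn_values: List[Optional[str]]) -> List[Dict[str, object]]:
    """Attach each Z-chapter code to the preceding non-Z-chapter procedure code."""
    procedures: List[Dict[str, object]] = []
    for raw in opertn_values:
        code = (raw or "").strip().upper()
        if not code:
            continue
        if code.startswith("Z"):
            # Assumption: a Z code without a preceding procedure is dropped.
            if procedures:
                procedures[-1]["body_sites"].append(code)
            continue
        procedures.append({"code": code, "body_sites": []})
    return procedures


print(clean_sact_opcs("715"))
print(clean_sact_opcs("n/a"))
print(group_operations(["J011", "Z301", "W371", "Z843", "Z942"]))
```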
Medication administration¶
Source Table | Source Column(s) | Participant Explorer | Code System | Notes |
---|---|---|---|---|
sact | drug_group | Code | SACT Drug Group | Convert to title-case |
sact | drug_group | Source Code | | |
Encounter¶
Source Tables | Source Column | Participant Explorer | Notes |
---|---|---|---|
av_treatment | eventdate | Encounter Date | |
av_treatment | "National Cancer Registration" | Encounter Type | |
av_tumour | diagnosisdatebest | Encounter Date | |
av_tumour | "National Cancer Registration" | Encounter Type | |
participant | registration_date, date_of_consent | Encounter Date | registration_date if available; otherwise date_of_consent. Also used for recruited diseases |
participant | "Genomics England" | Encounter Type | |
cancer_participant_tumour | diagnosis_date | Encounter Date | |
cancer_participant_tumour | "Genomics England" | Encounter Type | |
cancer_invest_sample_pathology | event_date | Encounter Date | |
cancer_invest_sample_pathology | "Genomics England" | Encounter Type | |
cancer_invest_imaging | imaging_date | Encounter Date | |
cancer_invest_imaging | "Genomics England" | Encounter Type | |
cancer_surgery | procedure_date | Encounter Date | |
cancer_surgery | "Genomics England" | Encounter Type | |
cancer_analysis | tumour_clinical_sample_time | Encounter Date | |
cancer_analysis | "Genomics England" | Encounter Type | |
cancer_register_nhsd | event_date | Encounter Date | |
cancer_register_nhsd | "National Cancer Registration" | Encounter Type | |
rare_diseases_participant_phenotype | phenotype_report_date | Encounter Date | |
rare_diseases_participant_phenotype | "Genomics England" | Encounter Type | |
rare_diseases_imaging | date | Encounter Date | |
rare_diseases_imaging | "Genomics England" | Encounter Type | |
ecds | arrival_date, arrival_time | Encounter Date | |
ecds | departure_date, departure_time | Encounter Date | |
ecds | "Emergency" | Encounter Type | |
hes_apc | admidate | Encounter Date | |
hes_apc | "Inpatient" | Encounter Type | |
hes_op | "Outpatient" | Encounter Type | |
hes_op | apptdate | Encounter Date | |
did | "Diagnostic Imaging" | Encounter Type | |
did | did_date3 | Encounter Date | |
mhmd_v4_event, mhldds_event | mhd_eventdate | Encounter Date | |
mhmd_v4_event, mhldds_event | "Mental Health Services" | Encounter Type | |
mhsds_medical_history_previous_diagnosis, mhsds_primary_diagnosis, mhsds_secondary_diagnosis | diagdate | Encounter Date | |
mhsds_medical_history_previous_diagnosis, mhsds_primary_diagnosis, mhsds_secondary_diagnosis | "Mental Health Services" | Encounter Type | |
mhsds_provisional_diagnosis | provdiagdate | Encounter Date | |
mhsds_provisional_diagnosis | "Mental Health Services" | Encounter Type | |
mhsds_indirect_activity | indirectactdate | Encounter Date | |
mhsds_indirect_activity | "Mental Health Services" | Encounter Type | |
mhsds_care_activity | carecontdate | Encounter Date | |
mhsds_care_activity | "Mental Health Services" | Encounter Type | |
mortality | event_date | Encounter Date | |
mortality | "Office of National Statistics" | Encounter Type | |
rtds | "Radiotherapy" | Encounter Type | |
rtds | decisiontotreatdate | Encounter Date | For diagnosis codes |
rtds | "Radiotherapy" | Encounter Type | |
rtds | proceduredate | Encounter Date | |
sact | date_decision_to_treat | Encounter Date | For diagnosis, morphology and staging codes |
sact | "Chemotherapy" | Encounter Type | |
sact | administration_date | Encounter Date | |
sact | "Chemotherapy" | Encounter Type | |
¹ ICD-10 codes are normalised so that they match the ICD-10 reference data. This includes the removal of any non-numeric characters other than the first character, and inserting a dot at the fourth position (for example, a coded value of "E149D" is normalised to "E14.9"). The original values can be obtained via the "Source Code" column.
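The normalisation rule in the footnote can be expressed as a short Python function. This is a sketch of the rule as stated, not the SQL used in the actual mapping: the function name is hypothetical, and trimming whitespace, uppercasing, and leaving three-character codes without a dot are assumptions.

```python
import re


def normalise_icd10(source_code: str) -> str:
    """Normalise a raw ICD-10 value, e.g. 'E149D' becomes 'E14.9'."""
    code = source_code.strip().upper()  # assumption: trim and uppercase first
    if not code:
        return code
    # Keep the first character, then drop every non-numeric character after it.
    cleaned = code[0] + re.sub(r"\D", "", code[1:])
    # Insert a dot at the fourth position when there is a fourth character.
    if len(cleaned) > 3:
        return cleaned[:3] + "." + cleaned[3:]
    return cleaned


print(normalise_icd10("E149D"))
print(normalise_icd10("C50"))
```

The unmodified input corresponds to what the tables above list as "Source Code".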