Pathology synoptic reports dataset

The synoptic reports dataset consolidates the work of a team of pathologists who manually extracted medical entities from free-text pathology reports (collected by and received from the National Disease Registration Service) into versioned, structured documents that follow the guidelines set by the Royal College of Pathologists (RCPath). The resulting reports use a synoptic proforma, meaning they present pathology findings in a standardised checklist-style format rather than as free-text descriptions.

Structure

The dataset covers 8370 participants with a wide variety of cancers, including good coverage of colorectal and breast cancer cohorts.

Examples of what can be found in the synoptic reports include biomarkers, invasion, lymph node status and counts, staging, and metastasis.

How to find the data

You can find the reports in the LabKey table synoptic_path_reports.

Each row in the dataset represents a single value taken from a report section for a given case and participant. Reports are versioned and carry a specific report name, reflecting the specimen or tissue type.

Example

Participant ID | Case ID | Report Name | Report Version | Section Path | Label | Value
P001 | P001-1 | Breast Excision (RCPath) | 9.2.1702.89 | Hidden Registry Fields ::HER 2 Status | HER 2 Status | Negative
P002 | P002-1 | Breast Core Biopsy (RC Path) | 9.2108.0.58 | Invasive Carcinoma | Invasive Carcinoma | Present
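
Because each row holds one value, it is often convenient to regroup the rows so that all values for a case sit together. The following is a minimal Python sketch of that regrouping, using the two example rows. The dictionary keys (participant_id, case_id, and so on) are hypothetical names chosen for illustration and may not match the actual column names in synoptic_path_reports. The helper functions are likewise illustrative, and the "::" separator in the section path is inferred from the single example shown here.

```python
from collections import defaultdict

# Rows copied from the example table. The key names are assumptions made for
# this sketch; check the real column names in the LabKey table before use.
rows = [
    {
        "participant_id": "P001",
        "case_id": "P001-1",
        "report_name": "Breast Excision (RCPath)",
        "report_version": "9.2.1702.89",
        "section_path": "Hidden Registry Fields ::HER 2 Status",
        "label": "HER 2 Status",
        "value": "Negative",
    },
    {
        "participant_id": "P002",
        "case_id": "P002-1",
        "report_name": "Breast Core Biopsy (RC Path)",
        "report_version": "9.2108.0.58",
        "section_path": "Invasive Carcinoma",
        "label": "Invasive Carcinoma",
        "value": "Present",
    },
]


def split_section_path(path):
    """Split a section path on '::' (assumed separator) and trim each part."""
    return [part.strip() for part in path.split("::")]


def values_by_case(rows):
    """Collect label -> value pairs for each (participant, case) pair.

    If a label occurs more than once within a case, the last row wins.
    """
    cases = defaultdict(dict)
    for row in rows:
        key = (row["participant_id"], row["case_id"])
        cases[key][row["label"]] = row["value"]
    return dict(cases)


cases = values_by_case(rows)
print(cases[("P001", "P001-1")])  # {'HER 2 Status': 'Negative'}
print(split_section_path(rows[0]["section_path"]))
```

Keying on both participant and case keeps multiple cases for the same participant separate. If the same label can appear in several sections of one report, the full section path may be a safer key than the label alone.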