
Case studies

In some cases, when working with very rare diseases, it is necessary to study very small groups of individuals or a single family. Studies of this kind are known as case studies. If you are carrying out a case study, there are some things you ought to consider.

Clinical Collaboration requests

If you wish to do a case study that would involve publishing potentially identifying individual-level data, you should raise a Clinical Collaboration Request. This lets you contact and involve the Recruiting Clinician and seek additional consent from the participant to publish their data.

For PhD and MSc projects

Small-scale studies of small groups or even single families are well suited to short-term work such as student projects, so we have specific guidelines for these kinds of projects. If you have novel data, you can make a Clinical Collaboration Request and potentially gain consent to publish your data, although this is not guaranteed.

We will not process any requests where there are no new results and the request is only for consent to publish. If you do not obtain any novel results, or do not get consent to publish, we recommend redacting any individual-level data from your published thesis/dissertation and showing the data to your supervisors and examiners within the Research Environment (RE). We can help you to add data to shared folders that will be accessible to others.

Please contact the Airlock Manager at Peter.O'Donovan@genomicsengland.co.uk if you need help with doing this.

Individual level data

If it is crucial for your project to export individual-level data, you can raise a request that includes it. The request will be reviewed by the Airlock Committee and by the Rare Disease team to assess whether the data can be considered for export. Any data like this should not be specific enough to identify a group of fewer than five people in our dataset; you should include a description of the steps you have taken to ensure this.

If your analysis includes data for five or fewer participants, or includes a combination of phenotypes that applies to five or fewer participants, this may be deemed Potentially Identifiable Data (PID), which we do not allow out of the Research Environment. Such requests take time to review and may come back with notes or suggestions before they can be approved, so we advise you to allow extra time for your request.
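One way to check this before raising a request is to count how many participants share each combination of the fields you plan to export. The sketch below is illustrative only: the field names, the made-up records, and the `small_groups` helper are hypothetical, and the default threshold of five or fewer simply follows the wording above.

```python
from collections import Counter

def small_groups(records, fields, threshold=5):
    """Return combinations of `fields` shared by `threshold` or fewer records.

    `records` is a list of dicts and `fields` names the columns intended for
    export. Both are hypothetical stand-ins for your own cohort table.
    """
    sizes = Counter(tuple(record[field] for field in fields) for record in records)
    return {combo: n for combo, n in sizes.items() if n <= threshold}

# Made-up cohort: 8 participants share one combination, 2 share another.
cohort = (
    [{"gender": "Male", "phenotype": "Short stature"}] * 8
    + [{"gender": "Female", "phenotype": "Short stature"}] * 2
)

flagged = small_groups(cohort, fields=("gender", "phenotype"))
for combo, n in flagged.items():
    print(combo, "applies to", n, "participants; generalise or drop before export")
```

Any combination the helper returns is a candidate for generalising, masking, or removing before the request is submitted.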

To make your data less likely to be identifiable, you can remove any unnecessary fields, e.g. pedigree details or gender, and replace year of birth with a range of years if a range would be acceptable for your purposes. You could also redesign your analysis so that the conditions for your research apply to more than five participants within our dataset. If your request includes phenotypes linked to specific families, we may ask you to rewrite this as counts of each phenotype in the cohort, without linking to specific families, and to mask counts of participants where these are five or fewer.
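As a rough illustration of removing fields and widening year of birth into a range, the sketch below works on a single made-up record. The field names (`pedigree`, `gender`, `year_of_birth`, `phenotype`), the ten-year band, and the `generalise` helper are all assumptions for the example, not part of any Research Environment tooling.

```python
def generalise(record, drop=("pedigree", "gender"), band=10):
    """Copy `record` without the `drop` fields and with year of birth as a range.

    All field names here are hypothetical; adapt them to your own table.
    """
    reduced = {key: value for key, value in record.items() if key not in drop}
    if "year_of_birth" in reduced:
        # Round down to the start of the band, e.g. 1987 -> 1980 for band=10.
        start = reduced.pop("year_of_birth") // band * band
        reduced["year_of_birth_range"] = f"{start}-{start + band - 1}"
    return reduced

# Made-up record for illustration only.
record = {
    "gender": "Female",
    "year_of_birth": 1987,
    "pedigree": "proband",
    "phenotype": "Short stature",
}
print(generalise(record))
```

The original record is left unchanged, so the detailed version can stay inside the Research Environment while only the reduced copy is considered for export.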

Data tables

Wherever possible, you should export data in the form of a count table, with all counts lower than five masked (i.e. written as "<5"). For example, this table:

Value                Count
Male                 7
Female               <5
Short stature        6
Shortness of breath  8

is far preferable to the table below:

Gender  Phenotype
Male    Short stature; shortness of breath
Male    Short stature
Male    Shortness of breath
Female  Short stature
Male    Short stature; shortness of breath
Male    Short stature; shortness of breath
Female  Short stature; shortness of breath
Male    Shortness of breath
Female  Shortness of breath
Male    Shortness of breath
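To show how a row-per-participant table like the one above can be collapsed into a masked count table, here is a minimal Python sketch. The `masked_counts` helper is hypothetical, and the rule it applies (counts lower than five written as "<5") is taken from the guidance above; check the threshold that applies to your own request.

```python
from collections import Counter

MASK_BELOW = 5  # counts lower than this are written as "<5"

# Individual-level rows copied from the table above.
rows = [
    ("Male", "Short stature; shortness of breath"),
    ("Male", "Short stature"),
    ("Male", "Shortness of breath"),
    ("Female", "Short stature"),
    ("Male", "Short stature; shortness of breath"),
    ("Male", "Short stature; shortness of breath"),
    ("Female", "Short stature; shortness of breath"),
    ("Male", "Shortness of breath"),
    ("Female", "Shortness of breath"),
    ("Male", "Shortness of breath"),
]

def masked_counts(rows, mask_below=MASK_BELOW):
    """Count each gender and each phenotype, masking small counts."""
    counts = Counter()
    for gender, phenotypes in rows:
        counts[gender] += 1
        for phenotype in phenotypes.split(";"):
            # Normalise spacing and capitalisation so variants are counted together.
            counts[phenotype.strip().capitalize()] += 1
    return {
        value: f"<{mask_below}" if n < mask_below else str(n)
        for value, n in counts.items()
    }

table = masked_counts(rows)
for value, count in table.items():
    print(f"{value}\t{count}")
```

Only the resulting count table would be put forward for export; the row-level input stays inside the Research Environment.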