Upcoming

We are actively developing more data to augment AggV3. In future, we hope to release:

  • Population structure and relatedness, and HQSNPs. While we provide a sample list with relevant participant information, we are also recalculating the Principal Components for the participants, along with inferred ancestry assignment. These will initially be based on overlapping High Quality SNPs (HQSNPs) derived from AggV2, whereas subsequent releases will use HQSNPs based on AggV3 itself.
    • Status: Active development and testing.
  • Mendelian inconsistencies and UPD cases.
  • Hardy-Weinberg equilibria.
  • Allele frequencies per source programme and inferred assigned superpopulation.
  • SiteQC FILTERs along with additional features which aid interpretation of the Genomics England siteQC metrics files.

Change log

23-01-2026: Beta release

We have released the DRAGEN 3.7.8 data and AggV3 to a subset of researchers within the community for initial testing and feedback. This package provides the following datasets:

  • DRAGEN 3.7.8 variant calls (on S3, accessible through CloudOS).
  • DRAGEN 3.7.8 CRAMs (on SequenceStore, accessible through CloudOS).
  • AggV3 (on S3, accessible through CloudOS).
    • Main delivery package provided by Illumina.
      • Multiallelic and biallelic msVCFs and PGENs.
      • Machine-learning recalibrated single-sample gVCFs used as input for AggV3.
    • Auxiliary data produced by Genomics England.
      • SiteQC metrics. The primary delivery of AggV3 holds few site-level metrics, so Genomics England provide an additional fileset of site-level metrics (e.g. MedianDP, MedianGQ, AB Ratio) that can be easily queried. This initial release does not yet contain any FILTERs; these will be provided in a subsequent release, once we have analysed the current site-level metrics.
      • Sample list, complete with related identifiers, sample source programme and other participant information.
      • SampleQC metrics for each sample, summarised in a single table.
      • Functional annotation VCFs per subshard.

Known bugs or features

We have not committed to resolving all of these bugs, but it may be useful to know about them for your analysis.

  • Functional annotation. Some annotations are defined as Type=String where they should be Type=Float. This is due to a limitation in VEP, which assigns Type=String to all custom annotations by default. Take this into account when filtering with bcftools +split-vep, because a String-typed value is not compared numerically unless it is first converted to a number.
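
If you filter the extracted annotations in your own scripts, the string-typed values need converting before any numeric comparison. The following is a minimal Python sketch of that conversion, not part of the AggV3 delivery. The helper names (to_float, passes_threshold) are hypothetical, and it assumes that missing values appear as an empty string or "." and that multiple values within one field are joined with "&", as VEP does. Taking the maximum of multiple values is an illustrative choice only. Within bcftools itself, the +split-vep --columns option is documented as accepting a type suffix such as FIELD:Float; check the plugin help for your installed version.

```python
from typing import Optional

# Values treated as "no data" (assumption: empty string or "." mark missing values).
MISSING = {"", "."}


def to_float(raw: str) -> Optional[float]:
    """Convert a string-typed annotation value to a float.

    Returns None when the value is missing or contains no numeric part.
    Multiple "&"-separated values are reduced to their maximum
    (an illustrative choice; pick the reduction that suits your analysis).
    """
    if raw in MISSING:
        return None
    values = []
    for part in raw.split("&"):
        try:
            values.append(float(part))
        except ValueError:
            # Skip non-numeric parts rather than failing the whole record.
            continue
    return max(values) if values else None


def passes_threshold(raw: str, threshold: float) -> bool:
    """True only when the value is present and numerically below the threshold."""
    value = to_float(raw)
    return value is not None and value < threshold


# Comparing the raw strings is lexicographic and gives the wrong answer here:
# 9e-05 is numerically smaller than 0.001, but "9" sorts after "0".
string_comparison = "9e-05" < "0.001"
numeric_comparison = passes_threshold("9e-05", 0.001)
```

In this sketch a missing value never passes the threshold. Whether missing annotations should be kept or dropped depends on the analysis, so adjust passes_threshold accordingly.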