File Manifest
The root file path for all aggV2 data is:
/gel_data_resources/main_programme/aggregation/aggregate_gVCF_strelka/aggV2/
Prepend this root to an extended file path from the table to get the full file path.
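As a minimal sketch of this path construction, the Python snippet below joins the root to one extended file path taken from the table on this page. The names `AGGV2_ROOT` and `aggv2_path` are hypothetical helpers introduced only for illustration; they are not part of any Genomics England tooling.

```python
from pathlib import PurePosixPath

# Root path for all aggV2 data, as quoted on this page.
AGGV2_ROOT = PurePosixPath(
    "/gel_data_resources/main_programme/aggregation/aggregate_gVCF_strelka/aggV2/"
)


def aggv2_path(extended: str) -> str:
    """Return the full path for an extended file path from the manifest table.

    Hypothetical helper: it only joins strings and does not check that the
    file exists.
    """
    # Strip any leading slash so the join cannot discard the root.
    return str(AGGV2_ROOT / extended.lstrip("/"))


# Example: the aggV2 sample list entry from the table.
print(aggv2_path("additional_data/sample_list/aggV2_sampleIds_mpv10_78195.tsv"))
```

Entries that end in `gel_mainProgramme_aggV2_` are filename prefixes rather than complete files, so the joined result for those rows is itself only a prefix.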
Files and descriptions | Extended file path |
---|---|
Aggregated genomic data | genomic_data/gel_mainProgramme_aggV2_ |
Aggregated functional annotation data using VEP 98 | functional_annotation/VEP/gel_mainProgramme_aggV2_ |
Aggregated functional annotation data using VEP 99 | functional_annotation/VEP_99/gel_mainProgramme_aggV2_ |
Test data: five chunks of 1,000 variants each across 78,195 samples. Useful for testing scripts and workflows. Index files are also present. | additional_data/test_data/gel_mainProgramme_aggV2_ |
Sample QC statistics. A tab-delimited version of the aggregate_gvcf_sample_stats table in LabKey. | additional_data/aggregate_gvcf_sample_stats/aggregate_gvcf_sample_stats_v10_78195.tsv |
Chunk names. A seven-column tab-delimited file of chunk names in aggV2 with full file paths to genotype and functional annotation VCFs. BED format (0-based coordinates). | additional_data/chunk_names/aggV2_chunk_names.bed |
Chunk names. A seven-column tab-delimited file of chunk names in aggV2 with full file paths to genotype and functional annotation VCFs. | additional_data/chunk_names/aggV2_chunk_names.tsv |
aggV2 sample list. All sample IDs in aggV2. | additional_data/sample_list/aggV2_sampleIds_mpv10_78195.tsv |
XX female participant list | additional_data/sample_sex/xx_females_illumina_ploidy_samples_40653.tsv |
XY male participant list | additional_data/sample_sex/xy_males_illumina_ploidy_samples_35822.tsv |
High LD exclusion regions | additional_data/PCs_relatedness/MichiganLD_liftover_exclude_regions.txt |
High confidence independent (MAF > 0.05) SNP binary files | additional_data/HQ_SNPs/GELautosomes_LD_pruned_1kgp3Intersect_maf0.05_mpv10.* |
High confidence independent (MAF > 0.01) SNP binary files | additional_data/HQ_SNPs/MAF1/GELautosomes_LD_pruned_1kgp3Intersect_maf0.01_mpv10.* |
PCs 1-50 across all aggV2 participants | additional_data/PCs_relatedness/PCA/GEL_aggV2_MAF5_mp10.eigenvec |
Eigenvalues for unrelated aggV2 participants | additional_data/PCs_relatedness/PCA/GEL_aggV2_MAF5_mp10.eigenval |
Proportion of variance explained for PCs on unrelated aggV2 participants | additional_data/PCs_relatedness/PCA/GEL_aggV2_MAF5_mp10.propvar |
Pairwise kinship estimates for related individuals (threshold > 0.0442) | additional_data/PCs_relatedness/relatedness/GEL_aggV2_MAF5_mp10_0.0442.kin0 |
Kinship matrix for all individuals in aggV2 (stored in triangle, binary format) | additional_data/PCs_relatedness/relatedness/GEL_aggV2_MAF5_mp10.king.bin |
List of related sample platekeys (threshold > 0.0442) | additional_data/PCs_relatedness/relatedness/GEL_aggV2_MAF5_mp10.king.cutoff.related.id |
List of unrelated sample platekeys (threshold < 0.0442) | additional_data/PCs_relatedness/relatedness/GEL_aggV2_MAF5_mp10.king.cutoff.unrelated.id |
All platekeys assessed for relatedness | additional_data/PCs_relatedness/relatedness/GEL_aggV2_MAF5_mp10.king.id |
Eigenvalues from PCA on 1KGP3 unrelated individuals using aggV2 HQ SNPs | additional_data/ancestry/1KGP3_PCs/1KGP3_MAF5.eigenval |
Eigenvectors from PCA on 1KGP3 unrelated individuals using aggV2 HQ SNPs | additional_data/ancestry/1KGP3_PCs/1KGP3_MAF5.eigenvec |
PC loadings from PCA on 1KGP3 unrelated individuals using aggV2 HQ SNPs | additional_data/ancestry/1KGP3_PCs/1KGP3_MAF5.pcl |
Eigenvectors from projecting aggV2 samples onto the PC loadings of 1KGP3 unrelated individuals, using aggV2 HQ SNPs | additional_data/ancestry/1KGP3_projection_GEL/GEL_aggV2_proj_on_1KGP3_MAF5_mp10.eigenvec |
Genetically inferred ancestry probabilities based on super-populations from the 1KGP3 | additional_data/ancestry/MAF5_superPop_predicted_ancestries.tsv |
Genetically inferred ancestry probabilities based on sub-populations from the 1KGP3 | additional_data/ancestry/MAF1/MAF1_actg_filtered_subPop_predicted_ancestries.tsv |
The VEP severity scale used in the bcftools +split-vep plugin. | additional_data/VEP_severity_scale_2020.txt |
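
The chunk names files in the table are described as seven-column, tab-delimited files. The Python sketch below reads such a file and checks that every row has seven fields. It deliberately assigns no meaning to individual columns, because the column order is not given on this page; the function name `read_chunk_names` is hypothetical, and skipping blank lines is an assumption.

```python
from typing import Iterator, List

# The manifest describes the chunk names files as having seven columns.
EXPECTED_COLUMNS = 7


def read_chunk_names(path: str) -> Iterator[List[str]]:
    """Yield each row of a seven-column tab-delimited file as a list of strings.

    Hypothetical helper: it makes no assumption about what each column holds
    or whether the first row is a header.
    """
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # assumption: blank lines carry no data
            fields = line.split("\t")
            if len(fields) != EXPECTED_COLUMNS:
                raise ValueError(
                    f"{path}:{line_number}: expected {EXPECTED_COLUMNS} "
                    f"fields, found {len(fields)}"
                )
            yield fields
```

Calling `list(read_chunk_names(full_path))` on the full path to `aggV2_chunk_names.tsv` (root plus extended file path) would return one list per row. Inspect the first row before relying on column positions.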