Aggregate Variant Testing inputs file¶
The inputs.json file is large and exposes many options that the user can edit. An example of a valid input file is shown below ("Example inputs file"), followed by a section-by-section breakdown ("Components of the input file") with an explanation of each input. Where appropriate, the documentation refers to an external source (such as the SAIGE-GENE options).
IMPORTANT NOTE
The inputs file is a JSON file, and JSON does not support comments. All comments in the section "Components of the input file", introduced by an arrow "<-", are added only for explanatory purposes. Make sure you do not include these comments in your actual input file.
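Because a leftover comment makes the file invalid JSON, it can help to confirm that the file parses before running the workflow. The following is a minimal sketch using Python's standard `json` module; the function name `check_inputs_file` is illustrative and not part of the workflow.

```python
import json


def check_inputs_file(path):
    """Return the parsed inputs if the file is valid JSON, else raise ValueError."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # A leftover "<-" explanatory comment is one likely cause of this failure.
        raise ValueError(
            f"{path} is not valid JSON "
            f"(line {err.lineno}, column {err.colno}): {err.msg}"
        ) from err
```

Calling `check_inputs_file("inputs.json")` either returns the inputs as a dictionary or reports the line and column where parsing stopped.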
Example inputs file¶
Example input file
Components of the input file¶
The input file has five parts, which correspond to the five parts of the aggregate variant testing workflow. The inputs for each part are prefixed as follows:
- master_aggregate_variant_testing
- master_aggregate_variant_testing.part_1
- master_aggregate_variant_testing.part_2
- master_aggregate_variant_testing.part_3
- master_aggregate_variant_testing.part_4
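To see which inputs belong to which part, the keys of a parsed inputs file can be grouped by these prefixes. The sketch below assumes each key has the form `<prefix>.<input name>`; the helper `group_by_part` and the `unrecognised` bucket are illustrative only.

```python
from collections import defaultdict

# Most specific prefixes first, so part_N keys are not swallowed by the main prefix.
PREFIXES = [
    "master_aggregate_variant_testing.part_1",
    "master_aggregate_variant_testing.part_2",
    "master_aggregate_variant_testing.part_3",
    "master_aggregate_variant_testing.part_4",
    "master_aggregate_variant_testing",
]


def group_by_part(inputs):
    """Group input keys by workflow prefix (assumes '<prefix>.<input name>' keys)."""
    groups = defaultdict(dict)
    for key, value in inputs.items():
        for prefix in PREFIXES:
            if key.startswith(prefix + "."):
                # Strip the prefix and the separating dot to get the input name.
                groups[prefix][key[len(prefix) + 1:]] = value
                break
        else:
            # No known prefix matched this key.
            groups["unrecognised"][key] = value
    return dict(groups)
```

Any key that lands in `unrecognised` is worth checking for a typo in its prefix.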
Main workflow file¶
This section contains the following input variables with explanations:
Example main workflow file inputs
Workflow part 1 inputs - Translating inputs to chromosomal regions¶
This section contains the following input variables with explanations:
Part 1 inputs
Only one of "chromosomes_input_file", "genes_input_file", and "coordinates_input_file" needs to be specified. For the file path, both relative and absolute paths are accepted.
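A quick pre-flight check can report which of the three region inputs is set. This is a sketch under several assumptions that the documentation does not confirm: that the full keys carry the `master_aggregate_variant_testing.part_1.` prefix, that an unspecified input is either absent, null, or an empty string, and that setting more than one should be flagged. The function names are illustrative.

```python
REGION_INPUTS = (
    "chromosomes_input_file",
    "genes_input_file",
    "coordinates_input_file",
)
# Assumed full-key prefix for part 1 inputs.
PART_1_PREFIX = "master_aggregate_variant_testing.part_1."


def specified_region_inputs(inputs):
    """Return the names of the region inputs that have a non-empty value."""
    return [name for name in REGION_INPUTS if inputs.get(PART_1_PREFIX + name)]


def check_region_inputs(inputs):
    """Return the single specified region input, or raise ValueError."""
    found = specified_region_inputs(inputs)
    if len(found) != 1:
        # Assumption: zero or several specified inputs both count as a mistake.
        raise ValueError(
            f"expected exactly one of {', '.join(REGION_INPUTS)}; found {found}"
        )
    return found[0]
```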
Workflow part 2 inputs - Filtering¶
This section contains the following input variables with explanations:
Part 2 inputs
The workflow uses a Python script to filter on VEP functional annotations. This affects how you specify filters on numbers versus strings.
Numbers are straightforward. The syntax is as follows: {"score": "gnomADg_AF", "condition": "<0.001"}. You can filter numbers with any valid comparison operator (==, !=, <, >, <=, >=).
Strings behave a little differently. The syntax is as follows: {"score": "LoF", "condition": "==\"HC\""}. Note the escaped quotes (\") around the string that you want to match. These are required because of how the underlying Python script evaluates the condition. If they are omitted, you will get an error like the following: "Error: object 'HC' not found"
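If you generate filter entries programmatically, letting a JSON serialiser do the escaping avoids hand-typing the `\"` sequences. The sketch below uses Python's standard `json` module; the helper names `number_filter` and `string_filter` are illustrative and not part of the workflow.

```python
import json


def number_filter(score, operator, threshold):
    """Build a numeric filter entry, e.g. condition '<0.001'."""
    return {"score": score, "condition": f"{operator}{threshold}"}


def string_filter(score, operator, value):
    """Build a string filter entry; the value is wrapped in double quotes."""
    return {"score": score, "condition": f'{operator}"{value}"'}


# json.dumps escapes the inner double quotes as \" when writing JSON.
print(json.dumps(number_filter("gnomADg_AF", "<", "0.001")))
# prints {"score": "gnomADg_AF", "condition": "<0.001"}
print(json.dumps(string_filter("LoF", "==", "HC")))
# prints {"score": "LoF", "condition": "==\"HC\""}
```

The second printed line matches the string syntax shown above, with the quotes around HC escaped for JSON.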
Workflow part 3 inputs - GRM creation¶
This section contains the following input variables with explanations:
Part 3 inputs
Workflow part 4 inputs - Aggregate Variant Tests with SAIGE-GENE¶
Part 4 inputs
Help and support¶
Please reach out via the Genomics England Service Desk for any issues related to running this script, including "AVT_workflow" in the title/description of your inquiry.