From: X-CAP improves pathogenicity prediction of stopgain variants
Feature type | Feature name | Description |
---|---|---|
Zygosity | zygosity | Binary variable distinguishing homozygous (and hemizygous) variants from heterozygous variants; provided as input when known, or predicted as a function of benign stopgain alleles at the same position in the training set when unknown |
Gene/exon essentiality | oe | Number of benign stopgains in the training set along the gene divided by gnomAD’s expected number of loss-of-function variants |
 | RVIS | Measure of gene intolerance to functional variation |
 | OMIM gene map | Two non-exclusive, binary features indicating whether a recessive or dominant disease listed in the OMIM Gene Map is caused by mutations in this gene |
 | monoclass pathogenic | Transcript or exon contains no benign variants and at least one pathogenic variant within the training set |
 | can be spliced out | Variant is skipped in at least one isoform of the gene |
Variant location | distance from CDS start/end | Number of coding nucleotides from CDS start and end |
 | relative CDS location | Distance from CDS start divided by CDS length |
 | distance from exon start/end | Number of coding nucleotides from exon start and end |
 | relative exon location | Distance from exon start divided by exon length |
 | exon length | Number of nucleotides in overlapped exon |
 | exon number | Index of the exon that the variant overlaps |
 | # transcript exons | Number of exons in overlapped transcript |
 | chromosome | Ternary variable indicating whether the variant is located on an autosome, the X chromosome, or the Y chromosome |
NMD | distance from last exon-exon junction | Number of coding nucleotides upstream from last exon-exon junction (negative if downstream of junction) |
 | % transcripts with NMD | Percentage of overlapped transcripts in which the variant is >50 bp upstream of the last exon-exon junction |
Stop codon read-through | stop codon | One-hot encoding of the new stop codon introduced by the stopgain |
Alternative translation reinitiation | distance to next start codon | Number of base pairs between the variant and the next potential downstream start codon within the mRNA |
Cross-species conservation | phyloP | Base-pair conservation across vertebrates of upstream, downstream, and overlapped exon regions |
 | phastCons | Regional conservation across vertebrates of upstream, downstream, and overlapped exon regions |
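
Several of the location, NMD, and stop-codon features above are simple arithmetic on transcript coordinates. The following is a minimal sketch of how they could be computed, not X-CAP's actual implementation. The function names, the 0-based coding offset `cds_pos`, and the representation of a transcript as a list of per-exon coding lengths are all assumptions made for illustration; the table does not specify coordinate conventions.

```python
from typing import Dict, List

# The three DNA stop codons; the fixed ordering defines the one-hot layout.
STOP_CODONS = ("TAA", "TAG", "TGA")


def one_hot_stop_codon(codon: str) -> Dict[str, int]:
    """One-hot encode the stop codon introduced by a stopgain variant."""
    codon = codon.upper().replace("U", "T")  # accept RNA spelling as well
    if codon not in STOP_CODONS:
        raise ValueError(f"not a stop codon: {codon!r}")
    return {c: int(c == codon) for c in STOP_CODONS}


def relative_location(offset: int, length: int) -> float:
    """Distance from the start of a region (CDS or exon) divided by its length."""
    if length <= 0:
        raise ValueError("length must be positive")
    return offset / length


def distance_from_last_junction(cds_pos: int, exon_coding_lengths: List[int]) -> int:
    """Coding nucleotides upstream of the last exon-exon junction.

    Negative when the variant lies downstream of that junction.
    Assumption: cds_pos is a 0-based offset into the coding sequence and
    exon_coding_lengths lists the coding length of each exon in transcript order.
    For a single-exon transcript the junction is taken to be at offset 0,
    which is a convention chosen here, not one stated in the table.
    """
    last_junction = sum(exon_coding_lengths[:-1])
    return last_junction - cds_pos


def percent_transcripts_with_nmd(distances: List[int], threshold: int = 50) -> float:
    """Percentage of transcripts where the variant is >threshold bp upstream
    of the last exon-exon junction (the 50 bp rule from the table)."""
    if not distances:
        raise ValueError("need at least one transcript")
    hits = sum(d > threshold for d in distances)
    return 100.0 * hits / len(distances)
```

For example, with hypothetical exon coding lengths of 120, 80, and 60 nucleotides, a variant at coding offset 100 sits 100 nucleotides upstream of the last junction and would count toward the NMD percentage, whereas one at offset 230 yields a distance of -30 and would not.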