From: Choice of transcripts and software has a large effect on variant annotation

Annotation examples. These screenshots from the ENSEMBL web browser [40] show two examples of variant annotation. (A) The variant NC_000011.9:g.57983194A>G (rs7103033) is relatively straightforward to annotate. It is the final base of the final exon in both transcripts at this position (a CCDS transcript (green) and a ‘merged’ ENSEMBL/Havana (GENCODE) transcript (gold)). The final codon changes from TGA (stop codon) to TGG (tryptophan), so this is unambiguously a stop-loss variant. Using the ENSEMBL transcript set, both ANNOVAR and VEP correctly annotate this variant as stop-loss. (B) The variant NC_000006.11:g.30558477_30558478insA (rs72545970) is more difficult to annotate. It lies at the penultimate base of the exon for all but one of the transcripts shown. Because it is a single-base insertion, it could be annotated as a frameshift variant. Because it is an insertion in a stop codon, it could also be annotated as a stop-loss variant. In fact, the final codon, TGA (stop codon), still reads TGA after the insertion of the single base A, so the variant is actually synonymous. When using ENSEMBL transcripts, ANNOVAR annotates it as a frameshift insertion and VEP as stop-loss. Each browser image consists of several tracks, which provide base-resolution information about the DNA sequence. Two tracks, ‘Sequence (+)’ and ‘Sequence (-)’, show the DNA sequence on the forward and reverse strands, respectively. Above these, a track shows start and stop codons, and above that, several tracks indicate the presence and structure of different transcripts (labelled as ‘Genes’ and ‘CCDS set’; transcripts are read from left to right). The ‘hollowed-out’ parts of transcripts indicate non-coding sequences. Below the DNA sequence, the track ‘Sequence variant’ shows known sequence variants from dbSNP [17] and the 1000 Genomes Project [18]. The ‘Variation Legend’ and ‘Gene Legend’ provide more information about features shown in different colours in the browser.
CCDS, Consensus Coding Sequence; UTR, untranslated region.
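
The reasoning behind the two panels can be illustrated with a minimal sketch that applies each kind of change to a toy transcript and compares the translated proteins. This is not how ANNOVAR or VEP work internally. The sequence `REF`, the positions, and the helper names (`translate`, `apply_snv`, `apply_insertion`, `classify`) are all hypothetical and chosen only so that the stop codon is TGA, as in both examples; the real genomic sequences are not reproduced here.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate from position 0 up to and including the first stop codon ('*')."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def apply_snv(seq, pos, alt):
    """Replace the base at 0-based index pos with alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def apply_insertion(seq, pos, inserted):
    """Insert bases immediately before 0-based index pos."""
    return seq[:pos] + inserted + seq[pos:]


def classify(ref_seq, alt_seq):
    """Classify by protein-level effect only (a deliberately simplified rule set)."""
    ref_protein = translate(ref_seq)
    alt_protein = translate(alt_seq)
    if alt_protein == ref_protein:
        return "synonymous"
    n = len(ref_protein) - 1  # index of the reference stop
    if (ref_protein.endswith("*")
            and alt_protein[:n] == ref_protein[:n]
            and alt_protein[n:n + 1] != "*"):
        return "stop-loss"
    return "other"


# Hypothetical toy transcript: ATG GCC TGA (stop), followed by downstream bases.
REF = "ATGGCCTGAGCTTAA"

# Panel A analogue: A>G at the last base of the stop codon (TGA -> TGG).
snv_alt = apply_snv(REF, 8, "G")

# Panel B analogue: insertion of A between the G and A of the stop codon.
ins_alt = apply_insertion(REF, 8, "A")

print(classify(REF, snv_alt))
print(classify(REF, ins_alt))
```

Classifying by the translated product rather than by variant type alone is what separates the two cases here: a rule such as "any single-base insertion in coding sequence is a frameshift" never inspects the resulting codon.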

