Fig. 1 | Genome Medicine

Fig. 1

From: Best practices for variant calling in clinical sequencing


Standard pipelines for NGS analysis. a Alignment and pre-processing of NGS data for an individual sample. Raw sequence data in FASTQ format are aligned to the reference sequence, with the resulting alignments typically stored in binary alignment/map (BAM) file format. Marking of duplicates in the BAM file is a critical step to account for duplicate reads of the same fragment. Base quality score recalibration (BQSR) and local realignment around indels are computationally expensive steps that may marginally improve variant calls. At the conclusion of these steps, the file is ready for variant analysis. b Variant calling in NGS trio sequencing. In this common study design, variants are called jointly (simultaneously) in a proband and both parents, which enables the phasing of variants by parent of origin. The initial variant calls are typically filtered to remove a number of recurrent artifacts associated with short-read alignment and may be visually confirmed by manual review of the sequence alignments. Orthogonal validation may be performed to confirm the variant and its segregation within the family. De novo alterations should be aggressively filtered to remove both artifactual calls in the proband (false positives) and inherited variants that were under-called in a parent (false negatives). In addition to manual inspection of alignments, most de novo mutations are independently verified by orthogonal validation techniques, such as Sanger sequencing. c Somatic variant calling in matched tumor-normal pairs. Identification of somatic alterations in tumors requires specialized variant callers that consider aligned data from the tumor and normal samples simultaneously. Candidate somatic variants are filtered and visually reviewed to remove common alignment artifacts as well as germline variants under-called in the normal sample. The resulting variants are typically validated by orthogonal approaches, which may require specialized methods for low-frequency variants.
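The de novo filtering described in panel b can be illustrated with a minimal sketch. It is not the pipeline shown in the figure: the `TrioCall` record, the `is_candidate_de_novo` function, and every threshold (minimum depth, minimum proband allele fraction, maximum alternate reads in a parent) are hypothetical choices made only for illustration. The idea is that a candidate needs adequate coverage in all three samples, clear support in the proband (guarding against false positives), and essentially no alternate reads in either parent (guarding against inherited variants that were under-called).

```python
from dataclasses import dataclass


@dataclass
class TrioCall:
    """Read counts at one variant site for a proband and both parents.

    Hypothetical record type; a real pipeline would parse these from a VCF.
    """
    proband_alt: int
    proband_depth: int
    mother_alt: int
    mother_depth: int
    father_alt: int
    father_depth: int


def is_candidate_de_novo(
    call: TrioCall,
    min_depth: int = 20,            # assumed threshold, not from the source
    min_proband_vaf: float = 0.25,  # assumed threshold, not from the source
    max_parent_alt: int = 0,        # assumed threshold, not from the source
) -> bool:
    """Return True if the site passes simple de novo candidate filters."""
    depths = (call.proband_depth, call.mother_depth, call.father_depth)

    # Low coverage in any sample makes both kinds of error more likely.
    if min(depths) < min_depth or call.proband_depth == 0:
        return False

    # Weak support in the proband suggests an artifactual call.
    proband_vaf = call.proband_alt / call.proband_depth
    if proband_vaf < min_proband_vaf:
        return False

    # Alternate reads in a parent suggest an inherited, under-called variant.
    return (
        call.mother_alt <= max_parent_alt
        and call.father_alt <= max_parent_alt
    )


if __name__ == "__main__":
    site = TrioCall(proband_alt=18, proband_depth=40,
                    mother_alt=0, mother_depth=35,
                    father_alt=0, father_depth=38)
    print(is_candidate_de_novo(site))
```

Sites that pass filters of this kind would still be candidates only; as the caption notes, they are normally reviewed manually and confirmed by an orthogonal method such as Sanger sequencing.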
