Figure 1 | Genome Medicine

Figure 1

From: ClinSeK: a targeted variant characterization framework for clinical sequencing

Schematic overview of ClinSeK. (A) The four major steps of the ClinSeK workflow for analyzing single nucleotide variants (SNVs) and insertions and deletions (indels) from DNA-sequencing data. (B) Illustration of k-mer screening, targeted alignment and variant calling. Sequencing reads (blue arrows) in raw FASTQ files are screened for the presence of k-mers created from target sites of interest (dark, vertical dashed lines), which are predefined based on variant databases such as ClinVar and COSMIC. Reads that do not contain any target k-mers (grey arrows) are discarded. Reads associated with a target site (red vertical bar) are aligned against the corresponding local reference sequences (grey horizontal bars), and potential variants (red dots) are identified. Reads are then realigned with their mates (arrows in opposite directions) and against paralogous sites (green vertical bars) from other chromosomes. Variants are finally called from reads of high mapping quality (dark blue arrows). (C) Illustration of ClinSeK targeted breakpoint analysis. DNA or RNA sequencing reads are screened for the presence of k-mers in the reference and in the variant alleles near the breakpoints or fusion junctions. Reads that do not contain any target k-mers are discarded. The remaining reads are preferentially aligned to the wild-type reference (orange arrows) and to the fusion breakpoint (magenta bar) sequence (red arrows), and are then counted and compared. (D) ClinSeK output. Reads and their alignments at the target sites are output in BAM files. Variants are output in VCF format and are further included in the clinical report.
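
The k-mer screening step in panel (B) can be illustrated with a minimal Python sketch. It is not ClinSeK's actual implementation: the function names, the dictionary-based index, the k-mer length of 8 and the toy sequences are all assumptions made for illustration. The sketch builds a lookup of k-mers drawn from local reference sequences around each target site (both strands), keeps reads that share at least one k-mer with a target, and discards the rest.

```python
from typing import Dict, Iterable, Set, Tuple

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_target_index(targets: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the target sites whose local sequence contains it.

    `targets` maps a hypothetical site name to the local reference sequence
    around that site. K-mers from both strands are indexed.
    """
    index: Dict[str, Set[str]] = {}
    for site, ref in targets.items():
        for kmer in kmers(ref, k) | kmers(reverse_complement(ref), k):
            index.setdefault(kmer, set()).add(site)
    return index


def screen_reads(
    reads: Iterable[Tuple[str, str]],
    index: Dict[str, Set[str]],
    k: int,
) -> Dict[str, Set[str]]:
    """Keep reads sharing at least one k-mer with a target; drop the others.

    Returns a mapping from read name to the set of matching target sites.
    """
    kept: Dict[str, Set[str]] = {}
    for name, seq in reads:
        sites: Set[str] = set()
        for kmer in kmers(seq, k):
            sites |= index.get(kmer, set())
        if sites:
            kept[name] = sites
    return kept


if __name__ == "__main__":
    K = 8  # assumed k-mer length, chosen only for this toy example
    targets = {"siteA": "ACGTACGTTAGCCGATAGGCT"}  # made-up local reference
    reads = [
        ("r1", "CGTTAGCCGATA"),                      # forward-strand match
        ("r2", "TTTTTTTTTTTT"),                      # no target k-mer
        ("r3", reverse_complement("TAGCCGATAGGC")),  # reverse-strand match
    ]
    index = build_target_index(targets, K)
    for name, sites in screen_reads(reads, index, K).items():
        print(name, sorted(sites))
```

In this sketch the retained reads would then be passed to a local aligner against the reference sequence of each matching site. The same screening idea extends to panel (C) by indexing k-mers from both the wild-type and the fusion-junction sequences and counting the reads assigned to each.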
