Fig. 3 | Genome Medicine

Fig. 3

From: ReporTree: a surveillance-oriented tool to strengthen the linkage between pathogen genetic clusters and epidemiological data

Results of benchmarking ReporTree's alignment-based core SNP workflow on a multiple sequence alignment of 1788 M. tuberculosis samples and 88,562 informative nucleotide positions. A: ReporTree running times for the 10 replicates of each sample subset with a site inclusion of 1.0 (left) and 0.95 (right). The flag “all” indicates subsets for which ReporTree obtained clusters at all possible thresholds; the flag “single_thr” indicates subsets for which ReporTree obtained clusters at the potential “transmission chain” level (12 SNP differences); and the flag “stability” indicates subsets for which ReporTree obtained clusters at all possible thresholds but generated reports only for those corresponding to stability regions. B: ReporTree running times according to the number of variant sites that remained after alignment cleaning and were used for clustering. Technical notes: 1. The “site-inclusion” argument defines which informative nucleotide sites are kept in the alignment, based on the minimum proportion of samples per site without missing data (e.g., 1.0 reflects a “true” core alignment in which all variant sites contain exclusively ATCG, and 0.95 reflects a core alignment tolerating 5% of undefined nucleotides per site). 2. The M. tuberculosis dataset used in this benchmarking is described in [42].
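
The site-inclusion rule in technical note 1 can be illustrated with a minimal sketch. This is not ReporTree's implementation: the function names `sites_to_keep` and `apply_site_filter`, the dictionary-based alignment, and the toy sequences are all hypothetical, and the sketch covers only the missing-data filter, not the identification of variant sites. It assumes that any character other than A, C, G, or T counts as missing data.

```python
from typing import Dict, List

# Assumption: only these four characters count as defined nucleotides;
# N, gaps, and any other symbols are treated as missing data.
DEFINED = set("ACGT")


def sites_to_keep(alignment: Dict[str, str], site_inclusion: float) -> List[int]:
    """Return the column indices whose proportion of samples with a
    defined nucleotide is at least `site_inclusion` (hypothetical helper)."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences in an alignment must have equal length")

    n_samples = len(seqs)
    kept = []
    for i in range(length):
        n_defined = sum(1 for s in seqs if s[i].upper() in DEFINED)
        if n_defined / n_samples >= site_inclusion:
            kept.append(i)
    return kept


def apply_site_filter(alignment: Dict[str, str], kept: List[int]) -> Dict[str, str]:
    """Build a reduced alignment containing only the kept columns."""
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


# Toy alignment (invented data): column 2 is defined in 2 of 4 samples,
# column 4 in 3 of 4 samples, and the remaining columns in all samples.
toy = {
    "s1": "ACGTA",
    "s2": "ACNTA",
    "s3": "AC-TA",
    "s4": "ATGTN",
}

strict = sites_to_keep(toy, 1.0)     # only fully defined columns
relaxed = sites_to_keep(toy, 0.75)   # tolerates up to 25% missing per column
strict_alignment = apply_site_filter(toy, strict)
```

With a threshold of 1.0, only columns with no missing data in any sample survive. Lowering the threshold lets more columns pass the filter, which is consistent with the figure reporting running times separately for site inclusions of 1.0 and 0.95.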
