Table 1 Summary of ReporTree input types and respective clustering options, with indication of the main outputs provided by this tool

From: ReporTree: a surveillance-oriented tool to strengthen the linkage between pathogen genetic clusters and epidemiological data

Inputs: metadata and (phylo)genetic data

(Phylo)genetic input, one of:

- Multiple sequence alignment (e.g., core SNP alignment)
- SNP/allele matrix (e.g., derived from cg/wgMLST analysis)
- List of mutations or VCFs
- Pairwise distance matrix (only for HC)

Clustering options for these inputs:

- Minimum spanning tree (using GrapeTree)
- Hierarchical clustering (using several methods, such as single-linkage)

(Phylo)genetic input:

- Newick tree (e.g., SNP-scaled tree or dendrogram)

Clustering option for this input:

- Distance between leaves and root or between tree nodes (using TreeCluster)

Main outputs

- Genetic clusters at any (or all) possible distance threshold(s) (partitions table)
- Updated metadata table with clustering information (and nomenclature)
- Summary reports with the statistics/trends for the derived genetic clusters
- Nomenclature history (record of changes in cluster composition and codes between runs)
- Summary reports and in-depth cluster analysis for samples of interest
- Count/frequency matrices for the derived genetic clusters or for any other indicated grouping variable
- Regions of cluster stability
- Newick tree (when applicable)
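
To make the idea of a partitions table concrete, the following is a minimal Python sketch of single-linkage clustering applied to a pairwise distance matrix at several distance thresholds. It is not ReporTree's implementation: the function name `single_linkage_partitions`, the `cluster_N` labels, the toy distances, and the choice to treat a threshold as "distance less than or equal to the cutoff" are all assumptions made for illustration. It relies on the fact that single-linkage clusters at a given cutoff are the connected components of the graph linking every pair of samples within that cutoff.

```python
from itertools import combinations


def single_linkage_partitions(samples, dist, thresholds):
    """Return {threshold: {sample: cluster_label}} (hypothetical helper).

    `dist` maps frozenset({a, b}) to the pairwise distance between a and b.
    Assumption: two samples are linked when their distance is <= threshold.
    """
    table = {}
    for t in thresholds:
        # Union-find forest: every sample starts as its own cluster.
        parent = {s: s for s in samples}

        def find(x):
            # Follow parent links to the root, halving the path on the way.
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        # Merge the clusters of every pair that lies within the threshold.
        for a, b in combinations(samples, 2):
            if dist[frozenset((a, b))] <= t:
                root_a, root_b = find(a), find(b)
                if root_a != root_b:
                    parent[root_b] = root_a

        # Name clusters in order of first appearance in `samples`.
        labels = {}
        table[t] = {}
        for s in samples:
            root = find(s)
            labels.setdefault(root, f"cluster_{len(labels) + 1}")
            table[t][s] = labels[root]
    return table


# Toy pairwise distances (e.g., SNP or allele differences); values are made up.
samples = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "C")): 4,
    frozenset(("B", "C")): 3,
    frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 9,
    frozenset(("C", "D")): 8,
}

partitions = single_linkage_partitions(samples, dist, thresholds=[0, 2, 3, 8])

# Print one row per sample and one column per threshold.
for s in samples:
    print(s, *(partitions[t][s] for t in [0, 2, 3, 8]), sep="\t")
```

Each column of the printed table is one partition of the samples. Comparing neighbouring columns shows where cluster membership changes and where it stays the same as the threshold grows.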