Fig. 2
From: Phylogenetically typing bacterial strains from partial SNP genotypes observed from direct sequencing of clinical specimen metagenomic data

The effect of increasing SNP loci on genetic distance estimation. A pairwise distance matrix for 255 E. coli genomes, based upon approximately 225,000 SNPs, was calculated and then correlated to matrices generated using fewer SNPs. SNP subsampling was performed 100 times at each level. Each subsampled matrix was then converted into a multi-FASTA file, and a distance matrix was calculated with mothur [33]. Distance matrices were compared with the Mantel test function in mothur, and the Pearson correlation value was calculated and plotted for each subsampling level. The mean and standard deviation across all iterations were calculated and plotted.
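
The subsampling procedure described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses NumPy rather than mothur, a small synthetic genotype matrix rather than the 255 E. coli genomes, and a simple proportion-of-differing-loci distance, all of which are assumptions made for illustration. It computes only the Pearson correlation between the off-diagonal entries of two distance matrices (the Mantel statistic) and omits the permutation test. The function names `pairwise_distance`, `matrix_correlation`, and `subsample_correlations` are hypothetical.

```python
import numpy as np


def pairwise_distance(snps):
    """Proportion of loci at which each pair of genomes differs.

    snps: (n_genomes, n_loci) integer array of allele codes.
    This distance measure is an assumption, not necessarily mothur's.
    """
    diff = snps[:, None, :] != snps[None, :, :]
    return diff.mean(axis=2)


def matrix_correlation(d1, d2):
    """Pearson correlation between the upper triangles of two distance matrices."""
    iu = np.triu_indices_from(d1, k=1)
    return np.corrcoef(d1[iu], d2[iu])[0, 1]


def subsample_correlations(snps, n_loci, n_iter, rng):
    """Mean and standard deviation of the correlation between the full-data
    distance matrix and matrices built from n_loci randomly chosen SNPs."""
    full = pairwise_distance(snps)
    r = np.empty(n_iter)
    for i in range(n_iter):
        cols = rng.choice(snps.shape[1], size=n_loci, replace=False)
        r[i] = matrix_correlation(full, pairwise_distance(snps[:, cols]))
    return r.mean(), r.std(ddof=1)


# Synthetic data (illustrative only): 20 genomes derived from 4 ancestral
# genotypes over 2,000 biallelic loci, each locus flipped with probability 0.05.
rng = np.random.default_rng(0)
n_genomes, n_total = 20, 2000
ancestors = rng.integers(0, 2, size=(4, n_total))
snps = ancestors[np.arange(n_genomes) % 4].copy()
flips = rng.random(snps.shape) < 0.05
snps[flips] = 1 - snps[flips]

# Correlation with the full-data matrix at increasing subsampling levels.
results = {}
for level in (5, 50, 500, n_total):
    results[level] = subsample_correlations(snps, level, n_iter=100, rng=rng)
    print(level, results[level])
```

Plotting the mean against the subsampling level, with the standard deviation as error bars, would give a curve of the kind the caption describes. The number of iterations per level (100) follows the caption; the subsampling levels and data dimensions are arbitrary choices for this sketch.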