Population genomics and antimicrobial resistance in Corynebacterium diphtheriae

Background Corynebacterium diphtheriae, the agent of diphtheria, is a genetically diverse bacterial species. Although antimicrobial resistance has emerged against several drugs including first-line penicillin, the genomic determinants and population dynamics of resistance are largely unknown for this neglected human pathogen. Methods Here, we analyzed the associations of antimicrobial susceptibility phenotypes, diphtheria toxin production, and genomic features in C. diphtheriae. We used 247 strains collected over several decades in multiple world regions, including the 163 clinical isolates collected prospectively from 2008 to 2017 in France mainland and overseas territories. Results Phylogenetic analysis revealed multiple deep-branching sublineages, grouped into a Mitis lineage strongly associated with diphtheria toxin production and a largely toxin gene-negative Gravis lineage with few toxin-producing isolates including the 1990s ex-Soviet Union outbreak strain. The distribution of susceptibility phenotypes allowed proposing ecological cutoffs for most of the 19 agents tested, thereby defining acquired antimicrobial resistance. Penicillin resistance was found in 17.2% of prospective isolates. Seventeen (10.4%) prospective isolates were multidrug-resistant (≥ 3 antimicrobial categories), including four isolates resistant to penicillin and macrolides. Homologous recombination was frequent (r/m = 5), and horizontal gene transfer contributed to the emergence of antimicrobial resistance in multiple sublineages. Genome-wide association mapping uncovered genetic factors of resistance, including an accessory penicillin-binding protein (PBP2m) located in diverse genomic contexts. Gene pbp2m is widespread in other Corynebacterium species, and its expression in C. glutamicum demonstrated its effect against several beta-lactams. A novel 73-kb C. diphtheriae multiresistance plasmid was discovered. Conclusions This work uncovers the dynamics of antimicrobial resistance in C. diphtheriae in the context of phylogenetic structure, biovar, and diphtheria toxin production and provides a blueprint to analyze re-emerging diphtheria. Supplementary information Supplementary information accompanies this paper at 10.1186/s13073-020-00805-7.


Fig S1. Construction strategy of plasmid pTGR5_pbp2m
The pbp2m gene was PCR amplified and combined with plasmid pTGR5 using Gibson assembly as indicated.

Fig S2. Biovar and tox status of the three strain subsets
In the upper panel, the numbers of strains are given separately for the three subsets of strains; the recent clinical isolates one (right hand side) is broken down by individual year. Colors correspond to biovars (see key) and shaded areas denote tox-positive isolates. In the lower panel, the percentage of tox-positive strains (red bars) and of tox-positive or tox-negative strains per biovar (see key), are given for the entire dataset and for the three subsets separately; shaded sectors correspond to tox-positive strains within each biovar.

Uncorrected phylogenetic tree
Recombinationpurged phylogenetic tree

Fig S3. Tanglegram
Tanglegram comparing the uncorrected phylogenetic tree (obtained with PhyML on the core genome alignment; left) and the recombination-purged tree obtained using ClonalFrameML (right). The topologies were identical, but the branch lengths were shorter after purging recombined regions (note that the scale of both trees is different). Fig S4. Phylogeny with the type strains of C. belfantii and C. rouxii. The phylogeny was obtained using PhyML from the core genome alignment obtained using ROARY. This tree indicated the root of the C. diphtheriae tree to be located on the branch leading to the two early-diverging strains CIP107521 and CIP107534. The phylogenetic tree and outer information correspond to those in Figure 2, with the addition of isolates names, geographic origins and year of isolation. The phylogenetic tree was obtained using IQtree version 2 from the core genome alignment produced using ROARY. Bootstrap values were obtained using IQtree bootstrap resampling option based on 58 replicates. Only bootstrap percentage values between 50 and 100 are represented at the nodes (see key). The circles around the tree display the dataset (internal circle), tox gene presence and toxin production (second circle), biovar (third circle) and the presence of 8 selected antimicrobial resistance genes (see names besides each circle).

Fig S7. Genomic difference between reference strains PW8 (biovar Mitis) and NCTC13129 (Gravis)
The genomic region of approx. 10 kb inserted in biovar Gravis strain NCTC13129 includes genes DIP0351, DIP0354 and spuA (DIP0357); these three accessory genes are strongly associated with biovar Gravis, as is the SNP at position 324,487.    The correlation matrix between antimicrobial resistance genotype and phenotype is based on the correlation for binary variables (in the case of resistance genes: 1, presence; 0, absence; in the case of antimicrobial drugs: 1, resistant/intermediate; 0, susceptible) using the 'corr.test' function (Pearson method, which for a pair of binary variables equates to the Phi coefficient) from the 'corrplot' R package. Significant correlations were visualized utilizing the 'corrplot' function from the same package. Blank squares represent correlations without statistical significance (p > 0.05). Positive correlation is depicted by blue circles, whereas red circles represent significant negative correlation. The size and strength of color represent the numerical value of the Phi correlation coefficient. Black rectangles group genes commonly found together in the same strain. Genes cmx and cmlA5 are known to be associated with chloramphenicol resistance, which was not tested here.    (DIP1497), pbp4 (DIP2005) and pbp4b (RS14485 = DIP0637). Pbp2m was not analyzed for amino-acid changes associated with penicillin resistance, as it corresponds to an accessory PBP.

Fig S14. Genetic context of the pbp2m gene in Corynebacterium
The genomic context of pbp2m in C. diphtheriae and other Corynebacterium strains that possess this gene is given for representative genomes of the diversity that was found. Genes pbp2m, blaB and lysR are represented with a dark red background; these three genes were always associated and constitute the pbp-containing unit (PCU). Gene ermX is in yellow. A putative helicase often associated with the PCU is represented in pink; a relaxase gene is shaded in green. Black arrows represent insertion sequence genes. Six groups were defined based on conserved features, as indicated. Dark grey parallelepiped joining different genomes represent homology levels, as indicated in the gradient key. Strains of the present study with identical structures as those represented are indicated in parentheses below the strain name of the representative genome. The scale bar represents 12 kb.