Figure 3

Comparative genomic analysis and mapping of ST306 and ST615 clonal complexes across five continents. (A) Maximum-likelihood phylogenetic trees showing the geographical clustering of, and relationship between, ST306 and ST615. A sequence cluster (SC) is a collection of isolates from the same clade that may be of different serotypes and STs. The SCs were identified using the sequence clustering software BAPS. (B) Pneumolysin sequence alignments. Nucleotide and amino acid sequence alignments for ST306 and ST615 pneumolysin compared with the D39 pneumolysin sequence. Left panel: nucleotides; right panel: amino acids. (C) Within-ST nucleotide sequence diversity, estimated by pairwise comparison of the genomes from each ST to determine the number of single nucleotide polymorphisms (SNPs) between pairs of isolates. The means and 95% confidence intervals for the SNP diversity plots are: ST615 (mean: 2883.60, lower: 1.00, upper: 5445.55) and ST306 (mean: 26.96, lower: 7.38, upper: 61.00); Student's t-test, p = 0.0014. (D) Phylogeny (dendrogram) of all serotype 1 Ply alleles (all STs, with D39 and TIGR4). A total of 488 clinical isolates were used for this analysis. For comparison, the Ply alleles of the laboratory strains TIGR4 (serotype 4) and D39 (serotype 2) are also included. Colours indicate the STs associated with each serotype 1 Ply allele. The specific amino acids that distinguish the Ply alleles are shown in the table below the Ply phylogeny. The serotype 1 Ply alleles were arbitrarily labelled PLY-S1x, where the suffix ‘x’ corresponds to the letters A to E for the five alleles identified.
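
The pairwise SNP comparison described for panel C can be illustrated with a minimal sketch. It assumes the genomes of one ST are already available as equal-length aligned sequences; the function names, the toy isolates and the choice to skip gaps and ambiguous bases are illustrative assumptions, not the pipeline used to produce the figure.

```python
from __future__ import annotations

from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry an unambiguous
    base (A, C, G or T) and the two bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (i, j): snp_distance(alignment[i], alignment[j])
        for i, j in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment standing in for the isolates of a single ST.
toy_alignment = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACCTNCGA",
}

distances = pairwise_snp_distances(toy_alignment)
mean_snps = sum(distances.values()) / len(distances)
```

The resulting per-pair distances are the kind of values that could then be summarised per ST as a mean and an interval, as reported in the caption.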