Fig. 1: Phylogenetic tree of ST95.

a Maximum likelihood tree of 612 ST95 strains constructed using 20,607 recombination-free core SNPs with 1000 bootstrap replicates and rooted using ED1a as an outgroup. The eight ST95 clades are coloured for clarity. Overall, the mean pairwise nucleotide divergence between genomes in the complete dataset was 1.5% (range 0–3.7%), and between genomes within each clade it was 0.9% (range 0.52–1.67%). Clades 1 and 2 exhibited the greatest divergence, with mean pairwise nucleotide divergences from genomes of other clades of 3.09% and 2.37%, respectively. The remaining clades were more closely related, with a mean nucleotide divergence of 1.6% (range 1.47–1.7%). The serotype of each strain is indicated with respect to capsule (K), O antigen (O) and flagellar antigen (H), with the major serotype(s) noted. Independent mutations in genes encoding proteins required for cellulose biosynthesis (BcsQABC) and modification (BcsEG) are also mapped. The predicted cellulose-attenuated phenotype (-ve) is summarised. b Truncating mutations in cellulose biosynthesis and pEtN modification genes in ST95.