Extended Data Fig. 1: Pangenome accumulation curves of S. flexneri, S. sonnei, S. boydii and S. dysenteriae.

Each curve shows the number of unique protein-coding genes in the pangenome as genomes are added one at a time in random order. The genome order was randomly permuted 100 times, with genomes sampled without replacement in each permutation. The x-axis displays the number of genomes; the y-axis shows the range (minimum to maximum) of the unique gene count across the permutations in (A) and the mean value in (B).
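
The permutation procedure described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `accumulation_curves` and `summarize`, the representation of each genome as a set of gene-family identifiers, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean


def accumulation_curves(genomes, n_permutations=100, seed=0):
    """Return one accumulation curve per random genome ordering.

    genomes: list of sets, each holding the gene-family IDs of one genome
    (a hypothetical input format, not taken from the original analysis).
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(n_permutations):
        # Sampling all genomes without replacement gives a random ordering.
        order = rng.sample(genomes, len(genomes))
        seen = set()
        counts = []
        for genes in order:
            seen |= genes              # add this genome's genes to the pangenome
            counts.append(len(seen))   # unique gene count after this addition
        curves.append(counts)
    return curves


def summarize(curves):
    """Per-step minimum, maximum and mean across all permutations."""
    per_step = list(zip(*curves))
    minimum = [min(step) for step in per_step]
    maximum = [max(step) for step in per_step]
    average = [mean(step) for step in per_step]
    return minimum, maximum, average


# Toy example with three made-up genomes.
toy_genomes = [{"a", "b"}, {"b", "c"}, {"c", "d", "e"}]
toy_curves = accumulation_curves(toy_genomes, n_permutations=100, seed=1)
toy_min, toy_max, toy_mean = summarize(toy_curves)
```

In this sketch, `toy_min` and `toy_max` correspond to the range plotted in (A) and `toy_mean` to the curve in (B); the final point is the same for every permutation because all genomes have been added by then.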