Fig. 1

Diversity of prophages assessed by DNA similarity matrix analysis. a Overview dot plot of 243 prophage sequences arranged in groups, superclusters and clusters. The dot plot compares the sorted and merged prophage sequences (combined length of 9.5 megabases) on the x-axis with the same collection of sequences on the y-axis. Where the DNA residues of the two sequences match, a dot is drawn at the corresponding position. The plotted dots combine to form lines, and dense groups of lines form black squares that correspond to clusters of similar genomes. The larger the cluster, the higher the prevalence of prophages belonging to that cluster. The main diagonal represents the sequence’s alignment with itself; lines off the main diagonal but close to it represent similar patterns within closely related phage genomes. Signals farther from the diagonal represent similar patterns within more distantly related phage genomes. The two halves of the graph separated by the diagonal provide identical information, but both were kept for plot clarity. Enlarged sections are shown for phages assigned to supercluster SuMu-like (b), Mu- and B3-like (c), BcepMu-like (d), MHaA1- and HP1-like (e), and the lambda-like supercluster (f, g). Clusters (numbers 1–36) are highlighted with black frames. Superclusters and main groups are labeled and indicated by black or gray stripes. DNA coordinates for the merged sequence are given in the plot corners.
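
The dot-plot principle described in the caption can be illustrated with a minimal Python sketch. This is not the software used to produce the figure. The function name `dot_matrix`, the `word` parameter (the number of consecutive residues that must match before a dot is drawn) and the toy sequences are assumptions made for illustration only, and reverse-complement matches are not considered.

```python
def dot_matrix(seq_x, seq_y, word=4):
    """Build a 0/1 dot matrix for two sequences (illustrative sketch).

    Rows follow positions in seq_y, columns follow positions in seq_x.
    A cell is 1 when the `word`-long substrings starting at those two
    positions are identical.
    """
    n_x = max(len(seq_x) - word + 1, 0)
    n_y = max(len(seq_y) - word + 1, 0)

    # Index every word of seq_x by the positions at which it starts.
    positions = {}
    for i in range(n_x):
        positions.setdefault(seq_x[i:i + word], []).append(i)

    # Mark a dot wherever a word of seq_y also occurs in seq_x.
    matrix = [[0] * n_x for _ in range(n_y)]
    for j in range(n_y):
        for i in positions.get(seq_y[j:j + word], []):
            matrix[j][i] = 1
    return matrix


# Toy stand-ins for sorted prophage genomes (hypothetical data).
genomes = ["ACGTACGTGA", "ACGTACGTGC", "TTGGCCAATT"]
merged = "".join(genomes)  # sequences are merged before plotting

# The same merged sequence is placed on both axes, as in panel a.
self_plot = dot_matrix(merged, merged, word=4)
```

Comparing the merged sequence with itself always fills the main diagonal and yields a matrix that is symmetric about it, which is why the two halves of such a plot carry the same information. Runs of dots parallel to the diagonal correspond to stretches shared between different genomes in the merged sequence.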