Fig. 1: VgrG identification workflow and an unrooted phylogenetic tree of VgrGs for the demarcation of T6SS subtypes.

a The workflow for identifying valid VgrGs from 133,722 publicly available bacterial genomes. The 872 VgrGs available from the established T6SS database SecReT6 (red) and 472 putative Afp8 proteins containing VgrG domains, available from the eCIS database dbeCIS (green), were used as positive and negative datasets, respectively, to select the empirical criteria for large-scale VgrG screening. b The 872 VgrGs available from the SecReT6 database with predefined subtype information are indicated by colored stars (key). VgrGs from subtypes i4a and i5 were mixed within the same clade in the tree; these two subtypes were also found to be closely related in a previous study (ref. 27). The known type iii T6SS clade, derived mostly from Bacteroidetes, is highlighted with red shading.