Fig. 1: Highly represented Borg protein subfamilies.

(A) Number of proteins assigned to each highly multicopy protein subfamily in each of the 17 Borg genomes, based on the clustering reported by Schoelmerich et al. 2024. Each distinct Borg has a color-based name. (B) Maximum-likelihood phylogenetic tree showing intermixing of Borg proteins from four of these subfamilies, supporting their treatment as a single group (Group 1). A few sequences that had not been assigned to a subfamily cluster were assigned to subfamilies based on phylogeny. Notably, subfamily 196 is placed within a large clade of Methanoperedens sequences, suggesting that Borgs acquired these 10 sequences by recent lateral transfer from Methanoperedens. Representatives of all subfamilies have the expected structures and active site residues (including 0759). A few proteins were too poorly folded to allow confident analysis, but in all cases their sequences were placed phylogenetically within clusters of proteins with the expected structures.