Fig. 1

a Maximum likelihood phylogeny of the 12 almost complete reconstructed genomes of the SAR202 cluster from the Caspian, Aegean, and Ionian Seas and the MALASPINA datasets, together with the single-cell amplified genomes of the SAR202 cluster. The tree was built from a concatenation of 50 conserved proteins. Genomes of the class Dehalococcoidia were used to root the tree. The genomes reconstructed in this study are highlighted in red. The subcluster designation for genomes containing 16S rRNA is shown in blue (SC III and SC V). Bootstrap values (%) are indicated at the base of each node. The legend for lifestyle hints is at the top left. b Metagenomic recruitment of the almost complete reconstructed genomes, together with available single-cell genomes of the SAR202 cluster, in different deep datasets from brackish and marine environments. Brackish datasets include two aphotic depths of the Caspian Sea. Marine datasets include the Aegean, Ionian, and Marmara Sea deep datasets and the Puerto Rico Trench deep dataset, together with three of the deep MALASPINA datasets also used for assembly. The sampling depth of each dataset is given in parentheses.