Extended Data Fig. 7: k-mer-based host predictions for Chesapeake Bay viral populations assembled from shotgun metagenomics sequence data.
From: Interaction dynamics and virus–host range for estuarine actinophages captured by epicPCR

Half of all assembled viral populations > 5 kb had a significant top hit to a putative host in the host database (see Materials and Methods; upper panel left). Actinobacteria were overrepresented as putative hosts for Chesapeake Bay viral contigs relative to all Chesapeake Bay metagenome-assembled genomes (MAGs; upper panel right). If the composition of the predicted top hosts closely matched the composition of the host database, it would suggest that a genome's probability of being predicted as a host simply scales with its abundance in the database, potentially creating false-positive associations. However, MAGs are enriched among the top host predictions (upper panel middle) relative to their number in the database (lower panel left), and Actinobacteria are enriched among the MAGs predicted as hosts (upper panel right) relative to the composition of the MAG dataset (lower panel right). These enrichments argue against a database-abundance artefact and are consistent with substantial viral pressure on Actinobacteria populations in this environment.
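
The comparison between host-prediction composition and database composition can be illustrated with a simple enrichment calculation. The sketch below is not the authors' method (the caption does not state which test, if any, was used for this comparison); it assumes a one-sided hypergeometric test, and the function names and all counts are hypothetical placeholders, not values from the figure.

```python
from math import comb


def fold_enrichment(db_total, db_in_group, hits_total, hits_in_group):
    """Ratio of the group's share among predicted hosts to its share in the database."""
    return (hits_in_group / hits_total) / (db_in_group / db_total)


def hypergeom_enrichment_pvalue(db_total, db_in_group, hits_total, hits_in_group):
    """One-sided hypergeometric tail probability P(X >= hits_in_group).

    Null model (an assumption of this sketch): the hits_total predicted hosts
    are drawn without replacement from db_total database genomes, of which
    db_in_group belong to the group of interest, so predictions simply scale
    with database abundance.
    """
    max_k = min(db_in_group, hits_total)
    denominator = comb(db_total, hits_total)
    tail = sum(
        comb(db_in_group, k) * comb(db_total - db_in_group, hits_total - k)
        for k in range(hits_in_group, max_k + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only; they are NOT taken from the figure.
    db_total = 1000       # genomes in the host database
    db_in_group = 100     # of those, genomes in the group (e.g. Actinobacteria MAGs)
    hits_total = 50       # viral populations with a significant top host hit
    hits_in_group = 20    # of those, top hits that fall in the group

    print("fold enrichment:", fold_enrichment(db_total, db_in_group, hits_total, hits_in_group))
    print("p-value:", hypergeom_enrichment_pvalue(db_total, db_in_group, hits_total, hits_in_group))
```

Under this null model, a fold enrichment near 1 with a large p-value would correspond to the false-positive scenario described in the caption, whereas a fold enrichment well above 1 with a small p-value would indicate that the group is predicted as a host more often than database abundance alone would explain.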