Introduction

Intricate mosaics of inter-species interactions shape environmental microbiomes and their influence on ecosystem function1. In the marine environment, one of the most important inter-microbial relationships occurs among phytoplankton and bacteria2,3,4, whereby ecological partnerships played out at the microscale can govern ecosystem productivity and biogeochemistry5,6. Laboratory experiments have revealed that many phytoplankton-bacteria relationships are underpinned by exchanges of specific metabolites, resulting in mutualistic, commensal, or parasitic interactions7,8,9,10,11.

Chemical exchanges between phytoplankton and bacteria were initially thought to be based solely on the provision of phytoplankton-derived dissolved organic carbon (DOC) to bacterial associates, which could sometimes be reciprocated by the transfer of remineralized nutrients and vitamins back to the phytoplankton12. However, recent studies have started to uncover the tremendous complexity of these interactions and the diversity of chemical currencies exchanged in the establishment and maintenance of these partnerships8,9,13,14. Furthermore, many of these associations appear to be species-specific15,16,17, or even strain-specific8,18, as well as highly dynamic. Indeed, the nature of the interaction can shift (e.g., from mutualism to pathogenicity) according to environmental conditions19 or the growth stage of the organism20. These metabolite exchanges have been theorised to take place when bacteria are in close spatial proximity to a phytoplankton partner, within the phycosphere, which is the microenvironment directly surrounding phytoplankton cells where the concentration of exuded metabolites can be orders of magnitude higher than in the surrounding seawater6,21,22.

Mutualistic phytoplankton-bacteria partnerships have been identified in several model systems, whereby interlaced metabolisms suggest high levels of specificity and interdependency8,9,15,23. However, while some marine bacteria show a high degree of specificity towards particular phytoplankton-derived metabolites, others are less selective and use a wider diversity of phytoplankton-derived DOC24,25,26. This underscores the potential existence of distinct strategies among the diverse pool of marine bacteria that collectively obtain most of their carbon requirements from phytoplankton: specialists that establish close, and likely mutually beneficial, relationships with very specific phytoplankton hosts, and generalists that more indiscriminately exploit phytoplankton-derived DOC. An understanding of the nature and balance of these strategies is important because they potentially shape the diversity of the aquatic microbial communities that constitute the base of the marine food web.

To tackle these questions, we single-cell isolated 15 discrete phytoplankton strains, spanning 12 genera, from the marine environment and subsequently measured the dynamics of their changing microbiome for 400 days. Combining 16S rRNA amplicon sequencing and metagenomic analysis, we identified bacteria that established sustained associations with one or two (specialists) or multiple (generalists) phytoplankton and defined the genomic characteristics underpinning these different bacterial strategies for interacting with phytoplankton in the environment.

Results and discussion

Phytoplankton harbour unique microbiomes

Here we tracked the temporal dynamics of the bacterial assemblages associated with 15 newly isolated phytoplankton strains over the course of 400 days, and compared the genomic characteristics of the bacterial associates that established either specialist or generalist interactions with their hosts. Specifically, we single-cell isolated 12 diatoms (Cylindrotheca cf. closterium, Actinocyclus sp., Helicotheca cf. tamesis, Chaetoceros sp. (2 strains), Eucampia sp. (2 strains), Minutocellus sp., f_Cymatosiraceae, Leptocylindrus sp., Lauderia cf. annulata, and Minidiscus sp.) and 3 green algae (Nanochlorum sp. and Pycnococcus cf. provasolii (2 strains)) from plankton net drops in February and April 2019 at a long-term coastal oceanographic site (Port Hacking, located ~5 km from the central-eastern coastline of Australia)27. Phytoplankton were taxonomically classified using both microscopy and marker genes (i.e., full length 18S rRNA, LSU and ITS; Supplementary Data 1). We characterized the initial prokaryotic assemblages present within the bulk seawater that the phytoplankton were isolated from, and then followed the dynamics of the microbial communities associated with the 15 phytoplankton cultures through time. We first sampled the microbiome of the newly established cultures 7 days after their isolation. This time point was the earliest we could sample to gain taxonomic and genomic insights into native phycospheres, while also leaving enough biomass for continuing the timeseries. We then characterised changes in the bacterial communities after 10, 20, 40, 200 and 400 days, using a microvolume DNA extraction and sequencing pipeline28 (Supplementary Fig. 1).

Each of the 15 phytoplankton strains harboured a microbiome that became significantly different from the plankton haul seawater as early as 7 days post-isolation (PERMANOVA, p < 0.05; Fig. 1a, b, Supplementary Table 1). Between days 7 to 20, the phytoplankton collected from the same plankton haul (February 2019 and April 2019) shared similar microbial communities, but these communities diverged significantly after 20 days (Supplementary Fig. 2). Concomitantly, the average richness of the bacterial communities was not statistically different during the first 20 days after isolation (with 110 ± 57.61 amplicon sequence variants (ASVs) on average per strain), but decreased significantly (39%) between days 20 and 40, to reach an average of only 48.07 ± 26.44 ASVs per phytoplankton strain by day 400 (Mann-Whitney (MW), p < 0.05; Bonferroni corrected; Fig. 1c, Supplementary Table 2). This pattern is indicative of a streamlining of the phytoplankton-associated bacterial communities towards a specialized assemblage within 20 days of isolation from the environment. The decrease in community richness was accompanied by a 48% decrease in average community dissimilarity calculated between the later time-points (days 40–200 vs days 200–400) pointing towards a progressive stabilisation of the bacterial community composition after 200 days (MW, p < 0.05; Bonferroni corrected; Fig. 1d, Supplementary Table 3). During this process, the relative abundance of Rhodobacterales and Flavobacteriales increased by 5-fold compared to the initial seawater communities (Supplementary Figs. 3, 4). The increased relative abundance of ASVs within these groups is notable given that members of the Rhodobacterales and Flavobacteriales are regularly among the most abundant bacteria in both laboratory microalgal cultures15,29 and natural phytoplankton blooms30,31,32. 
Importantly, each phytoplankton isolate (even those belonging to the same genus) harboured a unique bacterial community after 400 days (PERMANOVA, p < 0.05; Supplementary Table 1). These results show that the most significant alterations to the phytoplankton microbiome occurred within 20–40 days following isolation and that unique communities stabilized after 200 days. These results contrast with a previous report of the temporal stability of bacterial communities associated with diatoms33. This disparity, and the loss of richness observed after 20 days in our study, were likely driven by the loss of transient (i.e., non-phytoplankton associated) bacteria, which were co-isolated with our phytoplankton either due to random proximity at the time of sampling, or infrequent interactions. While the impact of bottle effects cannot be disregarded in experiments of this type, we believe that the clear divergence of the bacterial communities associated with different phytoplankton hosts points towards an ecological structuring driven by phytoplankton-bacteria interactions, rather than any laboratory artefact.
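
As an illustration of the two community metrics used above, the following minimal Python sketch computes ASV richness and Bray-Curtis dissimilarity from raw counts. The function names and the example counts are hypothetical, and the sketch assumes unnormalized count data; it is not the authors' pipeline, which may have rarefied or otherwise transformed the data first.

```python
def richness(counts):
    """Number of ASVs with a non-zero count in one sample."""
    return sum(1 for c in counts.values() if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as {asv_id: count}."""
    asvs = set(a) | set(b)
    # Sum of the lesser count for every ASV (zero when an ASV is missing).
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in asvs)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - 2.0 * shared / total


# Hypothetical counts for one culture at two time points (not data from the study).
day20 = {"ASV_1": 40, "ASV_2": 30, "ASV_3": 20, "ASV_4": 10}
day40 = {"ASV_1": 70, "ASV_2": 30}

print(richness(day20), richness(day40))
print(bray_curtis(day20, day40))
```

A dissimilarity of 0 means the two samples have identical counts, and 1 means they share no ASVs; a fall in dissimilarity between consecutive time points is what the text interprets as stabilisation.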

Fig. 1: Taxonomic composition of phytoplankton-associated bacterial communities through time.

Non-metric multidimensional scaling of bacterial communities associated with phytoplankton isolated from the (a) February plankton haul and (b) April plankton haul (arrows indicate the trajectory of the microbiome through time (days)). The size and shading of the dots indicate the different time points with the smallest/lightest dots indicative of the early time point (day 7) and the largest/darkest dots indicative of the latest time point (day 400). c Average richness of the phytoplankton microbiome at the different timepoints (time 0 indicates the seawater richness). The lowercase letters on the boxplot indicate time points that are significantly different (MW, p < 0.05, Bonferroni-corrected, n = 45 for all time points except t = 0 for which n = 8). d Boxplot of the average Bray-Curtis dissimilarity of the microbial communities between time points. Asterisks indicate the significance of the pairwise dissimilarity (MW, ****p < 0.001, **p < 0.05, Bonferroni-corrected, n = 90 for all the time point comparisons except 0_7 for which n = 54). The box plots represent the first quartile, median, third quartile, and minimum and maximum values (i.e., whiskers). Source data are provided as a Source Data file.

Identification of phytoplankton specialists and generalists

Despite the lack of community level taxonomic congruency between the phytoplankton-associated bacteria (Supplementary Fig. 3), some specific bacterial ASVs were identified in multiple phytoplankton isolates. To clearly define the taxonomy and abundance of these shared ASVs, we separated the phytoplankton-associated bacteria into multiple categories based on the specificity of their relationships and ability to sustain long-term association with a specific phytoplankton host. Given the stabilisation of the phytoplankton-associated bacteria after 200 days, we identified the ASVs present on days 200 and 400 (in at least 4 of the 6 replicates) as bacteria sustaining long-term association with phytoplankton hosts, which we classified as either: (i) specialist phytoplankton associates (hereafter called specialists) if they were present in only 1 or 2 of the 15 phytoplankton isolates (with a relative abundance (RA) > 0.001); or (ii) generalist phytoplankton associates (hereafter called generalists) if they were present in 3 or more phytoplankton isolates (RA > 0.001). The cut-off for these classifications was chosen based on the distribution of unique ASVs in the different phytoplankton cultures after 200 days, whereby 98.4% of the ASVs considered were present in only 1–2 phytoplankton (specialists). Conversely, only 1.6% of the ASVs were shared by 3 or more phytoplankton (generalists; Supplementary Fig. 5). 
The remaining ASVs were identified as bacteria that are not involved in sustained associations with phytoplankton and were classified as: (iii) transient associates (hereafter called transients), if they were only present in the earlier time points and either completely absent, or present in only 1–2 of the 6 replicates of days 200 and 400, or (iv) “initial seawater”, which included the ASVs that were only present in the bulk samples from the plankton tow (t0), but were subsequently absent in all of the phytoplankton microbiomes from day 7 onwards (Supplementary Data 2).
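
The classification rules above can be sketched as a small decision function. This is a minimal Python illustration under stated assumptions: the input layout (six late-replicate relative abundances per host), the function name, and the use of the RA > 0.001 threshold to define presence in every category are all hypothetical choices, and ASVs that fall between the stated cut-offs (for example, present in exactly 3 of 6 late replicates) are returned as unclassified because the text does not cover that case.

```python
RA_MIN = 0.001             # relative-abundance threshold quoted in the text
MIN_LATE_REPS = 4          # "at least 4 of the 6 replicates" on days 200 and 400
GENERALIST_MIN_HOSTS = 3   # 3 or more hosts = generalist; 1-2 hosts = specialist


def classify_asv(late_ra_by_host, in_early_cultures, in_seawater):
    """Assign one ASV to a category.

    late_ra_by_host: {host: list of RA values in the day-200/400 replicates}
    in_early_cultures: True if detected in any culture on days 7-40
    in_seawater: True if detected in the initial plankton-haul sample (t0)
    """
    # Count, per host, the late replicates in which the ASV passes the threshold.
    late_counts = {
        host: sum(1 for ra in values if ra > RA_MIN)
        for host, values in late_ra_by_host.items()
    }
    sustained = [h for h, n in late_counts.items() if n >= MIN_LATE_REPS]
    if len(sustained) >= GENERALIST_MIN_HOSTS:
        return "generalist"
    if sustained:
        return "specialist"
    max_late = max(late_counts.values(), default=0)
    if in_early_cultures and max_late <= 2:
        return "transient"
    if in_seawater and not in_early_cultures and max_late == 0:
        return "initial seawater"
    return "unclassified"  # cases the stated rules do not cover


# Hypothetical examples (host labels and abundances are made up).
present = [0.01] * 6
print(classify_asv({"A": present, "B": present, "C": present}, True, True))
print(classify_asv({"A": [0.01, 0.02, 0.01, 0.03, 0.0, 0.0]}, True, False))
print(classify_asv({"A": [0.01, 0.0, 0.0, 0.0, 0.0, 0.0]}, True, True))
print(classify_asv({}, False, True))
```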

In line with the measurements of bacterial richness, more than half of the total recovered ASVs (1657 ASVs; 55% of the total) were classified as transients, while only 370 ASVs (8.1% of the total) were classified as initial seawater bacteria. The transients made up 44% of the community relative abundance in the initial plankton haul samples, but steadily declined over time to only 4.2% of the averaged community after 400 days (Mann-Whitney (MW) test, p < 0.05; Bonferroni corrected; Fig. 2a, b, Supplementary Data 3). This transient community mostly consisted of ASVs belonging to Enterobacterales (52%), Pseudomonadales (18%) and Flavobacteriales (11%) (Supplementary Data 2).

Fig. 2: The relative abundance, dynamics and taxonomy of the phytoplankton specialists, generalists, and transients over time.

a Total relative abundance of all generalist, specialist, and transient ASVs over time. The R and p values for each of the three linear regression lines were calculated from all 15 algal isolates and replicates over time. b Average relative abundance of generalists, specialists and transients over time based on the 15 phytoplankton isolates and the seawater samples. Only ASVs that accounted for >0.1% relative abundance of at least one algal isolate at any given time were used for the analysis (Initial SW = initial seawater). c Taxonomic composition of the generalists, specialists and transients at the Class level. Numbers of ASVs for each Class are indicated in the pie charts. Others* within the transient class include: Acidimicrobiia (4), Bacteriovoracia (2), Graciibacteria (1), Lentisphaeria (3), Oligoflexia (1). Source data are provided as a Source Data file.

In contrast to the decline in abundance of transients over time, the total relative abundance of the generalists increased steadily over the 400 days of culture (2.13% increase; Fig. 2a, b, Supplementary Data 3). However, only 18 ASVs (0.7%) were identified as generalists across the 15 phytoplankton strains, with the majority of these (10 ASVs) belonging to the Enterobacterales, specifically the Vibrionaceae (7 ASVs) and Alteromonadaceae (3 ASVs) families, followed by the Rhodobacterales, specifically the Rhodobacteraceae family (4 ASVs, RA > 0.1%; Fig. 2c, Supplementary Data 2). One of the generalist ASVs (genus: Pseudoalteromonas) was shared among 6 of the phytoplankton strains, which spanned 4 different orders, while most other generalists were shared between three to four different phytoplankton strains.

The abundance of specialists increased by 10-fold over time (i.e., from 7.6% in the plankton haul to 76.8% of the phytoplankton-associated communities by day 400) (MW, p < 0.05; Bonferroni corrected, Fig. 2b, Supplementary Data 3). A total of 947 specialist ASVs were identified, of which the majority were members of the Rhodobacterales and Enterobacterales orders, specifically Rhodobacteraceae (100 ASVs), Vibrionaceae (149 ASVs) and Alteromonadaceae (162 ASVs) (Fig. 2c, Supplementary Data 2). The observed taxonomic composition of specialists and generalists is consistent with previous studies, whereby members of the Rhodobacteraceae and Alteromonadaceae are the most frequently reported bacterial families co-occurring and interacting with phytoplankton26,34,35, reiterating the importance of these taxa as key phytoplankton associates. Interestingly, a small portion of the specialists (~2%; 18 ASVs) were classified as Archaea, and were associated with 6 of the isolated diatoms, which aligns with results from environmental studies suggesting ecological interactions between phytoplankton and archaea30,36,37. While we cannot exclude the possibility of inter-bacterial interactions driving the structuring of the microbiome, the clear enrichment of phytoplankton-associated specialists over time points towards the important role of bacterial interactions with their phytoplankton hosts. Moreover, the results observed here support previous hypotheses that phytoplankton-associated bacteria may purposefully colonize the phycospheres of their specific phytoplankton partners4,6,15,26,38.

Genomic characteristics of generalist, specialist, and transient reference genomes

To explore the functional traits discriminating the specialists, generalists and transients, we assembled bacterial genomes from metagenomic sequencing of a subset of the phytoplankton strains in our experiment (Helicotheca cf. tamesis, Actinocyclus sp., Pycnococcus cf. provasolii, Cylindrotheca cf. closterium), as well as from whole-genome sequencing of bacterial isolates cultivated from the phytoplankton cultures. Bacterial genomes were assigned to either generalist, specialist, or transient categories when their 16S rRNA gene displayed a 100% match to an ASV from one of these three categories. To minimize the probability of functional capability bias due to variable genome completion, we focussed our analysis solely on genomes with a completion > 90% and redundancy <5%39. This approach resulted in 33 high-quality genomes, of which 9 were generalists, 14 were specialists, and 10 were transients (Supplementary Data 4), with most of the bacteria belonging to four main families: Rhodobacteraceae (7), Flavobacteriaceae (3), Alteromonadaceae (8) and Vibrionaceae (3) (Fig. 3a).
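
The genome selection step described above amounts to a threshold filter followed by a tally per category. The following minimal Python sketch illustrates it; the `Genome` record, its field names, and the example genomes are hypothetical, and the completion and redundancy values are assumed to have been estimated beforehand by a separate tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Genome:
    name: str
    completion: float            # percent
    redundancy: float            # percent
    category: Optional[str]      # category of the ASV with a 100% 16S match, if any


def select_genomes(genomes, min_completion=90.0, max_redundancy=5.0):
    """Keep genomes above the quality thresholds that also have an exact ASV match."""
    return [
        g for g in genomes
        if g.completion > min_completion
        and g.redundancy < max_redundancy
        and g.category is not None
    ]


# Hypothetical genomes (names and values are made up for illustration).
genomes = [
    Genome("bin_01", 98.2, 1.1, "generalist"),
    Genome("bin_02", 93.5, 4.0, "specialist"),
    Genome("bin_03", 88.0, 2.0, "specialist"),   # fails the completion threshold
    Genome("bin_04", 95.0, 7.5, "transient"),    # fails the redundancy threshold
    Genome("bin_05", 96.0, 0.5, None),           # no exact ASV match
    Genome("bin_06", 91.0, 3.0, "transient"),
]

kept = select_genomes(genomes)
counts = Counter(g.category for g in kept)
print([g.name for g in kept], dict(counts))
```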

Fig. 3: Generalists and specialists are enriched with functional traits involved in host interactions.

a Taxonomic composition of the assembled genomes used for this analysis aggregated at the Order level. b Percentage of MAGs encoding genes related to functional traits key for phytoplankton-bacteria interactions. Functional traits in bold were significantly different between the three categories (quorum sensing (QS), catabolism of phytoplankton-produced compounds (phyto compounds), energy metabolisms (Energy)). Significant differences were determined with a Fisher’s exact test, and the p values were FDR-corrected. Asterisks indicate which category was significantly enriched compared to transients. The bar plots represent the mean and the standard error (i.e., whiskers). c Pathways related to motility, vitamin biosynthesis and transport, transporters and the metabolisms of phytoplankton compounds were enriched in generalists and specialists compared to transients (p* = putative). Significant differences were determined with a Fisher’s exact test, and the p values were FDR-corrected. Source data are provided as a Source Data file.

Bacteria adapted to thrive in a nutrient-rich environment are usually characterized by high metabolic plasticity, so we hypothesized that phytoplankton associates would display similar genomic characteristics. Indeed, the genomes of the generalist and specialist phytoplankton associates were larger than those of transient organisms. Specifically, the genome size of generalists was on average 18.2% larger than that of specialists (Supplementary Fig. 6) and 34.2% larger than that of transients (MW p < 0.05; Bonferroni corrected, Supplementary Fig. 6, Supplementary Table 4). These results suggest that generalists encode a more extensive genomic repertoire compared to bacteria belonging to the other categories, which is consistent with the potential to benefit from interactions with several different phytoplankton partners.

Members of the specialist and generalist categories also had a faster estimated doubling time, ~5 h and 7 h respectively40 compared to transients (>8 h; Supplementary Fig. 6). Large genome size and relatively fast doubling times are traits typically associated with copiotrophic bacteria41,42, which is also consistent with the predominance of copiotrophs within microenvironments with a higher concentration of dissolved organic matter (DOM), such as the phycosphere43.

Phytoplankton-associated bacteria have distinct functional traits compared to transients

While phylogeny was responsible for some of the functional distinction between reassembled genomes (Supplementary Fig. 7), we found that the genomes of phytoplankton-associates (i.e., generalists and specialists) were functionally distinct from transient bacteria based on the presence/absence of several KOs (PERMANOVA F = 2.93, P-adj < 0.05). Several functional pathways relevant to the interaction between phytoplankton and bacteria26,44, such as quorum sensing, cell attachment, catabolism of phytoplankton compounds, motility, chemotaxis and different suites of transporters, were all highly enriched in both of the phytoplankton-associates (generalist and specialist) compared to members of the transient category (Fisher’s Exact test, adjusted p < 0.05, FDR adjusted, Fig. 3b, Supplementary Table 5).
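
To make the enrichment testing concrete, the sketch below implements a two-sided Fisher's exact test for a 2x2 presence/absence table and a Benjamini-Hochberg correction in plain Python. It is an illustration only: the study compared three categories and does not state which software or exact FDR procedure was used, so the 2x2 simplification, the Benjamini-Hochberg choice, the function names, and the example counts are all assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "trait present" genomes in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical table: trait present/absent in associates (row 1) and transients (row 2).
p_trait = fisher_exact_two_sided(20, 3, 3, 7)
print(p_trait)
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

In practice one p-value would be computed per functional trait or KO, and the whole list adjusted together before applying the 0.05 threshold.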

Compared to transients, phytoplankton associates were both significantly enriched in pathways related to the transport and biosynthesis of several B vitamins (Fisher’s Exact test, FDR adjusted p < 0.05; Supplementary Table 5). B-vitamins are important micronutrients and are essential co-factors for key enzymatic reactions. While we cannot confirm vitamin deficiencies for our phytoplankton isolates, it is well-established that many phytoplankton species are unable to synthesize certain vitamins de novo (e.g., B1, B7, B12)45 and instead rely on the biosynthetic capabilities of bacterial associates46. Hence vitamins have been identified as one of the crucial chemical currencies in phytoplankton-bacteria interactions9,47,48,49. Indeed, 90% of the generalists and 85% of specialists had the genomic capacity for cobalamin (B12) biosynthesis (Fisher’s exact test, FDR adjusted p < 0.05, Supplementary Data 5, 6), while only 30% of the transients encoded B12 biosynthesis genes. In addition, several generalists and specialists encoded genes for B1 transporters (44%), while generalists encoded genes for B7 transporters (33%). In comparison, none of the transients encoded genes for B1 or B7 transporters.

The genomes of specialists and generalists also encoded a significantly larger range of transporters compared to transients (Fisher’s Exact test, FDR adjusted p < 0.05; Fig. 3b, Supplementary Data 5, 6). For example, 57% of specialist and 55% of generalist genomes harboured pathways for general amino-acid transporters, the glucose mannose transport system and glycerol transporters (47% specialist, 33% generalist) (Fig. 3c), while these transporters were present in only 10% of transient genomes, or in the case of glycerol transporters, were completely absent (Fisher’s exact test, FDR adjusted p < 0.05, Fig. 3c, Supplementary Data 5, 6). Given that amino acids and these carbohydrates are common phytoplankton metabolites that are abundant in the phycosphere50, the higher presence of transporters for organic compounds is hence consistent with the potential ability of generalists and specialists to metabolise organic compounds exuded by phytoplankton50,51.

Generalists and specialists were enriched in genes involved in chemotaxis compared to transients (Fisher’s Exact test, FDR adjusted p < 0.05), with 58% of the generalists and 45% of the specialists exhibiting the genomic potential for chemotaxis compared to only 30% of the transients. Chemotaxis has long been predicted, and more recently experimentally demonstrated, to play a critical role in establishing and sustaining phytoplankton-bacteria interactions22,52,53.

Half of the specialist genomes encoded genes for the production of homoserine-lactone, compared to 33% of generalists and just 20% of the transients (Fisher’s Exact test, FDR adjusted p < 0.05). Homoserine-lactones are a class of signalling molecules involved in quorum sensing (QS), which can play a role in cell attachment31,38,54. Notably, acyl-homoserine lactones (AHLs) have been recently linked to a decrease in bacterial motility and enhancement of attachment to diatoms38. Moreover, 64% of the specialists and 55% of the generalists encoded genes for adhesin and Tad operon related genes, compared to 10% of the transients (Fisher’s exact test, FDR adjusted p < 0.05, Fig. 3c). Elevated levels of homoserine lactone gene clusters (Supplementary Data 5, 6), as well as the adhesin and Tad operons, indicate that specialists and generalists may be more likely to engage in prolonged interaction with their phytoplankton hosts, facilitated either by direct attachment or by remaining in close proximity to their host.

Phytoplankton release a myriad of metabolites that can be utilised by marine bacteria, and accordingly both generalists and specialists were enriched in a suite of genes related to the catabolism of specific phytoplankton-derived compounds. For example, generalists and specialists were enriched in transporters for putrescine (66% specialists, 22% generalists) and taurine, which are phytoplankton metabolites that are abundant in the phycosphere and are utilized by phytoplankton-associated bacteria as a source of energy55. Moreover, 64% of the specialists encoded hpsN, a key gene involved in the metabolism of the sulfur-containing osmolyte 2,3-dihydroxypropane-1-sulfonate (DHPS), compared to only 10% of transients (Fisher’s exact test, FDR adjusted p < 0.05) and 33% of the generalists. DHPS is an abundant sulfur-containing metabolite released by diatoms56, and the catabolic pathway for its utilization underpins a highly sophisticated interaction with specialized bacteria9 (often from the Roseobacter clade57). Genes related to the catabolism of DMSP (dmdA, dmdB), another important phytoplankton-produced sulfur metabolite, were also present in more than 50% of the specialists compared to only 30% of the generalists and 10% of transients (Fisher’s exact test, FDR adjusted p < 0.05, Fig. 3c, Supplementary Data 5, 6). Cumulatively, these results suggest that phytoplankton associates, and more specifically specialists, can use very specific phytoplankton-derived substrates compared to transients.

Almost half of the specialists (42%) identified in this study harboured a type VI secretion system (T6SS), compared to just 20% of generalists and 10% of transients. Symbionts and pathogens utilize secretion systems, such as the type III, type IV, and type VI (T3SS, T4SS, and T6SS, respectively), to transport DNA, proteins, or even DNA-protein complexes directly from the bacterial cell into a target cell58,59,60. Importantly, both the effector T4SS and the T6SS can translocate molecules or DNA into either a target bacterial or eukaryotic cell, making them important mechanisms of cross-kingdom interactions in aquatic environments. T6SS is usually more abundant in particle-associated bacteria60 and while several studies on T6SS focus on its role in pathogenicity61, it is also used by symbionts to eliminate competitors62,63, or interact with their hosts64. While the specific role of T6SS within the phycosphere is not yet known, given its role in other symbiotic interactions, we postulate that phytoplankton specialists may utilize T6SS in two ways: to selectively outcompete other bacteria and/or to enhance chemical exchanges with their phytoplankton hosts.

Finally, relative to transients, the genomes of both generalists and specialists were enriched in biosynthetic gene clusters (BGCs) related to the production of secondary metabolites, such as NRPS (non-ribosomal peptide synthetase), betalactone, RiPP-like metabolites (ribosomally synthesized and post-translationally modified peptides) and terpene (Supplementary Data 7). Most of the generalists (88%) and specialists (78%) encoded RiPP clusters, compared to 50% of transients. Moreover, the genomes of 50% of the specialists and 44% of the generalists encoded terpene biosynthetic clusters compared to just 20% of transients. Terpenes are a large group of molecules that include several phytohormones65,66, known to enhance the growth of phytoplankton8,67.

Generalists differ from specialists, pointing to different associations with phytoplankton

While phytoplankton associates exhibited clear differences in genomic characteristics relative to transients, our analysis also identified subtle differences in the genomic repertoire of generalists compared to specialists. For instance, we identified that motility genes, more specifically genes required for the biosynthesis of the flagella (e.g., flgA, flgL and fliL), were significantly enriched in generalists compared to specialists, with 89% of the generalists being putatively motile (and also chemotactic), compared to only 37% of the specialists (Fisher’s Exact test, FDR adjusted p < 0.05; Fig. 3b, c, Supplementary Tables 5, 6). Motility, together with chemotaxis, is fundamental in establishing and sustaining phytoplankton-bacteria interactions22,26,53. Our results indicate that generalists have greater capacity than specialists to use motility and chemotaxis to navigate the chemical gradients associated with the phycosphere26,43. Specialists, in some cases, may maintain their associations with phytoplankton partners through cell attachment rather than motility.

Generalist genomes were also enriched in iron-related transporter pathways compared to specialists, and particularly genes related to the production of siderophores, with 55% of generalists encoding genes for the synthesis of siderophores compared to just 7% (1/14) of specialists (Fisher’s Exact test, FDR adjusted p < 0.05; Supplementary Table 6, Supplementary Data 8). While the reason that generalists displayed a greater capacity for producing siderophores than specialists is not immediately clear, iron is a major limiting factor controlling phytoplankton productivity and there is evidence that bacterial-produced siderophores can enhance the bioavailability of iron and hence enhance phytoplankton growth13.

Generalists also encoded the largest number of secondary metabolite clusters, with 23 clusters present in one or more genomes compared to 13 clusters for specialists (Supplementary Data 7). The most enriched BGCs in generalists were related to the production of T1PKS (polyketides) (55% of the genomes) and non-ribosomal peptides (NRPS) (66%), while none of the specialist genomes exhibited gene clusters for NRPS and only 21% had gene clusters for T1PKS. These secondary metabolite gene clusters are known to mediate antimicrobial activities68. The ability to produce and secrete antimicrobial compounds could help the generalists outcompete other bacteria in the arms race for common substrates within the phycosphere.

In summary, the phytoplankton associates (specialists and generalists) exhibit specific genomic potential that facilitates long-term interaction with phytoplankton, such as the ability to respond to chemical gradients (chemotaxis), to produce important chemical currencies (e.g., B-vitamins), to release plant growth-promoting hormones (e.g., terpene BGCs), and to attach to other cells. In addition, specialists are able to catabolise specific phytoplankton compounds (e.g., DHPS, DMSP) that differentiate them from other marine bacteria not associated with phytoplankton. Specialists and generalists also differ from each other. Specifically, generalists can interact more promiscuously with several hosts, thanks to the ability to move more freely between hosts (motility), to chelate iron and make it available to their hosts (siderophores), and to outcompete other bacterial cells through the release of antimicrobial compounds.

Generalists and specialists co-occur with their phytoplankton hosts in the environment

To determine the environmental relevance of the patterns identified in our laboratory cultures, we quantified the levels of co-occurrence between specialists, generalists and transients and their respective phytoplankton partners in the ocean. To achieve this, we used shotgun metagenomic data from three long-term oceanographic timeseries sites, which are part of the Australian National Reference Station network (NRS). The sites included Maria Island (MAI), Port Hacking (PHB), and North Stradbroke Island (NSI), which span the eastern coastline of Australia across 15 degrees of latitude27 (Fig. 4c). Samples included in our analysis were collected on a monthly basis over 5.5 years (2015–2020)27. We focused our analysis on the environmental distribution of two phytoplankton, Cylindrotheca cf. closterium and Pycnococcus cf. provasolii, based on the high quality of their MAGs (C > 70%, R < 10%) and investigated the levels of their environmental co-occurrence with their generalist, specialist and transient bacterial associate genomes (Fig. 4a, b). The other two phytoplankton MAGs displayed low completion (<50%) and high redundancy (>30%), making them unsuitable for this approach. As controls, we included four MAGs that matched (100%) bacterial ASVs belonging to the initial seawater bacteria.

Fig. 4: Co-occurrence of generalists, specialists and transients with their phytoplankton host in the environment.

Correlation heatmaps of the co-occurrence of (a) Cylindrotheca cf. closterium and (b) Pycnococcus cf. provasolii, as well as their specialists, generalists, transients, and initial seawater bacteria (SW) MAGs. Negative correlations were left blank. Stars indicate the statistical significance of the correlation (*p < 0.05, **p < 0.001, ***p < 0.0001). Correlations and their statistical significance were calculated in R using the Hmisc package. c Map of Australia with the location of the National Reference Stations used in this study (NSI North Stradbroke Island, PHB Port Hacking, MAI Maria Island). d Bar plots of the relative abundance of Cylindrotheca cf. closterium (black bar) at MAI between 2016 and 2020 and the relative abundance of its generalist and specialists. Source data are provided as a Source Data file.

The relative abundance of the two targeted phytoplankton in the examined environments was generally low, with the average relative abundance (RA) of Cylindrotheca and Pycnococcus around 0.01% of the total community. The highest relative abundance of both species was recorded at MAI, reaching 0.02% for Cylindrotheca and 0.2% for Pycnococcus. Like the phytoplankton, neither the specialists nor the generalists were abundant in seawater (0.004–0.05% RA), but their relative abundance was generally positively correlated with that of their phytoplankton host (Fig. 4, Supplementary Data 9). This was particularly evident at MAI, where all generalist and specialist associates of Cylindrotheca (1 generalist, 7 specialists) and Pycnococcus (1 generalist, 2 specialists) were significantly correlated with the relative abundance of their hosts (Spearman’s correlation, p < 0.05, Fig. 4a, b, Supplementary Data 9). Only two of the transient bacteria associated with Cylindrotheca were also positively correlated with its relative abundance, while the control MAGs, which exhibited higher relative abundances than the specialists and generalists (up to 5% of the total community), were not correlated with the presence of the phytoplankton (except for a single case of a Pseudomonadales at PHB). Despite their sparsity in the environment, tight correlations between generalist and specialist phytoplankton associates and their phytoplankton hosts suggest that these interactions are not a culture artefact but are indeed ecologically relevant, further supporting the conclusions of our laboratory-based culture study.

Final remarks

Here we characterised the bacterial communities associated with 15 phytoplankton to identify ecological strategies among phytoplankton-associated bacteria. We found that phytoplankton microbiomes rapidly change after isolation before progressively stabilising after 200 days, when they became mostly composed of members of the Rhodobacterales, Pseudomonadales and Flavobacteriales. We identified a subset of bacteria that establish long-term associations with a specific phytoplankton host (specialists), while others associated with multiple phytoplankton hosts (generalists). These phytoplankton-associated bacteria were characterized by relatively fast growth rates, allowing them to swiftly respond to changes in their surroundings, such as newly released phytoplankton compounds. We identified genomic features of phytoplankton-associated bacteria that differed from those of non-phytoplankton-associated bacteria, including an enhanced ability to produce vitamins, a greater reliance on carbon compounds produced by specific phytoplankton hosts (e.g., DHPS, DMSP, putrescine) and the potential ability to directly attach to host cells. Moreover, we identified subtle differences between generalists and specialists that likely reflect their different lifestyles. Indeed, generalists may be able to use motility to shift between phytoplankton partners as new hosts are encountered, which may be particularly relevant during natural phytoplankton succession and bloom events69,70. These generalists potentially maintain versatile, or promiscuous, interactions with phytoplankton through the exchange of important metabolites (e.g., production of multiple vitamins, secondary metabolites, and siderophores), mechanisms to outcompete other bacteria (e.g., antimicrobial production), and the ability to move between phycospheres (motility and chemotaxis).
Finally, while correlation does not mean causation, it is notable that the phytoplankton strains co-occurred with the generalist and specialist bacterial associates identified in our experiment within oceanographic time-series data from the isolation site, pointing towards wider ecological relevance of our findings. Taken together, our results suggest that the genomic specificities and differences between generalists and specialists likely govern the assembly of phytoplankton-associated bacterial communities and highlight the different ecological strategies used by bacteria that interact closely with phytoplankton in the ocean.

Methods

Algal isolation

Two consecutive vertical plankton net drops, to a maximum of 30 m deep, were performed at an oceanographic time-series site (Port Hacking) located ~5 km off the eastern coastline of Australia (34°07'06"S 151°13'09"E) on 29 February and 9 April 2019, using a 20 µm mesh net equipped with a 180 mL plastic collection tub at its end. The collected plankton was diluted using two successive 50:50 dilutions into sterile seawater prior to phytoplankton isolations. Single phytoplankton cells and their associated bacteria were then isolated from the diluted haul using elongated glass capillary pipettes. A total of 15 monoclonal phytoplankton cultures, with their associated bacterial consortia, were established in 500 µL of sterile seawater (from Port Hacking), which was then diluted with sterile f/2 media in a sterile 24-well plate (i.e., f/4 media). Our goal was to begin to measure the microbiome of each phytoplankton as rapidly as practical after sampling, to gain insight into the native phycosphere and characterise how these communities evolve and potentially stabilise after isolation. The first sampling point after seven days was chosen based on preliminary tests that determined when biomass was sufficient to sub-sample for genomic analyses, while still retaining enough culture to continue with the time series. At day seven, we collected three 110 µL subsamples from each culture for subsequent DNA extraction and characterisation of the microbial community29. On the same day (day 7), the remaining algal culture was diluted back to 1 mL using f/2 media and split into triplicate flasks with an additional 5 mL of media. On day 10, the cultures were subsampled for subsequent DNA extraction, and on day 15 another 10 mL of f/2 media was added to each replicate algal flask. On day 20, the flasks were sub-sampled for DNA extraction again.
Algal cultures were then sub-cultured routinely every 2 weeks by transferring 1 mL into 40 mL of sterile algal media, and cultures were always sub-cultured 5 days before subsequent subsampling of DNA (on days 40, 200, and 400).

Phytoplankton identification: DNA extraction, PCR and phylogenetic analysis

Phytoplankton were taxonomically classified through the amplification and sequencing of several marker genes. Briefly, phytoplankton cells were pelleted, and DNA was extracted using the PowerWater Kit (Qiagen) following the manufacturer’s recommendation. The entire 18S rRNA gene was amplified using the eukaryotic primers 18Smoon-F (5’-ACCTGGTTGATCCTGCCAG-3’) and 18Smoon-R (5’-TGATCCTTCYGCAGGTTCAC-3’) as previously described71,72. The LSU gene was amplified using the D1Rf (5’-ACCCGCTGAATTTAAGCATA-3’) and D3Br (5’-TCGGAGGGAACAGCTACTA-3’) primer set73. For three phytoplankton isolates for which the full 18S rRNA amplification did not produce usable results, we also amplified the ITS sequence using the primer set ITS-F (5’-CSMACAACGATGAAGRRCRCAGC-3’) and ITS-R (5’-TCCCDSTTCRBTCGCCVTTACT-3’) as previously described74. The resulting PCR products were Sanger sequenced at the Australian Genome Research Facility (AGRF). The BLASTn tool was then used to compare the resulting sequences to the National Center for Biotechnology Information (NCBI) database and assign taxonomy to the isolated phytoplankton (Supplementary Data 1) based on the percentage of similarity to the deposited sequences.

Microbiome tracking algal isolates

After the end of the experiment, the DNA in all microbiome samples from the 15 phytoplankton isolates was extracted using a low input physical lysis method28. Replicate samples from the February plankton haul (n = 6) and from the April plankton haul and April bulk seawater (n = 6) were also extracted. Replicate DNA extraction blanks (n = 3) were extracted together with each batch of algal samples. The extracted DNA was stored at –80 °C prior to sequencing.

Amplicon sequencing

To characterize bacterial community composition associated with the 15 phytoplankton isolates through time, the 16S rRNA V1-V3 region was amplified using Illumina barcoded primers 27 forward (5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-AGAGTTTGATCMTGGCTCAG-3’), and 519 reverse (5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-GWATTACCGCGGCKGCTG-3’). The PCR used crosslinked MilliQ water, Velocity DNA polymerase and buffer (Bioline, as per manufacturer’s instructions), as well as dNTPs and 1 µL BSA added per sample to decrease the adhesion of DNA to the crosslinked PCR tubes. A larger volume of DNA was added to the PCR reaction for the earlier time points of the timeseries, whereby 4 µL of DNA input was used for days 7, 10, and 20 and 2 µL of DNA input was used for blanks and days 40, 200, and 400. PCRs were performed using the following thermocycling conditions: 98 °C for 2 min, followed by 3 cycles of [98 °C for 30 s, 46 °C for 30 s, 72 °C for 30 s], 3 cycles where the annealing temperature increased to 48 °C, and 24 cycles where the annealing temperature increased to 50 °C, followed by 72 °C for 10 min. Amplicons were subsequently sequenced on the Illumina MiSeq platform (2 × 300 bp) at AGRF.

Amplicon sequencing data processing

The open-source program R was used for amplicon read processing, statistical analysis, and production of figures75. Illumina paired R1 and R2 reads were processed using the DADA2 pipeline76. Reads with any ‘N’ bases were removed and bacterial V1-V3 primer sequences were trimmed using cutadapt77. R1 and R2 were truncated to remove low quality terminal ends (truncation lengths: R1 = 275; R2 = 240), in order to produce the highest number of merged reads after learning the error rates. Data were merged into one file before removing chimeric sequences, using the DADA2 removeBimeraDenovo function at the stringent threshold minFoldParentOverAbundance = 1. The remaining ASVs were annotated using both the SILVA nr_v138.1 and the GTDB (v95) databases with the assignTaxonomy function implemented in DADA2 with a 50% probability cut-off.

Quality filtration and contamination removal

When analysing results derived from low input samples, it is critical to identify and subsequently remove contaminants derived from either DNA extraction or PCR reagents. Hence, we sequenced two PCR negative controls (one for each amplification batch) and several extraction negative controls (3 for each extraction batch) to identify and remove external contaminants. Any ASVs found in a PCR negative control were removed from all samples belonging to the same amplification batch. For the extraction negative controls, contaminant removal was done according to Bramucci et al.28. Between 21 and 35 contaminant ASVs were identified and removed from each sample; however, these ASVs were not abundant, as their total abundance represented less than 0.45% of any given sample. The resulting contaminant-free ASV table was quality filtered to remove all ASVs not annotated at the Kingdom level, as well as ASVs annotated either as chloroplast or mitochondria. This stringent quality filtering process resulted in 5909 unique ASVs present in the whole dataset. Finally, two samples with fewer than 1400 reads were removed from further analysis.
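The filtering logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the authors' code: the nested-dictionary data layout, the function name `clean_asv_table` and its parameters are hypothetical, the extraction-blank step (performed according to Bramucci et al.28) is not reproduced, and applying the 1400-read cut-off after the ASV filters is an assumption.

```python
EXCLUDED_LABELS = {"chloroplast", "mitochondria"}


def clean_asv_table(counts, batch_of, pcr_negative_asvs, taxonomy, min_reads=1400):
    """Return a filtered copy of `counts` ({sample: {asv: reads}}).

    batch_of:          {sample: amplification batch label}
    pcr_negative_asvs: {batch label: set of ASVs found in that batch's PCR negative control}
    taxonomy:          {asv: list of rank labels, Kingdom first; None = unassigned}
    (All names and the data layout are hypothetical.)
    """
    cleaned = {}
    for sample, asv_reads in counts.items():
        batch_contaminants = pcr_negative_asvs.get(batch_of[sample], set())
        kept = {}
        for asv, reads in asv_reads.items():
            ranks = taxonomy.get(asv) or [None]
            if asv in batch_contaminants:
                continue  # seen in the PCR negative control of the same batch
            if ranks[0] is None:
                continue  # not annotated at the Kingdom level
            if any(r is not None and r.lower() in EXCLUDED_LABELS for r in ranks):
                continue  # chloroplast or mitochondrial sequence
            kept[asv] = reads
        # Assumption: the read-depth cut-off is evaluated after ASV filtering.
        if sum(kept.values()) >= min_reads:
            cleaned[sample] = kept
    return cleaned
```

Keeping the contaminant sets keyed by amplification batch mirrors the rule that an ASV found in a PCR negative control is only removed from samples amplified in that same batch.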

Classification of bacterial ASVs as phytoplankton associates

The clean ASV table was normalised using Cumulative Sum Scaling (CSS) with the metagenomeSeq package (v. 1.34.0) in R75. The CSS normalized dataset was then used to identify different categories of phytoplankton associates. We defined bacterial ASVs present in the microbiome of three or more phytoplankton isolates as generalists and those present in the microbiomes of only one or two algal isolates as specialists. Importantly, only ASVs that (i) made up > 0.001% RA of the community within a given sample and (ii) were present in at least 4 of the 6 samples taken on day 200 (n = 3) and day 400 (n = 3) of the timeseries were categorized as phytoplankton associates and either generalists or specialists. All the other ASVs were considered as bacteria that are not involved in sustained association with phytoplankton and classified either as transient or initial seawater. Bacteria classified as transient were either not present at day 200 and/or 400, or appeared sporadically (in fewer than 4 out of 6 samples). ASVs that were present in the plankton tow or seawater, but were not identified within the phytoplankton microbiomes (above the 0.001% RA threshold) at any time-point, were classified as initial seawater. To focus on the most abundant fraction of the community, only ASVs with a normalized relative abundance > 0.1% were retained for the subsequent analysis, resulting in a final ASV table of 2823 ASVs.
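The classification rules above translate directly into a small decision procedure. The following Python sketch is illustrative only: the function names, the input layout (per-host lists of relative abundances for the six day-200 and day-400 samples) and the two sets `seen_in_cultures` and `seen_in_source` are hypothetical, while the thresholds come from the text.

```python
RA_THRESHOLD = 0.001     # percent relative abundance, from the text
MIN_LATE_SAMPLES = 4     # of the 6 samples taken on days 200 and 400


def sustained_hosts(asv, late_ra):
    """Hosts in which `asv` exceeds the RA threshold in >= 4 of the 6 late samples.

    late_ra: {host: {asv: [RA % in each of the six day-200/day-400 samples]}}
    """
    hosts = set()
    for host, table in late_ra.items():
        values = table.get(asv, [])
        if sum(v > RA_THRESHOLD for v in values) >= MIN_LATE_SAMPLES:
            hosts.add(host)
    return hosts


def classify_asv(asv, late_ra, seen_in_cultures, seen_in_source):
    """Assign one ASV to a category.

    seen_in_cultures: ASVs above the RA threshold in any culture sample, any time point
    seen_in_source:   ASVs detected in the plankton tow or bulk seawater
    """
    n_hosts = len(sustained_hosts(asv, late_ra))
    if n_hosts >= 3:
        return "generalist"
    if n_hosts >= 1:
        return "specialist"
    if asv in seen_in_cultures:
        return "transient"
    if asv in seen_in_source:
        return "initial seawater"
    return "unclassified"
```

For example, an ASV that passes the persistence criterion in three different hosts is returned as `"generalist"`, whereas one that appears in a single late sample of one host is returned as `"transient"` provided it is listed in `seen_in_cultures`.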

Bacterial isolates for whole genome sequencing

Bacteria were isolated from the microbiome of all phytoplankton isolates 20 days after isolation, and the isolation strategies and conditions have been previously described71. Briefly, bacteria were isolated and purified by repeated plating on either 1% or 10% Marine Agar (Difco). For taxonomic classification, each of the isolated bacteria was then grown in 100% Marine Broth (Difco) for 24 h, after which the DNA was extracted using the PowerWater Kit (Qiagen) following the manufacturer’s recommendation, and the full 16S rRNA gene was amplified using the 27f (5’-AGAGTTTGATCMTGGCTCAG-3’) and 1492r (5’-ACCTTGTTACGACTT-3’) primer pair as previously described71. Amplicons were then sequenced using Sanger sequencing at AGRF. Clean sequences were then blasted against the final microbiome ASV table to identify possible generalists, specialists or transients. The seven isolates identified as generalists or specialists were grown in 20 mL of 100% Marine Broth for 12 h and then centrifuged at 1500 × g for 10 min. The supernatant was discarded and the pellet was flash frozen for DNA extraction using the physical lysis approach detailed above28. Whole genome sequencing was then performed using a MiSeq platform (2 × 150 bp) at AGRF.

Metagenome analysis: reads clean-up

To increase the probability of re-assembling generalist and specialist genomes from the phytoplankton microbiome, a subset of 4 phytoplankton and 4 time points per phytoplankton was chosen for subsequent metagenomic analysis (54 samples). Phytoplankton were chosen based on their abundance at the location of isolation and the high relative abundance of a few generalists and specialists in their microbiome. Sample libraries for metagenomic analysis were prepared following a previously optimized protocol28 and sequenced on a Nextseq6000 Illumina sequencer (2 × 150 bp) at AGRF. Raw reads were quality checked and adaptors removed using cutadapt77. Quality filtered reads from extraction blanks were assembled using metaSpades78 to create a contig library of contaminants. This library was then used to quality filter the samples through the removal of reads that mapped to the contaminant library (<1%). Dedupe from the BBtools package (sourceforge.net/projects/bbmap/) was then used to detect and remove duplicate reads from the clean metagenomic reads. SingleM (https://github.com/wwood/singlem; database release Dec 2022) was used to assign taxonomy to the clean and deduplicated reads.

Quality filtered and deduplicated reads for each sample were co-assembled using megahit79 to increase the sequencing depth and probability of reassembling high-quality MAGs. The resulting contigs were binned using metabat280 in Anvio-6.281. MAGs were refined in Anvio and exported for a final quality check with CheckM82. FastANI within the Anvio platform was used to identify the degree of similarity between genomes and dereplicate similar genomes (similarity >98.9 %).

Generalist and specialist MAGs, taxonomy and functional annotation

To identify potential generalist, specialist or transient MAGs, the final ASV table was mapped against the high-quality MAGs using Blastn (pident > 99, length > 350). MAGs were assigned as a generalist, specialist or transient if their percentage of identity was above 99.5% over a fragment of > 350 bp. Taxonomy was assigned to the MAGs using the GTDB-Tk package and then compared to the taxonomy assigned to the corresponding ASVs to ensure that the taxonomy matched. The alignment of single core proteins performed as part of the classify_wf of GTDB-Tk was retrieved and a phylogenetic tree of the identified MAGs was constructed in Geneious Prime. Coverage of MAGs in each metagenome was calculated with CoverM v0.6.1 (https://github.com/wwood/CoverM). MAGs were then functionally annotated using EnrichM (https://github.com/geronimp/enrichM) and DRAM83. The KO matrices derived from the two different analyses were compared and then transformed into a presence/absence KO table including all genomes. Biosynthetic gene clusters (BGCs) were predicted using antiSMASH (v.7) software to assess the potential production of secondary metabolites for each MAG84. Iron related genes were identified using FeGenie (v. 1.2)85. Potential doubling time for each bacterial MAG was estimated using the gRodon package in R40.
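As an illustration of the ASV-to-MAG matching step, the Python sketch below filters BLAST tabular output (the standard 12-column format, in which the first four columns are query ID, subject ID, percent identity and alignment length) with the assignment thresholds given above (identity above 99.5% over more than 350 bp). Everything else is an assumption made for the example: the function name, the `contig_to_mag` lookup, and the choice to keep only the best-scoring hit per MAG are hypothetical and are not described in the text.

```python
def assign_mag_categories(blast_lines, asv_category, contig_to_mag,
                          min_pident=99.5, min_length=350):
    """Map each MAG to the category of its best-matching ASV.

    blast_lines:   iterable of tab-separated BLAST hits (ASV as query, contig as subject)
    asv_category:  {asv id: "generalist" | "specialist" | "transient"}
    contig_to_mag: {contig id: MAG id}  (hypothetical lookup table)
    """
    best = {}  # MAG id -> (pident, length, asv id)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        asv, contig = fields[0], fields[1]
        pident, length = float(fields[2]), int(fields[3])
        if pident <= min_pident or length <= min_length:
            continue  # hit does not meet the assignment thresholds
        mag = contig_to_mag.get(contig)
        if mag is None or asv not in asv_category:
            continue
        # Assumption: when several ASVs hit one MAG, keep the best hit.
        if mag not in best or (pident, length) > best[mag][:2]:
            best[mag] = (pident, length, asv)
    return {mag: asv_category[asv] for mag, (_, _, asv) in best.items()}
```

Any MAG assigned this way would still need the taxonomy cross-check described above, since a short 16S fragment match alone does not confirm that the ASV and the MAG share the same classification.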

Environmental distributions

CoverM v0.6.1 (https://github.com/wwood/CoverM) was used to calculate the relative abundance of high-quality bacterial MAGs and two eukaryotic MAGs (Cylindrotheca cf. closterium and Pycnococcus cf. provasolii), based on metagenomic data derived from monthly surface samples collected at three national reference sites along the east coast of Australia (North Stradbroke Island NSI, Port Hacking PHB, and Maria Island MAI)27. The two phytoplankton were chosen based on the quality of their reassembled genomes (completion > 70%, redundancy < 10%) from our metagenomic dataset. Quality-filtered reads from monthly samples collected between 2015 and 2020 were used to calculate the relative abundance of the identified MAGs at these three sites. To identify possible co-occurrence of specialist, generalist and transient bacteria with their eukaryotic host, the relative abundance of each bacterial MAG was correlated with the relative abundance of its associated phytoplankton MAG. The co-occurrence of bacteria and phytoplankton in the environment was calculated using the cor_test function within the rstatix package in R using Spearman's correlation and Bonferroni-adjusted p-values.
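To make the co-occurrence statistic concrete, the following standard-library Python sketch computes Spearman's rank correlation (Pearson correlation of average ranks) between two relative-abundance series and applies a Bonferroni adjustment to a set of p-values. It is not the analysis pipeline used here, which was run in R: all function names are hypothetical, and a permutation test stands in for the analytical p-value that an R correlation test would report, so the p-values will not match exactly.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=9999, seed=0):
    """Two-sided permutation p-value for Spearman's rho (a stand-in, see lead-in)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

In use, `spearman_rho` would be called once per bacterial MAG against the monthly relative abundances of its host at one site, and the resulting p-values for that site would be passed together to `bonferroni`.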

Statistical analysis

Alpha diversity for the 16S rRNA analysis was calculated with the alpha diversity function within the microbiome package86 in R. Alpha diversity was calculated on rarefied data using the rarefy option in phyloseq87. The statistical differences between alpha diversity indexes were tested using a non-parametric Mann-Whitney (MW) test with adjusted P values (using the Bonferroni method). To explore the similarity between the microbiome of each phytoplankton isolate and the seawater communities over time, the CSS normalized ASV counts were transformed (square-root) to calculate Bray-Curtis distances using the metaMDS function within the Vegan package in R88. Results were then visualized on a non-metric multidimensional scaling (nMDS) plot. Statistical differences between the seawater microbiome and the phytoplankton isolate microbiomes were assessed with a PERMANOVA using the Adonis function (Vegan package) in R with 999 permutations of the Bray-Curtis matrices. Statistical differences in average genome length, predicted GC content and predicted doubling time between groups (generalists, specialists and transients) were calculated using a Mann-Whitney U test as part of the rstatix package in R, with false discovery rate (FDR)-corrected p-values. A permutational analysis of variance (PERMANOVA) was used to determine the difference in functional profiles based on a KO presence/absence table. This KO table was used to calculate the enrichment of functional pathways, based on the KEGG annotations, in generalists (n = 9) and specialists (n = 14) vs. the transient bacteria (n = 10), using a Fisher test, as implemented in EnrichM for metabolic enrichment. P-values were then adjusted using the false discovery rate (“FDR”).
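For readers unfamiliar with the dissimilarity measure underlying the nMDS and PERMANOVA, the short Python sketch below shows how a Bray-Curtis dissimilarity is obtained from square-root transformed counts. It is a minimal illustration, not the Vegan implementation used in this study, and the function names and toy input layout are hypothetical.

```python
import math


def sqrt_transform(counts):
    """Square-root transform a vector of (normalised) ASV counts."""
    return [math.sqrt(c) for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Pairwise dissimilarities for {sample name: count vector}, after sqrt transform."""
    names = sorted(samples)
    transformed = {name: sqrt_transform(samples[name]) for name in names}
    return {
        (p, q): bray_curtis(transformed[p], transformed[q])
        for p in names
        for q in names
    }
```

The value ranges from 0 (identical communities) to 1 (no shared ASVs), and the square-root transform reduces the weight of the most abundant ASVs before the comparison.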

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.