Acquisition and evolution of the neurotoxin domoic acid biosynthesis gene cluster in Pseudo-nitzschia species

He, Ziyan; Xu, Qing; Chen, Yang; Liu, Shuya; Song, Huiyin; Wang, Hui; Leaw, Chui Pin; Chen, Nansheng

doi:10.1038/s42003-024-07068-7

Article
Open access
Published: 23 October 2024

Ziyan He^1,2,3,4^na1,
Qing Xu^1,2,4,5^na1,
Yang Chen^1,2,3,4,
Shuya Liu^1,2,4,
Huiyin Song^1,2,4,
Hui Wang^1,2,4,
Chui Pin Leaw⁶ &
…
Nansheng Chen ORCID: orcid.org/0000-0002-0615-7215^1,2,4

Communications Biology volume 7, Article number: 1378 (2024)

Abstract

Of the more than 60 taxonomically identified species in the genus Pseudo-nitzschia, 26 have been confirmed to be toxigenic. Nevertheless, the acquisition and evolution of the toxin biosynthesis (dab) genes by this extensive group of Pseudo-nitzschia species remain unclear. Through constructing chromosome-level genomes of three Pseudo-nitzschia species and draft genomes of ten additional Pseudo-nitzschia species, putative genomic integration sites for the dab genes in Pseudo-nitzschia species were explored. A putative breakpoint was observed in syntenic regions in the dab gene cluster-lacking Pseudo-nitzschia species, suggesting potential independent losses of dab genes. The breakpoints between this pair of conserved genes were also identified in some dab gene-possessing Pseudo-nitzschia species, suggesting that the dab gene clusters transposed to other loci after the initial integration. A “single acquisition, multiple independent losses (SAMIL)” model is proposed to explain the acquisition and evolution of the dab gene cluster in Pseudo-nitzschia species.

Introduction

Species of the diatom genus Pseudo-nitzschia have attracted worldwide attention since the first reported Pseudo-nitzschia poisoning incident in 1987 at Prince Edward Island, Canada, which resulted in at least three deaths and 107 cases of illness, and the identification of the neurotoxic glutamate receptor agonist domoic acid (DA) as the toxin synthesized by P. multiseries^1,2. DA synthesized by Pseudo-nitzschia species has been found to be transferred in the marine food web during bloom events, which may cause illness or death for marine organisms and humans, leading to what is called amnesic shellfish poisoning (ASP)². An unprecedented bloom of another toxigenic P. australis in the spring of 2015 resulted in the largest recorded outbreak of DA along the North American west coast, ranging from the Aleutian Islands of Alaska, USA, to the Baja peninsula, Mexico, and resulted in prolonged closures of razor clam, rock crab, and Dungeness crab fisheries³. Similar neurotoxic symptoms have also been observed in animals, including birds and marine mammals⁴. In addition, numerous harmful algal bloom (HAB) events caused by Pseudo-nitzschia species have continued to erupt globally, exerting severe impacts on fisheries and the aquaculture industry^5,6,7.

Pseudo-nitzschia blooms have motivated intensive research on the identification and geographical distribution of new Pseudo-nitzschia species, and on the elucidation of mechanisms of DA biosynthesis in Pseudo-nitzschia species, which resulted in the identification of key genes of the DA biosynthetic pathway in P. multiseries through comparative gene expression analysis⁸. This study successfully identified a gene cluster, dab, with four protein-coding genes: dabA (terpene cyclase), dabB (hypothetical protein), dabC (α-ketoglutarate-dependent dioxygenase), and dabD (CYP450), which encode key enzymes catalyzing reactions extending from the substrates glutamate and geranyl pyrophosphate (GPP) to isodomoic acid A in the genome of P. multiseries⁸. Following this success, dab genes were readily identified in P. multistriata through genomic analysis and the search for homologous genes⁸, and in P. australis and P. seriata through transcriptomic analyses^8,9,10. The characterization of the dab gene clusters has also facilitated research on the impact of ocean warming and acidification on DA biosynthesis¹¹.

Pseudo-nitzschia is a genus with over 60 cosmopolitan species, up to 50% of which are capable of synthesizing DA^8,12,13. This may explain the wide distribution of DA from the Pacific subarctic (58°N) to the Southern Ocean (66°S)¹⁴. Notably, Pseudo-nitzschia species that are capable of producing DA are not necessarily phylogenetically closer. Furthermore, of the 26 DA-producing Pseudo-nitzschia species, 24 were found to produce DA in some studies, but not in others², suggesting that DA biosynthesis is not only species-specific but also strain-specific. Thus, whether the dab gene cluster was acquired independently in different Pseudo-nitzschia species, or it was acquired in the common ancestor of all Pseudo-nitzschia species remains undetermined. Ascertaining the acquisition and evolution of the dab gene cluster in Pseudo-nitzschia species will help explain why certain Pseudo-nitzschia species are toxic, while others are not, and why some species are reported to exhibit both toxic and non-toxic traits^2,12,13.

In this project, we constructed the genomes of 13 Pseudo-nitzschia species, including high-quality chromosome-level genomes of three species (P. delicatissima, P. multiseries, and P. pungens) and draft-quality genomes of ten species (P. americana, P. brasiliana, P. cuspidata, P. galaxiae, P. hainanensis, P. micropora, P. multistriata, P. sabit, Pseudo-nitzschia sp. (CNS00097), and Pseudo-nitzschia sp. (CNS01031)), and identified the dab gene clusters. Of the three Pseudo-nitzschia species selected for constructing chromosome-level genomes, P. multiseries has been proven to possess the dab gene cluster and is toxic⁸, while P. pungens was shown to be either toxic or non-toxic, depending on the strain². P. delicatissima, on the other hand, was found to be toxic, weakly toxic, or non-toxic¹⁵. Comparative genomic analysis of syntenic regions harboring dab gene clusters, coupled with phylogenetic analysis of dab genes and structural modeling of the proteins encoded by these dab genes, enabled us to identify the putative integration sites of the dab gene cluster in the common ancestor of Pseudo-nitzschia species. Our findings suggested a “single acquisition, multiple independent losses (SAMIL)” model for the evolution of the dab gene cluster in Pseudo-nitzschia species.

Results

Construction and comparative analysis of chromosome-level genomes of three Pseudo-nitzschia species

To obtain high-quality reference genomes for the three Pseudo-nitzschia species, we generated 15.06 Gb (~442×), 20.40 Gb (~261×), and 31.87 Gb (~119×) of PacBio data (Supplementary Table 1). These data were assembled to produce the initial versions of the whole genome assemblies, which were 34.06 Mb, 67.09 Mb, and 252.10 Mb in size for P. delicatissima, P. pungens, and P. multiseries, respectively. These assembly results were generally consistent with the estimated sizes obtained from genome survey analysis (Supplementary Fig. 1; Supplementary Table 2).
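The fold-coverage figures quoted above follow the usual relation depth = sequenced bases / genome size. The sketch below illustrates that arithmetic only. The function name is hypothetical, and pairing 15.06 Gb with the 34.06 Mb P. delicatissima assembly is an assumption based on the order in which the values are listed. This section does not state which genome size (assembly or survey estimate) was used as the denominator for the reported depths, so the size is left as a parameter.

```python
def sequencing_depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Mean sequencing depth (fold coverage) = sequenced bases / genome size.

    Hypothetical helper for illustration; units are Gb for data and Mb for
    genome size, so the data volume is converted to Mb before dividing.
    """
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return (total_bases_gb * 1000.0) / genome_size_mb


# Assumed pairing: 15.06 Gb of PacBio data against the 34.06 Mb assembly.
depth = sequencing_depth(15.06, 34.06)
print(f"approximate depth: {depth:.0f}x")
```

With the assembly size as denominator this reproduces the ~442× figure; a different denominator, such as a survey-based size estimate, would give a different depth for the same data volume.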

We further carried out Hi-C (high-throughput chromosome conformation capture) analysis (Supplementary Table 3) to construct chromosome-level assemblies, which were 34.06 Mb, 67.11 Mb, and 252.35 Mb for P. delicatissima, P. pungens, and P. multiseries, respectively (Fig. 1; Table 1). The genome assemblies of P. delicatissima, P. pungens and P. multiseries contained 11, 12, and 11 chromosomes (Supplementary Table 4) and were highly contiguous, with 96.48%, 99.14%, and 94.90% of genome contigs anchored to chromosomes, respectively. The assemblies also had high contig N50 values of 2.90 Mb, 1.66 Mb, and 0.80 Mb for P. delicatissima, P. pungens, and P. multiseries, respectively (Table 1). BUSCO assessment indicated that the genome assemblies were 73.93%, 80.53%, and 81.52% complete for P. delicatissima, P. pungens, and P. multiseries, respectively (Supplementary Table 5).
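For readers unfamiliar with the contig N50 statistic reported above, the following minimal sketch shows the standard definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are hypothetical and are not taken from the assemblies in this study.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (arbitrary units): total is 20, so half is 10.
# 8 alone is below 10; 8 + 5 = 13 reaches it, so the N50 is 5.
toy_contigs = [8, 5, 4, 2, 1]
print(n50(toy_contigs))
```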

**Fig. 1: Construction and comparative analysis of chromosome-level genome assemblies of three *Pseudo-nitzschia* species.**

Table 1 Statistics of Pseudo-nitzschia genome assembly and annotation

We explored why the genome sizes of three Pseudo-nitzschia species were so different, ranging from 34.06 Mb (P. delicatissima) to 252.10 Mb (P. multiseries). Annotation of the assembled genomes revealed that P. multiseries harbors a greater proportion of repetitive elements (68.67%) compared to P. delicatissima (1.73%) and P. pungens (34.91%). Among all types of repetitive elements, long terminal repeats (LTRs) showed the highest proportions, accounting for 0.84%, 22.94%, and 58.45% of the genomes of P. delicatissima, P. pungens, and P. multiseries, respectively (Table 2). This suggested that differential levels of LTRs were primarily responsible for the size differences among these three Pseudo-nitzschia species. Despite the differences in genome sizes, the numbers of protein-coding genes (PCGs) in these three species were similar, being 14,375, 15,472, and 18,649 in P. delicatissima, P. pungens, and P. multiseries, respectively (Table 1; Supplementary Table 6). Notably, the number of PCGs predicted in P. multiseries in this study (18,649) was similar to that annotated in a different P. multiseries strain (19,703). Most PCGs of the P. delicatissima, P. pungens, and P. multiseries genomes (93.83%, 91.44%, and 88.49%, respectively) were functionally annotated (Supplementary Table 7).
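The contribution of repeats to the size differences can be made concrete with a short worked calculation. The sketch below multiplies each assembly size by its reported repeat percentage to estimate the repetitive and non-repetitive portions in Mb. It assumes that the percentages apply to the chromosome-level assembly sizes (34.06, 67.11, and 252.35 Mb), which this section does not state explicitly; the function and variable names are hypothetical.

```python
# Assembly size (Mb) and total repeat content (%) as reported in the text.
# Assumption: the percentages are relative to the chromosome-level assemblies.
genomes = {
    "P. delicatissima": (34.06, 1.73),
    "P. pungens": (67.11, 34.91),
    "P. multiseries": (252.35, 68.67),
}


def split_by_repeats(size_mb, repeat_pct):
    """Return (repetitive Mb, non-repetitive Mb) for one genome."""
    repeat_mb = size_mb * repeat_pct / 100.0
    return repeat_mb, size_mb - repeat_mb


for species, (size_mb, repeat_pct) in genomes.items():
    repeat_mb, other_mb = split_by_repeats(size_mb, repeat_pct)
    print(f"{species}: repeats {repeat_mb:.2f} Mb, non-repetitive {other_mb:.2f} Mb")
```

Under these assumptions the non-repetitive portions differ by roughly 2.4-fold (about 33 Mb to about 79 Mb), whereas the total assembly sizes differ by roughly 7.4-fold, which is in line with repeats accounting for most of the size difference.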

Table 2 Repetitive contents identified in the three Pseudo-nitzschia genomes

Although these three species belong to the same genus, their genomes showed extensive differences with varying numbers of chromosomes, indicating that many intra-chromosomal fusion or division events occurred during speciation. While 11 chromosomes were constructed for P. delicatissima and P. multiseries, 12 chromosomes were constructed for P. pungens (Fig. 1). Pairwise comparative analyses revealed only small-scale syntenic blocks between the genomes of P. delicatissima and P. pungens (Fig. 1G) and between the genomes of P. delicatissima and P. multiseries (Fig. 1H), suggesting complex evolutionary relationships with numerous chromosomal exchanges. Nevertheless, relatively large syntenic blocks were found between the genomes of P. pungens and P. multiseries (Fig. 1I), consistent with their close positions in the phylogenetic tree (Supplementary Fig. 2). In particular, chromosome Pmu6 of P. multiseries and chromosome Ppu8 of P. pungens showed nearly perfect collinearity (Fig. 1I-J). A total of 10,755 collinear gene pairs were found between P. pungens and P. multiseries, 9076 collinear gene pairs between P. delicatissima and P. pungens, and 9272 collinear gene pairs between P. delicatissima and P. multiseries (Fig. 1G–I).

To explore gene family dynamic changes in evolution, PCGs of P. delicatissima, P. pungens, and P. multiseries were compared with those of nine other phytoplankton species, including Aureococcus anophagefferens, Thalassiosira oceanica, Thalassiosira pseudonana, Skeletonema marinoi, Seminavis robusta, Phaeodactylum tricornutum, Fragilariopsis cylindrus, P. multistriata, and P. multiseries (Supplementary Fig. 3A). Phylogenetic analysis using 381 single-copy orthologous genes from these 12 species showed that Pseudo-nitzschia species were tightly clustered with F. cylindrus, as expected. The divergence time of Pseudo-nitzschia species from their nearest node was estimated to be approximately 53.2 million years ago (MYA) (Supplementary Fig. 3A). Comparative analysis of the genes of P. delicatissima, P. pungens, P. multiseries, and P. multistriata revealed 7136 shared gene families (Supplementary Fig. 3B).

Comparative analysis of Pfam domains contained in the PCGs of the 12 species revealed that many gene families encoded by the Pseudo-nitzschia genomes were different from those in other species (Supplementary Fig. 3C). Notably, the family of genes encoding gametolysin peptidase M11 was substantially expanded in Pseudo-nitzschia genomes, with 17, 50, and 33 members in P. delicatissima, P. pungens, and P. multiseries, respectively (Supplementary Fig. 3C, Supplementary Fig. 3D). These gametolysin peptidase M11 domain-containing proteins, which have their origin in the ancestor of green plants and chromists, have been demonstrated to degrade cell wall in Chlamydomonas¹⁶.

Annotation of dab gene clusters and phylogenetic analysis of dab genes

From the three chromosome-level Pseudo-nitzschia genomes constructed in this study, the dab gene cluster was identified only in P. multiseries (Fig. 2A; Supplementary Table 8), which is the first Pseudo-nitzschia species known to possess the dab gene cluster⁸. The dab gene cluster in P. multiseries consisted of four genes: dabA, dabB, dabC, and dabD, as reported previously⁸. However, the dab gene cluster was not identified in either P. delicatissima or P. pungens (Fig. 2A; Supplementary Table 8), despite previous studies showing the presence of both toxic and non-toxic strains in these two Pseudo-nitzschia species^2,8,15. Searches for individual genes with similarity to those in the dab gene cluster did not yield meaningful hits in the assemblies of P. delicatissima and P. pungens. The dab gene cluster was identified in the draft genomes of P. multistriata strains in this study; this species has previously been shown to possess the dab gene cluster⁸. The cluster was also detected in the assembly of P. cuspidata (Fig. 2A; Supplementary Table 8), a species previously reported as DA-producing^2,17. No dab gene clusters were found in the draft genomes of other Pseudo-nitzschia species investigated in this study, despite some being reported as DA producers¹⁸. Interestingly, the Pseudo-nitzschia species found to possess the dab gene cluster in this study (P. multiseries, P. multistriata, and P. cuspidata), as well as two other species (P. australis and P. seriata) identified through transcriptomic analysis, were distributed throughout the phylogenetic tree of Pseudo-nitzschia species (Supplementary Fig. 2), suggesting that the acquisition and evolution of the dab gene cluster are intricate.

**Fig. 2: Phylogenetic and syntenic analysis of DA biosynthesis genes.**

Chemical analysis performed on 13 strains of nine Pseudo-nitzschia species (P. multiseries, P. multistriata, P. delicatissima, P. americana, P. micropora, P. pungens, P. galaxiae, P. brasiliana, and Pseudo-nitzschia sp. (CNS00097)) revealed that DA was detected only in the strain of P. multiseries. No DA was detected in the P. multistriata strains, even though five P. multistriata strains were analyzed (CNS00142, CNS00781, CNS00965, CNS01237, CNS01424). DA analysis was not performed for strains of P. cuspidata, P. hainanensis, P. sabit, and Pseudo-nitzschia sp. CNS01031 due to their unavailability, as these strains unfortunately died off during the study period (Supplementary Table 8).

Acquisition and evolution of dab gene cluster

To determine how dab gene clusters were acquired by Pseudo-nitzschia species, we hypothesized that the dab gene cluster was acquired by a common ancestor of Pseudo-nitzschia species and was subsequently lost independently in different species. Based on the phylogenetic positions of Pseudo-nitzschia species that possess the dab gene cluster (Supplementary Fig. 2), we predicted that the phylogenetic relationships of the dab genes would align with the phylogenetic relationships of these species. To test this hypothesis, we conducted phylogenetic analyses of chloroplast genomes (cpDNAs) to infer the phylogenetic relationships (Fig. 2A), as rDNA molecular markers were relatively short and some species exhibited cryptic lineages with ambiguous relationships¹⁹. Separate phylogenetic analysis of the dab genes was also performed. As expected, the phylogenetic inferences for DabA (Fig. 2B), DabB (Fig. 2C), DabC (Fig. 2D), and DabD (Fig. 2E) were highly coherent with the phylogenetic relationships of the Pseudo-nitzschia species harboring the dab gene cluster (Fig. 2A; Supplementary Fig. 3). These results suggested that the dab gene cluster in these three Pseudo-nitzschia species most likely originated from a common ancestor that gained the dab genes via a horizontal gene transfer (HGT) event. In fact, the proteins encoded by dabB exhibited high similarity to proteins found in prokaryotic organisms (Fig. 2C), supporting the hypothesis of HGT. Furthermore, modeling of the 3D structures of the proteins encoded by dabB and dabC revealed high structural similarity to proteins found in prokaryotic organisms (Fig. 2F), further supporting the gene recruitment from prokaryotic donors via HGT²⁰. Proteins encoded by dabA identified in P. multistriata (CNS00142), P. multiseries (CNS00149), and P. cuspidata (CNS00150) in this study showed high structural similarity to DabA of P. multiseries reported previously²⁰, and proteins encoded by dabD identified in this study showed high structural similarity to Cytochrome P450 of Fragilariopsis cylindrus (Fig. 2F), indicating a high conservation of this gene in diatoms.

The hypothesis further predicted that the dab gene clusters would be present in homologous integration sites (i.e., sharing breakpoints) in the genomes of all Pseudo-nitzschia species that possess the dab gene clusters, and that deletion events (i.e., dab gene cluster losses) would be evident in the genomes of species that do not possess the dab gene clusters. Our comparative analysis of the genomes of P. multiseries, P. multistriata, and P. cuspidata, however, indicated that the genomic segments harboring the dab gene clusters were not homologous (Supplementary Fig. 4). We therefore anticipated that the dab gene clusters might have moved away from their “original” integration sites due to genome rearrangements, as all Pseudo-nitzschia species have undergone extensive shuffling and transposition (Fig. 1J).

We then searched for the putative “original” integration sites of the dab gene cluster in Pseudo-nitzschia genomes, including those with and without the dab gene clusters. To trace the loss of the dab gene cluster, we aligned the genome scaffold sequences containing the dab gene cluster from P. multiseries, P. multistriata, and P. cuspidata with the genomic sequences of other Pseudo-nitzschia species using BLAST. Our results showed that the dab gene cluster in P. cuspidata was flanked by two genes, Pde10317 and Pde10588 (Fig. 2G, H). The upstream gene, Pde10317, encodes a CoA-binding protein. BLAST analysis indicated that homologs of Pde10317 can be found in all Pseudo-nitzschia species and in many closely related genera, including Nitzschia, Fragilariopsis, Phaeodactylum, and Cylindrotheca. The other flanking gene, Pde10588, encodes a protein kinase, with homologs present in essentially all eukaryotes.

Through comparison with the SWISS-PROT database, it was found that the CoA-binding domain-containing gene exhibits homology with the Escherichia coli yccU (Percent Identity = 57.1%, Alignment Length = 137)^21,22. Interestingly, this gene pair was successfully identified in all 13 Pseudo-nitzschia genomes analyzed in this project as well as in many other genomes (Fig. 2), suggesting that the putative original integration sites of the dab gene clusters remained intact in these genomes. Pairwise comparison of the genomic regions harboring the original integration sites revealed that the genomic regions between the gene pair were rather different (Supplementary Fig. 5), suggesting that the losses of the dab gene cluster were species-specific and varied among the genomes.
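The logic of using a conserved flanking gene pair as an anchor for the putative integration site can be sketched as follows. This is an illustrative simplification and not the pipeline used in this study, which relied on BLAST alignments of scaffold sequences. It assumes that each scaffold has already been reduced to an ordered list of ortholog labels; the labels `CoA_binding` and `protein_kinase`, the function names, and the toy gene orders are all hypothetical.

```python
from typing import List, Optional

DAB_GENES = ("dabA", "dabB", "dabC", "dabD")


def genes_between(gene_order: List[str], left: str, right: str) -> Optional[List[str]]:
    """Return the genes lying between two anchor genes on one scaffold.

    Returns None if either anchor is absent. The anchors may appear in
    either orientation along the scaffold.
    """
    if left not in gene_order or right not in gene_order:
        return None
    i, j = sorted((gene_order.index(left), gene_order.index(right)))
    return gene_order[i + 1 : j]


def classify_locus(gene_order: List[str],
                   left: str = "CoA_binding",
                   right: str = "protein_kinase") -> str:
    """Describe where dab genes sit relative to the anchored site."""
    between = genes_between(gene_order, left, right)
    if between is None:
        return "anchors not on this scaffold"
    if any(g in DAB_GENES for g in between):
        return "dab genes at anchored site"
    if any(g in DAB_GENES for g in gene_order):
        return "dab genes elsewhere on scaffold"
    return "anchored site without dab genes"


# Toy scaffolds (hypothetical gene orders, not real annotations).
with_cluster = ["g1", "CoA_binding", "dabA", "dabB", "dabC", "dabD", "protein_kinase", "g2"]
without_cluster = ["g1", "protein_kinase", "g7", "CoA_binding", "g2"]

print(classify_locus(with_cluster))
print(classify_locus(without_cluster))
```

In this toy setting, the first scaffold corresponds to a cluster sitting between the anchors and the second to an intact anchor pair with no dab genes between them, which is the pattern interpreted in the text as a loss at the putative original site.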

Discussion

HABs not only impact marine ecosystems by generating overwhelming biomass^23,24, but also pose risk to humans and other life forms by producing deadly toxins²⁵. The discovery of key genes in the DA biosynthetic pathway in P. multiseries in 2018⁸ represented a milestone in understanding the mechanisms underlying the impact of ocean warming and acidification on DA production, and also paved the way for identifying DA biosynthesis genes in other algal species^26,27. For example, kainic acid biosynthesis (kab) gene clusters were successfully identified in the genomes of two known kainic acid producers, Digenea simplex and Palmaria palmata, through homology-based searches. This was based on the observation that kainic acid and domoic acid share structural similarities (both compounds are commonly referred to as kainoids)²⁸. Similarly, the dab biosynthesis gene cluster was found in the red alga Chondria armata, the seaweed from which DA was first characterized. Phylogenetic analysis suggested that the core DA biosynthesis genes were acquired through horizontal gene transfer²⁹.

To facilitate the identification of dab genes across various Pseudo-nitzschia species, we assembled chromosome-level genomes for three species (P. multiseries, P. delicatissima, and P. pungens) and draft genomes for ten additional species (including two undescribed species), encompassing both toxigenic and non-toxic Pseudo-nitzschia species². Insights gained from these genomes through comparative genomics offer valuable understanding of the evolutionary processes underlying the divergence in toxin production capabilities. Additionally, this information sheds light on how genomic context affects gene expression strength and its potential link to disruptions in genes or regulatory elements. Furthermore, these findings provide a point of reference for future transcriptomic studies and enhance our ability to investigate the genetic basis of toxin production in Pseudo-nitzschia species.

The “single acquisition, multiple independent losses (SAMIL)” model of dab gene cluster evolution in Pseudo-nitzschia species

Through mining of these genomes, we identified the DA biosynthesis dab gene cluster in only three Pseudo-nitzschia species. We proposed a putative original integration site and breakpoints for the dab gene cluster in species that lack DA metabolism. Surprisingly, the proportion of dab gene cluster-containing Pseudo-nitzschia species was low, accounting for only about 23% (3 out of 13) of the species analyzed in this study. This is unexpected given that up to 50% of Pseudo-nitzschia species are known to synthesize DA^8,12,13. Nevertheless, numerous studies have reported both toxic and non-toxic strains within Pseudo-nitzschia species². It is possible that some species were misidentified in earlier studies due to their morphological resemblance to toxigenic Pseudo-nitzschia species, particularly those identified solely through light microscopy². The difficulty of accurate species identification, especially in earlier morphology-based studies, has been recognized as a significant issue for Pseudo-nitzschia species^30,31.

Based on the results of this study, we propose a “single acquisition, multiple independent losses (SAMIL)” model to explain the evolution of the dab gene cluster in Pseudo-nitzschia. The model suggests that the common ancestor of all Pseudo-nitzschia species acquired the dab gene cluster either via HGT from prokaryotes or through endosymbiosis with red algae (Fig. 3). Notably, since the sequence of CYP450 encoded by dabD is closest to other diatom P450s (Fig. 2E), we propose that the ancestral gene cluster gained via HGT contained only dabA, dabB, and dabC, with dabD being laterally acquired or “hijacked” for DA biosynthesis²⁹. After this acquisition, the dab gene cluster underwent independent evolutionary changes in different Pseudo-nitzschia lineages. Under neutral or negative selection, the dab gene cluster may have been lost from the genomes of some non-DA-producing Pseudo-nitzschia lineages (e.g., P. sabit, P. americana). In addition, intraspecific variation in DA metabolism within “toxic” lineages may result from recent evolutionary events (such as pseudogenization and gene flow), likely driven by adaptations to environmental stresses (geographical or physiological adaptations¹⁰). These may have led to the loss of the dab gene cluster in certain strains/populations, ultimately affecting their DA metabolism (P. delicatissima, P. pungens).

The location of the dab gene cluster integration in P. cuspidata may represent a putative alternative insertion site. Our comparative analysis of the genomic locations of dab gene cluster-containing Pseudo-nitzschia species and those lacking the dab gene cluster identified a potential “original” dab gene cluster insertion site in P. cuspidata. This original insertion site is flanked by two highly conserved genes: a CoA-binding protein and a protein kinase (Fig. 3). Notably, dabD is also flanked by these two genes. Our analysis further suggested that the dab gene cluster in Pseudo-nitzschia species might have undergone genome rearrangements (e.g., segmental or dispersal duplications), as evidenced by the different genomic locations of the dab gene clusters in P. multiseries, P. multistriata, and P. cuspidata (Fig. 3). Nevertheless, it cannot be ruled out that the original dab gene cluster integration site might be located at other genomic regions, such as the shared region between P. multiseries and P. multistriata (in blue, Supplementary Fig. 4). Additional data from more dab gene cluster-containing Pseudo-nitzschia species are needed to accurately pinpoint the bona fide original dab gene cluster integration site. Figure 3 was created with BioRender.com.

The dab gene cluster and differential DA biosynthesis in Pseudo-nitzschia species

Research has shown that toxicity can vary among different strains of the same Pseudo-nitzschia species, with some species comprising both toxic and non-toxic strains. This variability may be due to the inherent genetic diversity within Pseudo-nitzschia species³². Intraspecific variation in toxicity can be attributed to differences in the presence or absence of key biosynthetic genes within the dab gene clusters, or the presence or absence of entire dab gene clusters, as well as variations in regulatory elements controlling toxin production. In this study, we observed variability in the presence or absence of the entire dab gene cluster among different Pseudo-nitzschia species. However, no genomic differences were observed regarding the presence or absence of dab genes or the entire dab gene cluster among different strains of the same species. Of the 13 Pseudo-nitzschia species analyzed, seven (P. brasiliana, P. cuspidata, P. delicatissima, P. galaxiae, P. multiseries, P. multistriata, and P. pungens) were reported as toxigenic^2,18. Despite this, the dab gene cluster was only identified in strains of three Pseudo-nitzschia species (P. cuspidata, P. multiseries, and P. multistriata). It is possible that strains of P. brasiliana, P. delicatissima, P. galaxiae, and P. pungens analyzed in this study lack the dab gene cluster, but it may exist in strains of these species from other geographical regions. Therefore, the presence of the dab gene cluster may be both species- and strain-specific.

Strains of Pseudo-nitzschia that carry the dab gene cluster may not necessarily produce DA. The toxicity of Pseudo-nitzschia species is influenced by a variety of ecological factors, both biotic and abiotic, which affect transcriptional and translational regulation, as suggested previously^11,33,34. For instance, grazing by copepods and Calanus copepodites has been reported to induce toxin production in various Pseudo-nitzschia species³⁵. Abiotic factors such as temperature, salinity, light exposure, pH, and concentrations of macronutrients and trace elements also significantly affect the toxin production capabilities of Pseudo-nitzschia species³⁶. Even if all strains of a particular Pseudo-nitzschia species carry the dab gene cluster, different strains may still exhibit varying levels of DA production due to the differential regulation of the dab gene cluster at the transcriptional or translational levels. In fact, under similar culture conditions in this study, while strains of P. multiseries were found to produce DA, P. multistriata strains did not, despite possessing the dab gene cluster. This highlights the complex interplay between genetic diversity and environmental factors in shaping the toxicity of Pseudo-nitzschia species. Further research is needed to understand the molecular mechanisms governing toxin production at both the species and strain levels. Such insights are crucial for predicting and managing the potential impacts of harmful algal blooms on marine ecosystems and public health.

Conclusion

Our genomic analyses have enabled the identification of the dab gene cluster in P. cuspidata. The dab gene cluster identified in this study and previous studies⁸ suggests a possible shared evolutionary origin in Pseudo-nitzschia. The complex evolutionary trajectory of the dab gene clusters in Pseudo-nitzschia can be succinctly summarized using the SAMIL model, which illustrates the dynamic gains and losses of DA biosynthetic capabilities throughout evolutionary history.

Methods

Pseudo-nitzschia strain isolation and identification

Strains of Pseudo-nitzschia analyzed in this study were isolated from various coastal regions in China, including Qinhuangdao, Hebei Province; Jiaozhou Bay, Shandong Province; Nan’ao Island and Qinzhou Bay, Guangdong Province (Supplementary Table 8). Cells were isolated using a micropipette and subsequently cultured in L1 medium (1‰ volume fraction Na₂SiO₃⋅9H₂O was added). Cultures were maintained at 19 °C, under cool white fluorescent illumination with an intensity of 30 μmol·m⁻²·s⁻¹ and a 12:12 photoperiod. Strains were identified based on their morphological features and the molecular sequences of the nuclear-encoded ITS (ITS1-5.8S-ITS2) region of the ribosomal RNA gene. The identities of some strains were described previously, including CNS00043, CNS00055, CNS00089, CNS00110, CNS00141, CNS00153, CNS00154, CNS00155, and CNS00156³⁷, CNS00133³⁸, CNS00135³⁹, and CNS00090, CNS00097, CNS00130, CNS00138, CNS00142, CNS00159 and CNS00150⁴⁰. This study identified eleven Pseudo-nitzschia species, including P. americana (CNS00108, CNS00138 and CNS00151), P. brasiliana (CNS01029), P. cuspidata (CNS00150), P. delicatissima (CNS00130 and CNS00135), P. galaxiae (CNS01037 and CNS01103), P. hainanensis (CNS00090), P. micropora (CNS00133 and CNS01024), P. multiseries (CNS00149, CNS00159 and CNS00771), P. multistriata (CNS00107, CNS00142, CNS00781, CNS00965, CNS01237 and CNS01424), P. pungens (CNS00043, CNS00055, CNS00089, CNS00110, CNS00141, CNS00153, CNS00154, CNS00155, CNS00156, CNS00973, CNS01028, CNS01042 and CNS01238) and P. sabit (CNS00609), plus two previously undescribed species, Pseudo-nitzschia sp. strains CNS00097 and CNS01031 (Supplementary Table 9).

DNA extraction, sequencing, and draft genome assemblies

During the exponential growth phase, healthy algal cells were harvested for DNA extraction by centrifugation at 8000 rpm for 5 min. Total DNA was extracted using the DNAsecure Plant Kit (Tiangen Biotech, Beijing, China) following the manufacturer’s instructions. DNA concentration was quantified with the Qubit® DNA Assay Kit using a Qubit® 3.0 Fluorometer (Invitrogen, USA). Genomic DNA from each sample (0.2 µg) was fragmented to approximately 350 bp by sonication (Covaris S220, Covaris, USA). The DNA fragments were then harvested, end-polished, A-tailed, and ligated with adapters for Illumina sequencing, followed by PCR amplification. PCR products were purified with the AMPure XP system (Beckman Coulter, Beverly, USA). Qualified libraries were sequenced on the NovaSeq 6000 PE150 platform (Illumina, San Diego, CA, USA) at Novogene (Beijing, China). Draft genomes were assembled using SPAdes v3.14.0⁴¹ and Platanus-allee v2.2.2⁴², with default parameters. Sequencing data and draft genome assembly statistics are provided in Supplementary Table 10.

DNA/RNA extraction and sequencing for chromosome-level genomes

The methods for DNA/RNA extraction and sequencing for the three chromosome-level genomes followed procedures described in our previous study⁴³. Algal cells were first ground to a powder in liquid nitrogen. The powdered cells were then mixed with lysis buffer and RNase A, followed by incubation. After centrifugation, the supernatants were subjected to magnetic-bead-based DNA purification and washing. High-quality DNA was obtained and sequenced using both PacBio long-read and MGI short-read technologies. Libraries were prepared with specific kits and sequenced on the MGISEQ-2000-PE150 and PacBio Sequel SMRT Cell platforms.

For Hi-C analysis, algal samples were cross-linked, fragmented, and labeled with biotin. DNA fragments were captured using magnetic beads, processed and sequenced on the MGISEQ-2000-PE150 platform to analyze chromatin loci proximity.

Total RNA was extracted using the cetyltrimethylammonium bromide (CTAB) method: samples were ground, mixed with CTAB lysis buffer, and subjected to several centrifugation steps. High-quality total RNA was used for transcriptome sequencing on the MGI and PacBio platforms. The mRNA sequencing library (DNBSEQ) was constructed and quality-checked with the DNF-471 Standard Sensitivity RNA Analysis Kit (AATI), BGI Optimal two-module mRNA library kit (BGI), BGI Plug-In Adapter Kit (BGI) and Qubit® ssDNA Assay Kit (Invitrogen). The sequencing library (DNBSEQ) was constructed and quality-checked with the MGIEasy Universal DNA Library Prep Set (MGI), Qubit™ dsDNA BR Assay Kit (Invitrogen) and Qubit® ssDNA Assay Kit (Invitrogen), and the sequencing strategy was MGISEQ-2000-PE150. The PacBio continuous long reads (CLR) sequencing library was constructed and quality-checked with the SMRTbell Express Template Prep Kit 2.0 (PacBio), Qubit dsDNA HS Assay Kit 2.0 (Invitrogen) and HS Large Fragment 50 KB Analysis Kit (Agilent Technologies). The sequencing strategy was PacBio Sequel SMRT Cell 1 M.

Genome size estimation and genome assembly

To estimate the genome sizes of the Pseudo-nitzschia strains, k-mer analysis was performed on Illumina DNA sequencing data using Jellyfish (v 2.1.4)⁴⁴. For each dataset, k-mers were counted and aggregated (jellyfish count), and histograms were generated with the jellyfish histo command. The resulting histograms were then used to estimate genome length and heterozygosity using GenomeScope v 2.0⁴⁵. For the genome survey analysis of P. delicatissima, P. pungens, and P. multiseries, 27.85 Gb, 44.24 Gb, and 27.98 Gb of Illumina DNA sequencing data were used, respectively (Supplementary Table 11). The estimated genome sizes were 35 Mb, 78 Mb, and 267 Mb for P. delicatissima, P. pungens, and P. multiseries, respectively (Supplementary Fig. 1; Supplementary Table 2); the estimated heterozygosity levels were 1.37%, 0.82% and 0.87%, respectively, and the estimated repetitive contents were 11.00%, 42.10% and 71.00%, respectively (Supplementary Table 2). Thus, the genome sizes of the three Pseudo-nitzschia species varied substantially, ranging from 35 Mb to 267 Mb.
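As a rough illustration of how a k-mer histogram relates to genome size, the sketch below applies the simple peak-based estimate (total k-mer count divided by the depth of the main coverage peak). This is not the GenomeScope model used in this study, which additionally fits heterozygosity and repeat content. The function names and the `min_depth` error cutoff are assumptions made for the example, and the input is assumed to be the two-column depth/count text written by jellyfish histo.

```python
from typing import List, Tuple


def parse_histo(text: str) -> List[Tuple[int, int]]:
    """Parse two-column 'depth count' lines into (depth, count) pairs."""
    rows = []
    for line in text.strip().splitlines():
        depth, count = line.split()
        rows.append((int(depth), int(count)))
    return rows


def estimate_genome_size(histo: List[Tuple[int, int]],
                         min_depth: int = 5) -> Tuple[float, int]:
    """Return (estimated genome size in bp, depth of the main peak).

    min_depth is a hypothetical cutoff: k-mers seen fewer times are
    treated as sequencing errors and ignored.
    """
    solid = [(d, c) for d, c in histo if d >= min_depth]
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak is the depth with the most distinct k-mers.
    peak_depth = max(solid, key=lambda dc: dc[1])[0]
    # Total k-mer occurrences = sum over depths of depth * distinct count.
    total_kmers = sum(d * c for d, c in solid)
    return total_kmers / peak_depth, peak_depth
```

With real data, the histogram text would be read from the file produced by jellyfish histo. For highly heterozygous or repeat-rich genomes this single-peak estimate can be misleading, which is one reason a model-based tool such as GenomeScope is preferable.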

For whole genome assemblies, long reads generated on the PacBio Sequel platform were assembled using Mecat2⁴⁶ to generate initial versions of the genome assemblies, which were subsequently polished using Pilon⁴⁷. The contigs were assembled into chromosomes through Hi-C analysis using Juicer⁴⁸ and 3D-DNA⁴⁹ with default parameters. These genome assemblies were further visualized and error-corrected using Juicebox⁵⁰. The completeness of the assembled genomes was evaluated using BUSCO v3 (eukaryota_odb9)⁵¹.

Genome annotation

Repeat sequences in the genome of Pseudo-nitzschia species were annotated using a comprehensive analysis combining homology-based and de novo prediction methods. The homology-based approach utilized the RepBase v21.12 library⁵², employing RepeatMasker v4.0.7 (http://www.repeatmasker.org/). This strategy enabled the identification of sequences sharing similarity with known repetitive elements. For de novo prediction, the RepeatScout tool⁵³, Piler⁵⁴, and LTR_FINDER v1.07⁵⁵ were employed to construct a de novo repeat sequence library. Subsequently, RepeatMasker was applied to predict de novo repeat elements within the genome using the constructed library. To identify tandem repeat sequences, Tandem Repeats Finder v4.09²⁶ was used.

The genome sequences were used for homology-based, de novo, and transcriptome-based gene predictions. First, the homologous proteins from seven species, including Arabidopsis thaliana, Fragilariopsis cylindrus, Phaeodactylum tricornutum, Seminavis robusta, Thalassiosira pseudonana, P. multistriata, and Skeletonema marinoi, were used to identify proteins in the repeat-masked Pseudo-nitzschia species genome reference sequence with MAKER software (v.2.31.8)²⁷. From the homology predictions, a subset of 2000 well-supported genes was selected as a training set. The de novo gene prediction programs Augustus⁵⁶ and SNAP⁵⁷ were then trained using this set to improve accuracy in predicting gene structures. To enhance annotation accuracy, RNA-seq and Iso-Seq transcriptomic data were incorporated. RNA-seq data were aligned to the genome using HISAT2 (v2.1.0)⁵⁸, and transcripts were assembled using StringTie (v1.3.4d)⁵⁹. The PASA pipeline (https://github.com/PASApipeline/PASApipeline) was subsequently employed for correction and refinement of the transcript information, resulting in a high-confidence set of transcripts. The homology-based, de novo, and transcriptome-based predictions were integrated using MAKER for a second round of gene structure prediction and consolidation.

Protein-Protein BLAST v2.2.31 was then used to assess putative protein functions in each Pseudo-nitzschia species by comparing the protein sequences given by MAKER to the protein sequences from the annotated genomes. The predicted protein coding genes were functionally annotated based on several publicly available databases including Swissprot⁶⁰, TrEMBL (http://www.uniprot.org/), InterPro⁶¹, GO⁶², KEGG⁶³, and NR (http://www.ncbi.nlm.nih.gov/protein/) databases.

The tRNA sequences in the genome were identified using the tRNAscan-SE 1.3.1 software⁶⁴. To annotate rRNA sequences, we performed a BLASTN search against rRNA sequences of a set of closely related species. The alignment results were used to identify and annotate rRNA sequences within the genome. The miRNA and snRNA sequences were predicted using Rfam (v1.0.4)⁶⁵.

Collinearity analysis

To investigate the differences in chromosome numbers among the three species P. delicatissima, P. pungens, and P. multiseries, MCScanX in JCVI (python -m jcvi.compara.catalog ortholog)⁶⁶ was used for pairwise comparisons between the species. To reveal the collinearity relationships among P. delicatissima, P. pungens, and P. multiseries, genome-wide synteny analysis was performed using the MCScanX pipeline within the JCVI utility libraries (python -m jcvi.graphics.karyotype)⁶⁶.

Gene family and evolutionary analysis

Gene families were identified among the three Pseudo-nitzschia species and nine other species (A. anophagefferens, T. oceanica, T. pseudonana, S. marinoi, S. robusta, P. tricornutum, F. cylindrus, P. multistriata, and P. multiseries) using OrthoFinder2⁶⁷. Each of the gene sets from the 12 species was filtered so that, for genes with multiple alternatively spliced transcripts, only the longest transcript was retained. The similarity of protein sequences was assessed by all-versus-all BLASTP with an E-value threshold of 1e-6.
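The longest-transcript filter described above can be sketched as follows. This is a minimal illustration, not the script used in this study: the record layout (gene ID, transcript ID, protein sequence) and the function name are assumptions, and in practice the gene-to-transcript mapping would come from the genome annotation files.

```python
from typing import Dict, Iterable, Tuple


def keep_longest_isoform(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Keep one (transcript_id, sequence) per gene: the longest sequence.

    records: iterable of (gene_id, transcript_id, sequence) tuples
    (a hypothetical layout chosen for this example). When two isoforms
    have equal length, the one encountered first is kept.
    """
    longest: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, seq in records:
        best = longest.get(gene_id)
        if best is None or len(seq) > len(best[1]):
            longest[gene_id] = (transcript_id, seq)
    return longest
```

Keeping a single representative per gene avoids counting splice variants as separate members of a gene family during orthogroup inference.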

A total of 381 single-copy orthologous genes were identified. Protein alignments for each orthogroup were generated with MAFFT⁶⁸. The alignments were processed to remove sites with over 50% gaps and sequences shorter than 50% of the alignment length. To infer the species tree, we used both concatenation and multispecies coalescent approaches. The concatenated dataset included all 381 loci and was analyzed using IQ-Tree⁶⁹ with ModelFinder model selection. To assess branch support, we carried out ultrafast bootstrap and SH-aLRT analyses (both with 1000 replicates).
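The two alignment filters can be illustrated with the short sketch below. Several details are assumptions made for the example, since the text does not specify them: columns are filtered before sequences, a sequence's length is taken as its number of non-gap residues, and that length is compared with the length of the trimmed alignment. The function name and parameters are hypothetical.

```python
from typing import Dict


def filter_alignment(aln: Dict[str, str],
                     max_gap_frac: float = 0.5,
                     min_len_frac: float = 0.5,
                     gap: str = "-") -> Dict[str, str]:
    """Drop gappy columns, then drop sequences that are too short.

    aln maps sequence name -> aligned sequence (all of equal length).
    Columns with a gap fraction strictly above max_gap_frac are removed.
    Sequences whose non-gap residue count is below min_len_frac of the
    trimmed alignment length are then removed (an assumed definition).
    """
    if not aln:
        return {}
    names = list(aln)
    length = len(aln[names[0]])
    if any(len(aln[n]) != length for n in names):
        raise ValueError("aligned sequences must have equal length")

    keep_cols = [
        i for i in range(length)
        if sum(aln[n][i] == gap for n in names) / len(names) <= max_gap_frac
    ]
    trimmed = {n: "".join(aln[n][i] for i in keep_cols) for n in names}

    min_residues = min_len_frac * len(keep_cols)
    return {
        n: s for n, s in trimmed.items()
        if len(s.replace(gap, "")) >= min_residues
    }
```

Applying the column filter first means that the sequence-length criterion is evaluated on the sites that will actually enter the phylogenetic analysis; the reverse order is equally plausible and would give slightly different results.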

To estimate the divergence times of the species, the mcmctree program from PAML (v.4.9)⁷⁰ was used. Calibration points for the divergence analysis were obtained from the TimeTree database (http://www.timetree.org/). The resulting phylogenetic tree was presented with 95% highest posterior density (HPD) intervals.

Gene family contraction and expansion analysis was performed using the CAFE5⁷¹ software based on gene family clustering data. The stochastic birth-and-death model implemented in CAFE was used to estimate the λ value.

Pfam domains for the 12 species were identified using InterProScan⁷² and visualized using the R packages pheatmap and ggplot2. Phylogenetic trees of genes, including PF05548 domains, were reconstructed using IQ-TREE⁶⁹ with 1000 bootstrap replicates.

Annotation of DA biosynthesis gene clusters

For the identification of dab genes in the chromosome-level genome assemblies of P. delicatissima, P. pungens and P. multiseries, protein sequences of DabA, DabB, DabC, and DabD from P. multiseries⁸ were used as queries to search for the candidate genes, dabA, dabB, dabC, and dabD.

For other Pseudo-nitzschia species, the draft genomes were assembled using SPAdes v3.14.0⁴¹ and Platanus-allee v2.2.2⁴², with default parameters. BLAST was used to probe for dab genes with the queries mentioned above. Gene annotation was performed using Genewise^73,74 and ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/), with the amino acid sequences of dab genes from P. multiseries serving as references. To ensure that no dab genes were missed due to incomplete assemblies, we also used the gene sequences or transcript sequences of dabA, dabB, dabC, and dabD as queries, using the BWA v0.7.17⁷⁵ MEM algorithm to search for potential genomic sequences containing dab genes in the assembled genomes.
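As an illustration of how BLAST results from such a search might be screened, the sketch below picks the best-scoring hit per query from standard 12-column tabular output (BLAST+ `-outfmt 6`). The BLAST program, output format, and E-value cutoff used in this study are not stated, so the `max_evalue` value, the function name, and the contig names in the usage example are all hypothetical.

```python
from typing import Dict, Tuple


def best_hits(tabular: str,
              max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float, float]]:
    """Return {query: (subject, evalue, bitscore)} for the top hit per query.

    Expects the standard 12 tab-separated columns of BLAST+ -outfmt 6:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore. Hits with evalue above max_evalue (an illustrative
    cutoff) are ignored; the hit with the highest bit score is kept.
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in tabular.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

A query that is absent from the returned dictionary (for example, no key for DabB) would indicate that no hit passed the cutoff in that assembly, which is the situation the complementary BWA-based search is meant to double-check.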

Phylogenetic analysis of DA biosynthesis genes

In the phylogenetic analysis of dab genes, in addition to the published dab gene sequences from different Pseudo-nitzschia species^8,9,10, we incorporated radA, radC, kabA, and kabC from red algae^28,29. Protein sequences of the assembled transcripts were obtained using the TransDecoder program (https://github.com/TransDecoder/TransDecoder). Furthermore, sequences were also acquired by querying the NCBI NR database using dab genes as references. Chloroplast genomes (cpDNAs) and ITS (ITS1-5.8S-ITS2) sequences of Pseudo-nitzschia were obtained from the NCBI database and our previous study⁴⁰. Sequence alignments were performed using MUSCLE⁷⁶, followed by gap removal with trimAl⁷⁷. Maximum Likelihood (ML) trees were constructed with IQ-Tree⁶⁹, incorporating ModelFinder for optimal model selection. Branch support was evaluated through ultrafast bootstrap, SH-aLRT (both with 1000 replicates), and an approximate Bayes test. Protein structures were predicted using ColabFold v1.5.2⁷⁸ and visualized with PyMOL⁷⁹. Collinearity analysis was performed using Mauve⁸⁰ and Dotter⁸¹.

Toxin analysis

The cultivation parameters for Pseudo-nitzschia strains followed the conditions described above. Algal cells were harvested during the late-exponential growth phase, approximately 21 days post-inoculation. For DA quantification, cell counts were performed on 5 mL subsamples preserved in Lugol’s iodine solution and kept at 4 °C until microscopic counting. The collection, preparation, and analytical conditions of the samples followed the previous study⁸². DA detection was conducted at the Institute of Oceanology, Chinese Academy of Sciences, using ultra-performance liquid chromatography-electrospray ionization-quadrupole-time of flight-mass spectrometry (UPLC-ESI-Q-TOF-MS) (Bruker, Germany). The DA standard was sourced from the National Marine Environmental Monitoring Center (Standard Material Number: GBW(E)100782). A total of 13 Pseudo-nitzschia strains were tested for DA production (Supplementary Table 8), including P. multiseries (CNS00159), P. multistriata (CNS00142, CNS00781, CNS00965, CNS01237, CNS01424), P. delicatissima (CNS00130), P. americana (CNS00138), P. micropora (CNS00133), P. pungens (CNS00141), P. galaxiae (CNS01103), P. brasiliana (CNS01029), and Pseudo-nitzschia sp. (CNS01031).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The raw sequencing data have been submitted to NCBI under BioProject number PRJNA1054348. The assembled genomes have been deposited in NCBI Assembly under the accession numbers GCA_037355735.1, GCA_037355745.1 and GCA_037355755.1. The genome annotation information has been uploaded to Figshare: 10.6084/m9.figshare.27254001.

References

Bates, S. et al. Pennate diatom Nitzschia pungens as the primary source of domoic acid, a toxin in shellfish from eastern Prince Edward Island, Canada. Can. J. Fish. Aquat. Sci. 46, 1203–1215 (1989).
Bates, S. S., Hubbard, K. A., Lundholm, N., Montresor, M. & Leaw, C. P. Pseudo-nitzschia, Nitzschia, and domoic acid: New research since 2011. Harmful Algae 79, 3–43 (2018).
McCabe, R. M. et al. An unprecedented coastwide toxic algal bloom linked to anomalous ocean conditions. Geophys. Res. Lett. 43, 10,366–10,376 (2016).
Cook, B. I., Ault, T. R. & Smerdon, J. E. Unprecedented 21st century drought risk in the American Southwest and Central Plains. Sci. Adv. 1, e1400082 (2015).
Bresnan, E. et al. Diversity and regional distribution of harmful algal events along the Atlantic margin of Europe. Harmful Algae 102, 101976 (2021).
Clark, S. et al. Pseudo-nitzschia bloom dynamics in the Gulf of Maine: 2012–2016. Harmful Algae 88, 101656 (2019).
Husson, B., Hernández-Fariñas, T., Le Gendre, R., Schapira, M. & Chapelle, A. Two decades of Pseudo-nitzschia spp. blooms and king scallop (Pecten maximus) contamination by domoic acid along the French Atlantic and English Channel coasts: Seasonal dynamics, spatial heterogeneity and interannual variability. Harmful Algae 51, 26–39 (2016).
Brunson, J. K. et al. Biosynthesis of the neurotoxin domoic acid in a bloom-forming diatom. Science 361, 1356–135 (2018).
Hardardóttir, S. et al. Transcriptomic responses to grazing reveal the metabolic pathway leading to the biosynthesis of domoic acid and highlight different defense strategies in diatoms. Bmc Mol. Biol. 20, https://doi.org/10.1186/s12867-019-0124-0 (2019).
Lema, K. A. et al. Inter- and Intra-Specific Transcriptional and Phenotypic Responses of Pseudo-nitzschia under Different Nutrient Conditions. Genome Biol. Evol. 11, 731–747 (2019).
Xu, D. et al. Plastic responses lead to increased neurotoxin production in the diatom Pseudo-nitzschia under ocean warming and acidification. ISME J. 17, 525–536 (2023).
Lelong, A., Hégaret, H., Soudant, P. & Bates, S. S. Pseudo-nitzschia (Bacillariophyceae) species, domoic acid and amnesic shellfish poisoning: revisiting previous paradigms. Phycologia 51, 168–216 (2012).
Trainer, V. L. et al. Pseudo-nitzschia physiological ecology, phylogeny, toxicity, monitoring and impacts on ecosystem health. Harmful Algae 14, 271–300 (2012).
Silver, M. W. et al. Toxic diatoms and domoic acid in natural and iron enriched waters of the oceanic Pacific. Proc. Natl Acad. Sci. 107, 20762–20767 (2010).
Zhu, Z. et al. Understanding the blob bloom: Warming increases toxicity and abundance of the harmful bloom diatom Pseudo-nitzschia in California coastal waters. Harmful Algae 67, 36–43 (2017).
Kinoshita, T., Fukuzawa, H., Shimada, T., Saito, T. & Matsuda, Y. Primary structure and expression of a gamete lytic enzyme in Chlamydomonas reinhardtii: similarity of functional domains to matrix metalloproteases. Proc. Natl Acad. Sci. USA 89, 4693–4697 (1992).
Teng, S. T. et al. Toxic bloom of Pseudo-nitzschia cuspidata (Bacillariophyceae) and domoic acid contamination of bivalve molluscs in Malaysia Borneo. Toxicon 202, 132–141 (2021).
Bates, S. S., Lundholm, N., Hubbard, K. A., Montresor, M. & Leaw, C. P. Toxic and harmful marine diatoms. in Diatoms: fundamentals and applications (eds. Seckbach, J. & Gordon, R.) 389–434 https://doi.org/10.1002/9781119370741.ch17 (Wiley, 2019).
Niu, B.-B. et al. Morphology, molecular phylogeny and biogeography revealed two new Pseudo-nitzschia (Bacillariophyceae) species in Chinese waters. J. Syst. Evol. https://doi.org/10.1111/jse.13016 (2023).
Chekan, O. R., McKinnie, S. M. K., Noel, J. P. & Moore, B. S. Algal neurotoxin biosynthesis repurposes the terpene cyclase structural fold into an N-prenyltransferase. Proc. Natl Acad. Sci. USA 117, 12799–12805 (2020).
Butland, G. et al. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 433, 531–537 (2005).
Hiyama, T. B. et al. Structural basis of CoA recognition by the Pyrococcus single-domain CoA-binding proteins. J. Struct. Funct. Genomics 7, 119–129 (2006).
Allen, J. I., Smyth, T. J., Siddorn, J. R. & Holt, M. How well can we forecast high biomass algal bloom events in a eutrophic coastal sea? Harmful Algae 8, 70–76 (2008).
Anderson, D. M. Approaches to monitoring, control and management of harmful algal blooms (HABs). Ocean Coast. Manag. 52, 342–347 (2009).
Van Dolah, F. M. Marine algal toxins: Origins, health effects, and their increased occurrence. Environ. Health Perspect. 108, 133–141 (2000).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27, 573–580 (1999).
Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinforma. 12, 491 (2011).
Chekan, J. R. et al. Scalable Biosynthesis of the Seaweed Neurochemical, Kainic Acid. Angew. Chem. Int Ed. Engl. 58, 8454–8457 (2019).
Steele, T. S. et al. Domoic acid biosynthesis in the red alga Chondria armata suggests a complex evolutionary history for toxin production. Proc. Natl Acad. Sci. USA 119, https://doi.org/10.1073/pnas.2117407119 (2022).
Li, Y. et al. Pseudo-nitzschia simulans sp. nov. (Bacillariophyceae), the first domoic acid producer from Chinese waters. Harmful Algae 67, 119–130 (2017).
Lim, H. C. et al. Phylogeny and species delineation in the marine diatom Pseudo-nitzschia (Bacillariophyta) using cox1, LSU, and ITS2 rRNA genes: A perspective in character evolution. J. Phycol. 54, 234–248 (2018).
Lelong, A., Hégaret, H., Soudant, P. & Bates, S. S. Pseudo-nitzschia (Bacillariophyceae) species, domoic acid and amnesic shellfish poisoning: revisiting previous paradigms. Phycologia 51, 168–216 (2012).
Sun, J. et al. Effects of changing pCO2 and phosphate availability on domoic acid production and physiology of the marine harmful bloom diatom Pseudo-nitzschia multiseries. Limnol. Oceanogr. 56, 829–840 (2011).
Kelly, K. J. et al. Simulated upwelling and marine heatwave events promote similar growth rates but differential domoic acid toxicity in Pseudo-nitzschia australis. Harmful Algae 127, 102467 (2023).
Hardardóttir, S. et al. Dangerous Relations in the Arctic Marine Food Web: Interactions between Toxin Producing Pseudo-nitzschia Diatoms and Calanus Copepodites. Mar. Drugs 13, 3809–3835 (2015).
Sobrinho, B. F. et al. Growth, Toxin Production and Allelopathic Effects of Pseudo-nitzschia multiseries under Iron-Enriched Conditions. Mar. Drugs 15, https://doi.org/10.3390/md15100331 (2017).
Chen, Y., Wang, Y., Liu, K., Liu, F. & Chen, N. Development of a high-resolution molecular marker for tracking Pseudo-nitzschia pungens genetic diversity through comparative analysis of mitochondrial genomes. J. Appl. Phycol. 33, 2283–2298 (2021).
Chen, Y., Cui, Z., Liu, F. & Chen, N. Mitochondrial genome and phylogenomic analysis of Pseudo-nitzschia micropora (Bacillariophyceae, Bacillariophyta). Mitochondrial DNA Part B 6, 2035–2037 (2021).
He, Z. et al. Complete mitochondrial genome of the harmful algal bloom species Pseudo-nitzschia delicatissima (Bacillariophyceae, Bacillariophyta). Mitochondrial DNA Part B 6, 2541–2543 (2021).
He, Z. et al. Comparative Analysis of Pseudo-nitzschia Chloroplast Genomes Revealed Extensive Inverted Region Variation and Pseudo-nitzschia Speciation. Front. Marine Sci. 9, https://doi.org/10.3389/fmars.2022.784579 (2022).
Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).
Kajitani, R. et al. Platanus-allee is a de novo haplotype assembler enabling a comprehensive access to divergent heterozygous regions. Nat. Commun. 10, 1702 (2019).
Liu, S., Xu, Q. & Chen, N. Expansion of photoreception-related gene families may drive ecological adaptation of the dominant diatom species Skeletonema marinoi. Sci. Total Environ. 897, 165384 (2023).
Marcais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 11, 1432 (2020).
Xiao, C. L. et al. MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads. Nat. Methods 14, 1072–1074 (2017).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9, e112963 (2014).
Durand, N. C. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Durand, N. C. et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3, 99–101 (2016).
Manni, M., Berkeley, M. R., Seppey, M. & Zdobnov, E. M. BUSCO: Assessing Genomic Data Quality and Beyond. Curr. Protoc. 1, e323 (2021).
Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA 6, 11 (2015).
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21, i351–i358 (2005).
Edgar, R. C. & Myers, E. W. PILER: identification and classification of genomic repeats. Bioinformatics 21, i152–i158 (2005).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Johnson, A. D. et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939 (2008).
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
Boeckmann, B. et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 31, 365–370 (2003).
Apweiler, R. et al. The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Res. 29, 37–40 (2001).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet 25, 25–29 (2000).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28, 27–30 (2000).
Chan, P. P., Lin, B. Y., Mak, A. J. & Lowe, T. M. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 49, 9077–9096 (2021).
Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121–D124 (2005).
Tang, H. et al. Synteny and collinearity in plant genomes. Science 320, 486–488 (2008).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Nguyen, L. T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Mendes, F. K., Vanderpool, D., Fulton, B. & Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2021).
Quevillon, E. et al. InterProScan: protein domains identifier. Nucleic Acids Res. 33, W116–W120 (2005).
Birney, E., Clamp, M. & Durbin, R. GeneWise and genomewise. Genome Res. 14, 988–995 (2004).
Madeira, F. et al. Search and sequence analysis tools services from EMBL-EBI in 2022. Nucleic Acids Res. 50, W276–W279 (2022).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
Capella-Gutierrez, S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Mirdita, M. et al. ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682 (2022).
DeLano, W. L. Pymol: An open-source molecular graphics tool. CCP4 Newsl. Protein Crystallogr 40, 82–92 (2002).
Darling, A. E., Mau, B. & Perna, N. T. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5, e11147 (2010).
Sonnhammer, E. L. & Durbin, R. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene 167, GC1–GC10 (1995).
Dong, H. C. et al. Occurrence of Pseudo-nitzschia species and associated domoic acid production along the Guangdong coast, South China Sea. Harmful Algae 98, 101899 (2020).
Basu, S. et al. Finding a partner in the ocean: molecular and evolutionary bases of the response to sexual cues in a planktonic diatom. N. Phytologist 215, 140–156 (2017).

Acknowledgements

This research was supported by the Strategic Priority Research Program of Chinese Academy of Sciences (XDB42000000), the Chinese Academy of Sciences Pioneer Hundred Talents Program (to Nansheng Chen), the Taishan Scholar Project Special Fund (to Nansheng Chen), the Qingdao Innovation and Creation Plan (Talent Development Program - 5th Annual Pioneer and Innovator Leadership Award to Nansheng Chen, 19-3-2-16-zhc), and the Natural Sciences and Engineering Research Council of Canada (NSERC). We are grateful to colleagues from the Jiaozhou Bay Marine Ecosystem Research Station for the opportunity to participate in the investigation expeditions. Data analysis was supported by Oceanographic Data Center, IOCAS.

Author information

These authors contributed equally: Ziyan He, Qing Xu.

Authors and Affiliations

CAS Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China
Ziyan He, Qing Xu, Yang Chen, Shuya Liu, Huiyin Song, Hui Wang & Nansheng Chen
Laboratory of Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266200, China
Ziyan He, Qing Xu, Yang Chen, Shuya Liu, Huiyin Song, Hui Wang & Nansheng Chen
College of Marine Science, University of Chinese Academy of Sciences, 10039, Beijing, China
Ziyan He & Yang Chen
Center for Ocean Mega-Science, Chinese Academy of Sciences, Qingdao, 266071, China
Ziyan He, Qing Xu, Yang Chen, Shuya Liu, Huiyin Song, Hui Wang & Nansheng Chen
Hubei Key Laboratory of Tumor Microenvironment and Immunotherapy, College of Basic Medical Sciences, China Three Gorges University, Yichang, 443002, China
Qing Xu
Bachok Marine Research Station, Institute of Ocean and Earth Sciences, University of Malaya, 16310, Bachok, Kelantan, Malaysia
Chui Pin Leaw

Contributions

N.C. designed the project and wrote the manuscript with contributions from all coauthors; Z.H. and Q.X. carried out analyses and prepared figures and tables; Y.C., S.L., H.S., and H.W. identified, maintained, and characterized Pseudo-nitzschia strains used in this study; C.P.L. wrote the manuscript.

Corresponding author

Correspondence to Nansheng Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: Wendy Mok, Luke Grinham and Johannes Stortz.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figs. and tables

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

He, Z., Xu, Q., Chen, Y. et al. Acquisition and evolution of the neurotoxin domoic acid biosynthesis gene cluster in Pseudo-nitzschia species. Commun Biol 7, 1378 (2024). https://doi.org/10.1038/s42003-024-07068-7

Received: 06 February 2024
Accepted: 14 October 2024
Published: 23 October 2024
Version of record: 23 October 2024
DOI: https://doi.org/10.1038/s42003-024-07068-7
