Background

Nematodes, representing one of the most abundant and functionally diverse animal groups on Earth, serve as critical drivers of ecosystem processes1,2. Among them, plant-parasitic nematodes (PPNs), encompassing over 4100 identified species, infest nearly all major crops across diverse agroecosystems, causing extensive damage and yield loss, and posing a severe threat to global food security3,4,5. PPNs typically penetrate plant cell walls using a specialized stylet, injecting saliva and secreting effectors that disrupt plant immunity and manipulate plant physiological metabolism to facilitate nutrient acquisition6,7,8. Furthermore, their interactions with other organisms or phytopathogens often lead to disease complexes, complicating disease management efforts9,10. Although fungal and bacterial biocontrol agents have demonstrated potential as sustainable tools for managing nematode diseases, the application of virus-based strategies against PPNs remains underexplored compared to their established application in controlling other plant pathogens and pests, including fungi, bacteria, and insects6,11,12.

The ecological interplay between nematodes and their associated viruses remains an underexplored frontier. Recent advances have gradually uncovered nematode-associated viruses, beginning with the identification of indigenous viruses in Caenorhabditis elegans (e.g., Orsay virus) and Caenorhabditis briggsae (e.g., Santeuil virus, Le Blanc virus, and Melnik virus), which established these organisms as valuable systems for studying virus-host interactions13,14,15,16. These discoveries support the hypothesis that nematode-infecting viruses may be more prevalent than previously recognized. Large-scale invertebrate virus investigations have further identified RNA virus infections in diverse nematodes, including snake-associated, mosquito-infecting, and mouse liver tissue-parasitizing species17,18.

The limited characterization of PPN-associated viral diversity represents one of the major constraints on the development of virus-based strategies against PPNs. To the best of our knowledge, only 12 RNA viruses, classified into six orders/families (Phenuiviridae, Rhabdoviridae, Flaviviridae, Nyamiviridae, Picornavirales, and unclassified bunyaviruses), have been identified to date in four PPNs (soybean cyst nematode, sugar beet cyst nematode, potato cyst nematode, and root lesion nematode) (Supplementary Table 1)19,20,21,22,23,24. Additionally, some PPNs may serve as vectors for plant viruses, although such viruses typically do not replicate within nematodes25,26. Compared with the viral diversity of soil-dwelling nematodes or animal-parasitic nematodes27,28, viruses of PPNs remain largely unexplored.

The discovery of viruses in nematodes could improve both our understanding and the management of nematode-related diseases. In this study, we collected potato rot nematodes (PRNs) from infested sweet potatoes and conducted metatranscriptomic sequencing to investigate PRN-associated viruses. Furthermore, we analyzed transcriptome data from 25 PPN species across 536 public SRA runs to systematically uncover PPN-associated viral sequences. Our results revealed 94 new PPN or PPN-associated viruses within eighteen established families and six unclassified viral groups, which greatly expand the known viral diversity of PPNs and provide key insights into their evolution.

Results

Overview of PPN virome

To explore viral diversity in PPNs, we analyzed 536 publicly available RNA-seq SRA datasets (up to March 30, 2022), representing 25 PPN species. Additionally, we investigated the virome of ten field populations of potato rot nematode (PRN) collected from Lulong County, Qinhuangdao City, Hebei, China (Supplementary Fig. 1). Total RNA was extracted from the isolated PRNs and sequenced using an rRNA-depletion approach. By integrating viral sequences derived from both field-collected PRNs and public SRA datasets, we identified 94 RNA viruses that likely infect or are associated with PPNs (Table 1). Furthermore, 85 RNA viruses were excluded as potential contaminants based on the following criteria (Supplementary Table 2): (1) sequence similarity to known viruses from non-nematode hosts, such as fungi or oomycetes, and (2) co-occurrence of sequencing reads from these putative hosts within the same SRA run. These viruses, probably not hosted by PPNs, are generally designated as “PPN-associated virus” in this study, as shown in Supplementary Table 2. A virus was classified as “PPN-infecting” if it satisfied one of the following conditions: (1) sequence similarity to viruses of unknown host origin and a low ratio of non-nematode reads; or (2) phylogenetic clustering with previously reported or newly identified nematode viruses. The remaining viruses with uncertain hosts are also designated as “PPN-associated virus” in this study to distinguish them from the putative “PPN viruses” (Table 1). Although experimental validation is required for definitive confirmation, such inferences are critical for viromic studies. Taxonomic summaries of the predominant organisms in some key SRA runs associated with these viruses are provided in Supplementary Note 1.
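The host-assignment criteria described above can be read as a small decision procedure. The Python sketch below is one possible reading and not the pipeline used in this study: the field names, the 1% cutoff standing in for "a low ratio of non-nematode reads", and the order in which the rules are checked are all illustrative assumptions, since the text gives no numeric threshold or rule precedence.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the text only says "a low ratio of non-nematode reads".
MAX_NON_NEMATODE_FRACTION = 0.01


@dataclass
class VirusEvidence:
    """Evidence collected for one viral contig (all field names are illustrative)."""
    similar_to_non_nematode_virus: bool   # best hits are viruses of fungi, oomycetes, etc.
    putative_host_reads_present: bool     # reads of that putative host occur in the same run
    similar_to_unknown_host_virus: bool   # best hits are viruses without a known host
    non_nematode_read_fraction: float     # fraction of reads not assigned to the nematode
    clusters_with_nematode_viruses: bool  # groups with known or new nematode viruses in the tree


def classify(ev: VirusEvidence) -> str:
    # Contaminant rule: both exclusion criteria are met (assumed to be checked first).
    if ev.similar_to_non_nematode_virus and ev.putative_host_reads_present:
        return "likely contaminant"
    # Infection rule: either condition is sufficient.
    low_background = ev.non_nematode_read_fraction <= MAX_NON_NEMATODE_FRACTION
    if (ev.similar_to_unknown_host_virus and low_background) or ev.clusters_with_nematode_viruses:
        return "PPN-infecting"
    # Everything else keeps the cautious label.
    return "PPN-associated (host uncertain)"


example = VirusEvidence(False, False, True, 0.005, False)
print(classify(example))
```

Keeping the cutoff as a named constant makes it easy to see how sensitive the assignments are to that single choice.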

Table 1 Summary of identified viruses that probably infect or are associated with PPNs

Among the 94 PPN-associated viruses, 34 are negative-sense single-stranded ((‒)ss) RNA viruses, classified within Bunyaviricetes (n = 14), Rhabdoviridae (n = 5), Nyamiviridae (n = 5), Goujianvirales (n = 6), Qinviridae (n = 2), and Orthomyxoviridae (n = 2). A further 57 are (+)ssRNA viruses phylogenetically related to members of Endornaviridae (n = 3), Nodaviridae (n = 2), unclassified Martellivirales (n = 3), Tymoviridae (n = 1), Tombusviridae (n = 2), Flaviviridae (n = 8), Picornavirales (n = 23), Astroviridae (n = 3), and Bormycovirales (n = 12). Additionally, three dsRNA viruses were identified, belonging to Birnaviridae (n = 1) and Amalgaviridae (n = 2) (Table 1 and Fig. 1A). Although two species, Bursaphelenchus xylophilus (126 SRA runs) and Globodera pallida (92 SRA runs), accounted for 41% of the total dataset (218/536), relatively low RNA virus diversity was detected in them (Fig. 1B and Supplementary Table 3). This may reflect redundancy in the downloaded SRA runs, stemming from the widespread use of a few strains in experimental studies. In contrast, Ditylenchus destructor and Heterodera glycines displayed high viral diversity (Fig. 1B). Most viruses detected in D. destructor (25/26) originated from the PRN field populations. Over 83% of the identified nematode-associated viral genome segments (131/157) exhibited low amino acid (aa) identity (<50%) with previously reported viruses (Table 1, Supplementary Fig. 2). These findings imply that field PPN populations harbor a largely unexplored viral diversity. Furthermore, our analyses revealed that potato cyst nematode picorna-like virus 1 (PCNPV1) and potato cyst nematode rhabdovirus 1 (PCNRHV1) are among the most abundant viruses across all analyzed SRA runs (Fig. 1C), indicating that the PCN strains commonly used in research may be infected with these viruses.
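As a quick arithmetic check, the per-taxon counts quoted above can be tallied to confirm that they add up to the reported totals. The short Python sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Per-taxon virus counts as quoted in the text, grouped by genome type.
negative_sense = {
    "Bunyaviricetes": 14, "Rhabdoviridae": 5, "Nyamiviridae": 5,
    "Goujianvirales": 6, "Qinviridae": 2, "Orthomyxoviridae": 2,
}
positive_sense = {
    "Endornaviridae": 3, "Nodaviridae": 2, "unclassified Martellivirales": 3,
    "Tymoviridae": 1, "Tombusviridae": 2, "Flaviviridae": 8,
    "Picornavirales": 23, "Astroviridae": 3, "Bormycovirales": 12,
}
double_stranded = {"Birnaviridae": 1, "Amalgaviridae": 2}

totals = {
    "(-)ssRNA": sum(negative_sense.values()),
    "(+)ssRNA": sum(positive_sense.values()),
    "dsRNA": sum(double_stranded.values()),
}
grand_total = sum(totals.values())

# Share of SRA runs from the two most heavily sampled species.
top_two_share = (126 + 92) / 536
# Share of genome segments with <50% aa identity to known viruses.
low_identity_share = 131 / 157

print(totals, grand_total)
print(f"{top_two_share:.1%} of runs, {low_identity_share:.1%} of segments")
```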

Fig. 1: Overview of PPN-associated RNA viruses.

A Sankey diagram illustrating genome types, viral taxonomy, and hosts of PPN-associated RNA viruses. B Back-to-back bar plot comparing the total number of PPN-associated viruses (left) and the corresponding number of SRA runs (right) across nematode species. C Total number of SRA runs for each identified PPN-associated virus. Only viruses with > 2 SRA runs are shown.

Novel negative-sense single-stranded RNA viruses

(i) Mononegavirales. (a) Rhabdoviridae

We identified five rhabdoviruses: three newly identified ones, potato rot nematode rhabdovirus 1 (PRNRHV1), soybean cyst nematode rhabdovirus 1 (SCNRHV1), and soybean cyst nematode rhabdovirus 2 (SCNRHV2), and two previously reported ones, soybean cyst nematode associated northern cereal mosaic virus rhabdovirus (SCNNCMV, HM849039.2) and potato cyst nematode rhabdovirus 1 (PCNRHV1, OP903920.1). The complete genome of PRNRHV1 is 13,233 nt in length and exhibits terminal complementarity at the 3ʹ and 5ʹ ends. It contains five non-overlapping open reading frames (ORF I–V), encoding the five canonical rhabdovirus proteins: nucleoprotein (N), phosphoprotein (P), membrane protein (M), glycoprotein (G), and polymerase (L) (Fig. 2). The intergenic regions of PRNRHV1 feature conserved transcription start (CAAAGACAACAA) and stop (UUAGAAAAAA) signals, consistent with those of canonical rhabdoviruses29. For SCNRHV1 and SCNRHV2, only partial genomes were assembled, with lengths of approximately 11,472 nt and 7,640 nt, respectively, excluding predicted gap regions (Fig. 2). The L proteins of PRNRHV1, SCNRHV1, and SCNRHV2 share the highest similarity to that of PCNRHV1, exhibiting 41%, 45%, and 56% aa identity, respectively (Table 1). The three newly identified nematode rhabdoviruses cluster with the previously reported SCNNCMV and PCNRHV1, forming a clade within the family Rhabdoviridae with a UFBoot support value greater than 90% (Fig. 2). The existence of this clade suggests significant host specificity among nematode-associated rhabdoviruses.
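Terminal complementarity of the kind reported for PRNRHV1 can be checked by comparing the 5ʹ end of a genome with the reverse complement of its 3ʹ end. The Python sketch below is a minimal illustration under stated assumptions: the function name, the window size, and the toy sequence are hypothetical, and the toy sequence is not the PRNRHV1 terminus.

```python
# Watson-Crick complement table for RNA.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def terminal_complementarity(genome: str, k: int = 20) -> float:
    """Fraction of the first k nt that can pair with the last k nt of the genome."""
    rna = genome.upper().replace("T", "U")  # accept cDNA-style input
    five_prime = rna[:k]
    # Reverse complement of the 3' end; identical to the 5' end if the termini pair perfectly.
    three_prime_rc = rna[-k:].translate(RNA_COMPLEMENT)[::-1]
    matches = sum(a == b for a, b in zip(five_prime, three_prime_rc))
    return matches / k


# Toy genome (hypothetical): the last 10 nt are the reverse complement of the first 10 nt.
toy_genome = "ACGAAGACAA" + "GGGAAACCC" + "UUGUCUUCGU"
print(terminal_complementarity(toy_genome, k=10))
```

This simple version counts only strict Watson-Crick pairs; G-U wobble pairs and gapped pairing would need an alignment-based approach.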

Fig. 2: Nyami-like viruses and rhabdoviruses in PPNs.

The phylogenetic tree based on the large (L) proteins of nyami-like viruses and rhabdoviruses is shown in the left panel. The genome organization of PPN-associated nyami-like viruses and rhabdoviruses is shown in the right panel. The complementarity of the 5′ and 3′ termini and the conservation of the intergenic regions of PRNRHV1 are shown. Newly identified viruses in this study are marked with a solid red star. The hollow red star represents SBCNNV1, which was previously misnamed as soybean cyst nematode nyami-like virus. Ns, gaps in the assembled viral genome. Animal nematode-associated viruses are followed by a nematode illustration. Q.pfam+F + I + R7 was selected as the best-fit model.

(b) Nyamiviridae

We identified five viruses belonging to Nyamiviridae from publicly available SRA data of PPNs: two newly identified ones, soybean cyst nematode nyami-like virus 2 (SCNNV2) and stem and bulb nematode nyami-like virus 1 (SBNNV1), and three previously reported ones, soybean cyst nematode midway virus (SCNMidV1, HM849038)20 and soybean cyst nematode nyami-like viruses (SCNNV, MG550266 and MG550268)23. Notably, we found that the RdRPs encoded by the two versions of SCNNV in GenBank (MG550266 and MG550268) share about 74% aa identity, suggesting that they are distinct virus species. Given that we assembled a longer version of SCNNV (MG550268, 1815 nt) from a sugar beet cyst nematode library (SRR16675965), we renamed this partial sequence as sugar beet cyst nematode nyami-like virus 1 (SBCNNV1) in this study. Phylogenetic analysis of the RdRP domain in the L protein suggests that SBCNNV1 forms a lineage near SCNMidV1 and is distantly related to another clade containing SCNNV and SCNNV2. This finding suggests that the current Socyvirus genus could be expanded to include these nematode-specific nyami-like viruses (Fig. 2).

Another partial nyami-like virus, SBNNV1, is located at the stem of PPN nyami-like viruses with 100% UFBoot support (Fig. 2). The percentage of reads assigned to SBNNV1 (0.0009%) is low in the SRR8239758 run of Ditylenchus dipsaci, where 91% of reads are classified as D. dipsaci, and 5% remain unclassified (Supplementary Note 1). Although SBNNV1, together with Drosophila Inveresk nyamivirus, appears to cluster closely with the clade of nematode-specific nyamiviruses, it remains unclear whether the host of SBNNV1 is Drosophila sp., another insect, or a nematode.

(ii) Bunyaviricetes

We identified 14 bunya-like viruses, including 11 newly discovered viruses: eight potato rot nematode bunyaviruses (PRNBYV1–8), two soybean cyst nematode bunya-like viruses (SCNBLV2–3), and cereal cyst nematode bunyavirus 1 (CCNBYV1). The other three are previously reported bunyaviruses: soybean cyst nematode associated Uukuniemi virus (SCNUUKV), soybean cyst nematode associated rice stripe virus (SCNRSV), and soybean cyst nematode bunya-like virus (SCNBLV) (Table 1). Based on phylogenetic analysis of the RdRP domain, seven bunya-like viruses, including the previously reported SCNUUKV and SCNRSV, cluster within Phenuiviridae. PRNBYV4–6 cluster into a single subclade, and PRNBYV7 forms another subclade with SCNBLV3 and sugar beet cyst nematode virus 2. The remaining bunyaviruses could be grouped into two unknown clades. The first unknown clade includes PRNBYV3, SCNBLV2, and CCNBYV1. The second consists of PRNBYV1 (12,325 nt) and PRNBYV2 (12,018 nt), both of which possess an L segment longer than those of SCNBLV (9,478 nt) and most other known bunyaviruses (Fig. 3). Apart from the L_protein_N domain (i.e., an endonuclease domain) and the Bunya_RdRp domain, which are typically found in the L protein of most bunyaviruses, we identified a putative cysteine proteinase domain near the C-terminus of the L proteins of PRNBYV1 and PRNBYV2. The conservation of the His-Asp-Cys catalytic triad, along with the structural similarity between the PRNBYV1 and PRNBYV2 domains and sentrin-specific protease 8 (SENP8) (Fig. 4A, B), suggests that these domains are SENP8-like cysteine proteinases; SENP8 processes ubiquitin-like proteins. Notably, the SENP-like proteases of PRNBYV1 and PRNBYV2 are phylogenetically close to the SENP8 of nematodes (Fig. 4C), implying that these viral SENP-like proteases may have been acquired from their hosts. Furthermore, the read-pair coverage across the genomes of PRNBYV1 and PRNBYV2 confirms the continuity of the assembly and rules out the possibility that the SENP-like protease domain resulted from a virus-host chimeric artifact (Fig. 3 and Supplementary Fig. 3). Based on these unique features, PRNBYV1 and PRNBYV2 may warrant classification in a distinct taxon (e.g., a new family).

Fig. 3: Bunya-like viruses in PPNs.

Phylogenetic tree based on the RdRP of bunya-like viruses is shown in the left panel. The genome organization of bunya-like viruses is shown in the right panel. The newly identified viruses in this study were labeled with a red star. Ns, gaps in the assembled viral genome. The putative cysteine protease domain of PRNBYV1 and PRNBYV2 was colored in light-green. Animal nematode-associated viruses are followed by a nematode illustration. Q.pfam+F + I + R9 was selected as the best-fit model.

Fig. 4: SENP-like proteases of PRNBYV1 and PRNBYV2.

A Alignment of SENP-like proteases from PRNBYV1 and PRNBYV2 with the nematode and human homologs of SENP8. The putative catalytic triad (His-Asp-Cys) was marked with red circles. B Structural comparison of the SENP-like protease from PRNBYV2 and the human Den1/SENP8. The structure of SENP-like protease from PRNBYV2 was predicted with ColabFold and colored according to the pLDDT value. The human Den1/SENP8, shown in light-green, was identified as the most similar structure of SENP-like proteases from PRNBYV2 in the PDB database by Foldseek search. The TM-score, RMSD, and E-value were calculated with the Foldseek webserver. All the protein structure graphics were visualized with ChimeraX. C Phylogenetic tree of the SENP-like proteases from PRNBYV1 and PRNBYV2. The tree was constructed using IQ-TREE2 with “LG + I + G4” as the best-fit model and visualized using ggtree.

Among PRNBYV1–8, the relative abundances of PRNBYV3–5 were higher than those of the other bunya-like viruses in the library of PRN field nematode populations (SRR28892574) (Supplementary Table 4). Combined with the phylogenetic analysis, this suggests that PRNBYV3–5 and CCNBYV1 are likely hosted by PPNs. PRNBYV1–2 were detected only in SRR28892574, where the species composition of the total reads is as follows: D. destructor (80%), unclassified reads (12%), Pristionchus spp. (8%), Caenorhabditis spp. (5%), Viruses (2%), Ascomycota (1%), and Bacteria (1%) (Supplementary Note 1). Given their low abundance (0.005%) and the sequence homology of their SENP-like proteases to those of Pristionchus spp., it remains possible that these viruses are associated with contaminating organisms (such as Pristionchus spp.) present in the library.

(iii) Articulavirales. (a) Orthomyxoviridae

We identified two new orthomyxo-like viruses, potato cyst nematode orthomyxo-like virus 1 (PCNOV1) and Bursaphelenchus mucronatus orthomyxo-like virus 1 (BMOV1) (Table 1 and Fig. 5). Both PCNOV1 and BMOV1 are composed of five genomic segments, encoding polymerase basic protein 1 (PB1), polymerase basic protein 2 (PB2), polymerase acidic protein (PA), putative glycoprotein (GP), and nucleoprotein (NP) (Fig. 5). PCNOV1 is abundant in seven libraries from two different lineages of G. pallida and four different lineages of G. rostochiensis; BMOV1 is abundant in eight libraries from three different lineages of pinewood nematodes (Bursaphelenchus mucronatus) (Supplementary Fig. 4A). Correlation analysis of segment abundance further indicated that these segments are associated with PCNOV1 and BMOV1 (Supplementary Fig. 4B). The PB1 proteins of PCNOV1 and BMOV1 show the highest similarity (approximately 80% coverage and 40% identity) to the PB1 protein of Wenling orthomyxo-like virus 2 (AVM87619.1), an orthomyxo-like virus infecting red spikefish, Triacanthodes anomalus30. The PA proteins of PCNOV1 and BMOV1 share low similarity with their counterparts from Thailand tick thogotovirus (49% coverage and 27% identity) and Water boatmen thogotovirus 1 (34% coverage and 23% identity), respectively. The PB2, GP, and NP proteins of PCNOV1 and BMOV1 show no detectable sequence similarity to known orthomyxoviral proteins in BLASTP searches. However, HHpred searches with the aligned PB2 and NP sequences identified Influenza C virus proteins as significant matches, with probabilities exceeding 95% and E-values below 3.7e-05 (Supplementary Fig. 5). In contrast, the aligned GP sequences of PCNOV1 and BMOV1 show greater similarity to the glycoproteins of mononegavirals (e.g., Borna disease virus), with probabilities above 99% and E-values lower than 1.5e-23, than to the hemagglutinins of known articulavirals (Supplementary Fig. 5). Phylogenetic analysis of PB1 shows that PCNOV1 and BMOV1 form a well-supported clade (94% UFBoot support) with some insect/fish-associated orthomyxo-like viruses, which represents a sister clade of thogotoviruses in Orthomyxoviridae (Fig. 5).
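The segment-abundance correlation mentioned above rests on a simple idea: segments of the same segmented virus should rise and fall together across libraries, whereas unrelated contigs should not. The Python sketch below illustrates this with a plain Pearson correlation; the choice of Pearson (rather than a rank-based statistic), the abundance values, and the contig names are assumptions made for illustration and are not data from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical depth-normalized abundances of three contigs across five libraries.
abundance = {
    "PB1": [10.0, 52.0, 0.0, 31.0, 8.0],
    "PB2": [12.0, 60.0, 0.0, 35.0, 9.0],
    "unrelated_contig": [40.0, 3.0, 25.0, 5.0, 30.0],
}

for name in ("PB2", "unrelated_contig"):
    r = pearson(abundance["PB1"], abundance[name])
    print(f"PB1 vs {name}: r = {r:.3f}")
```

A high coefficient between two contigs supports, but does not prove, that they belong to the same virus; co-infections in the same libraries can produce similar patterns.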

Fig. 5: Orthomyxo-like viruses in PPNs.

A Phylogenetic tree based on the main replicase PB1 of Articulavirales. Q.pfam+F + I + R5 was selected as the best-fit model. B Genome organization of BMOV1 and PCNOV1. Read coverage profiles, generated using Bowtie 2 (version 2.5.4), were visualized alongside the viral genome. Coverage value ranges are indicated in square brackets. Note that coverage across the 5′ region of BMOV1-PB1 is low and is therefore displayed on a logarithmic scale.

Among the SRA runs where PCNOV1 was detected, PCNOV1 is most abundant in three SRA runs, SRR16693885, SRR7167829, and ERR1173512, where the contaminant reads are below 1%. For BMOV1, the contaminant reads in SRR7062725 and SRR7062722 are below 2% (Supplementary Note 1). These analyses suggest Globodera spp. and B. mucronatus as the probable hosts for PCNOV1 and BMOV1, respectively.

(iv) Goujianvirales. (a) Yueviridae and (b) Qinviridae

We identified six yue-like viruses: potato rot nematode yue-like virus 1 (PRNYV1), potato rot nematode yue-like virus 2 (PRNYV2), soybean cyst nematode yue-like virus 1a (SCNYV1a), soybean cyst nematode yue-like virus 1b (SCNYV1b), soybean cyst nematode yue-like virus 2 (SCNYV2), and soybean cyst nematode yue-like virus 3 (SCNYV3) (Table 1 and Fig. 6A). Except for PRNYV2, which contains three segments, the yue-like viruses comprise two genome segments. The large segment (RNA1) of all PPN-associated yue-like viruses encodes a protein with an RdRP domain. The small segment (RNA2) encodes a hypothetical protein. The RNA2 and RNA3 of PRNYV2 share partial nucleotide similarity, especially at the 5′ and 3′ terminal regions (Fig. 6A), and their encoded proteins share 43% aa identity. Furthermore, while the RNA2 of PRNYV1 shows conservation with known yueviruses in Yueviridae, the RNA2 segments of PRNYV2, SCNYV1a, SCNYV1b, SCNYV2, and SCNYV3 are longer and exhibit no significant sequence similarity or homology to the former. In the RNA1-based phylogenetic tree, PRNYV1 clusters within the established Yueviridae family, while the remaining five yue-like viruses (PRNYV2, SCNYV1a, SCNYV1b, SCNYV2, and SCNYV3) form a distinct monophyletic clade. Within this clade, PRNYV2 groups with Thrips tabaci-associated yue-like virus 1 (Ttayue1)31, whereas the four SCNYVs (SCNYV1a, SCNYV1b, SCNYV2, and SCNYV3) comprise a separate, well-supported subclade (Fig. 6A).

Fig. 6: Qin- and yue-like viruses.

A Phylogenetic tree based on the RdRP of qin-like and yue-like viruses is shown at the left panel. The genome organization of qin-like and yue-like viruses is shown at the right panel. In the midpoint-rooted phylogenetic tree, all nematode-associated viruses identified in this study are highlighted in red, and previously reported nematode-associated viruses are followed by a nematode illustration. For the four yue-like viruses with introns, the exons are represented by gray boxes. Q.pfam+F + R5 was selected as the best-fit model. B Sashimi plot, incorporating read coverage profiles, displays reads spanning the predicted intron in four intron-bearing yue-like viruses. The number of reads supporting each splice junction is indicated on the corresponding arc. C The conserved nucleotide motif of predicted splice sites identified in four intron-bearing yue-like viruses.

Unexpectedly, the coding regions of RNA1 of the four SCNYVs, as well as RNA2 of SCNYV2 and SCNYV3, are interrupted by 1–10 introns containing canonical RNA splice site motifs (GU/AG) (Fig. 6B–C and Supplementary Fig. 6). Furthermore, all SRA runs in which the four SCNYVs were sufficiently covered exhibited clear evidence supporting the presence of these introns (Supplementary Fig. 7). However, no introns were detected in the genomes of Ttayue1 and PRNYV2, despite their close phylogenetic relationship with these intron-bearing yue-like viruses (Supplementary Fig. 6A and Supplementary Fig. 8). Interestingly, although the RNA1 of SCNYV1a and SCNYV1b share 97% aa identity (100% coverage) and 88% nucleotide identity (98% coverage), their intron distributions differ, with only two introns in RNA1 of SCNYV1b but ten in SCNYV1a. To investigate whether the four SCNYVs originated from DNA sequences, such as endogenous viral elements or retro-like viral elements, we performed a BLASTN search using default parameters against the Nucleotide collection (nr/nt), High Throughput Genomic Sequences (HTGS), and Whole Genome Shotgun contigs (WGS) databases, with the latter two restricted to sequences of Heterodera spp. No significant hits were obtained. This result suggests that SCNYV1a, SCNYV1b, SCNYV2, and SCNYV3 are likely a unique group of RNA viruses harboring multiple introns within their genomes, a feature rarely observed in RNA viruses.
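The canonical GU/AG rule referred to above can be made concrete with a small check on predicted intron coordinates, followed by removal of the introns to recover the spliced coding sequence. The Python sketch below is illustrative only: the function names, the 0-based half-open coordinates, and the toy sequence are assumptions, and the sequence is taken to be written in the mRNA (coding) sense.

```python
def to_rna(seq: str) -> str:
    """Normalize a sequence to upper-case RNA letters (accepts cDNA-style input)."""
    return seq.upper().replace("T", "U")


def is_canonical_intron(seq: str, start: int, end: int) -> bool:
    """True if seq[start:end] begins with GU and ends with AG (0-based, half-open)."""
    intron = to_rna(seq)[start:end]
    return len(intron) >= 4 and intron.startswith("GU") and intron.endswith("AG")


def splice(seq: str, introns) -> str:
    """Remove the given non-overlapping introns and join the remaining exons."""
    rna = to_rna(seq)
    exons, position = [], 0
    for start, end in sorted(introns):
        exons.append(rna[position:start])
        position = end
    exons.append(rna[position:])
    return "".join(exons)


# Toy transcript (hypothetical): exon, one GU...AG intron, exon.
toy = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
predicted_introns = [(6, 18)]

print(all(is_canonical_intron(toy, s, e) for s, e in predicted_introns))
print(splice(toy, predicted_introns))
```

In practice, each predicted junction would also need support from reads spanning it, as shown in the sashimi plots.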

We identified two bi-segmented qin-like viruses: potato rot nematode qin-like virus 1 (PRNQV1) in the PRN virome, and potato cyst nematode qin-like virus 1 (PCNQV1) in four transcriptome libraries of G. rostochiensis derived from three different biosamples in NCBI. Based on phylogenetic analysis of RdRP encoded by RNA1, the two PPN qin-like viruses cluster with Xinzhou nematode virus 3 discovered from snake-associated nematodes, and they form a nematode-infecting virus clade in Qinviridae (Fig. 6).

Based on the species composition of the SRA runs associated with the highly abundant viruses—including PCNQV1 (ERR1173512), SCNYV1a (SRR6232813 and SRR6232824), SCNYV1b (SRR9647096), SCNYV2 (SRR6230580), and SCNYV3 (SRR6230583)—the hosts of all identified yue-like and qin-like viruses are likely the respective plant-parasitic nematodes (PPNs).

Positive-sense single-stranded RNA viruses

(i) Martellivirales. (a) Endornaviridae

We identified three endornaviruses from RNA-seq data of field-collected PRN (Table 1). They share 90–92% nucleotide identity and 95–96% protein identity with each other, and were temporarily named potato rot nematode endornavirus 1a (PRNEV1a), PRNEV1b, and PRNEV1c, representing different variants of a new endornavirus (PRNEV1). PRNEV1 consists of a large ORF encoding a putative polyprotein with the conserved domains of helicase and RdRP (Supplementary Fig. 9). The RdRP phylogenetic tree placed PRNEV1 in the Alphaendornavirus genus of Endornaviridae (Supplementary Fig. 9). Whether the host of PRNEV1 is indeed PRN remains uncertain, as approximately 1% of the reads in SRR28892574 are attributed to ascomycete fungi which are generally the hosts of endornaviruses (Supplementary Note 1).

(b) Mycoalphaviridae

We identified two mycoalphaviruses: soybean cyst nematode-associated alpha-like virus 1 (SCNAlphaV1) and cereal cyst nematode-associated alpha-like virus 1 (CCNAlphaV1), belonging to Mycoalphaviridae, which comprises fungal alpha-like viruses (Supplementary Table 2). Phylogenetic analysis shows that SCNAlphaV1 clusters with an alpha-like virus isolated from the fungus Leptosphaeria biglobosa (incorrectly annotated as Leptosphaeria biglobosa flavi-like virus 1 in GenBank) (Supplementary Fig. 9). CCNAlphaV1 belongs to the same virus species as Solanum melongena bastro-like virus and Plasmopara viticola lesion-associated alpha-like virus 1. In the SRA run (SRR5588565) where CCNAlphaV1 was identified, nearly 0.8% of the total reads were classified as Oomycota, and 7% were unclassified. This suggests that CCNAlphaV1 is hosted by contaminating oomycetes or other organisms. Notably, SCNAlphaV1 was identified at relative abundances of 0.01% and 0.04% of the total reads in two SRA runs, SRR9647096 (egg stage) and SRR9647098 (second-stage larva), respectively. Fungal contamination is minimal, comprising only 0.07% of the total reads from these two SRA runs. Furthermore, two previously reported soil-associated nematode viruses, Maryland hepe-like virus 10 and Maryland martelli-like virus 11 (ref. 27), are positioned in Mycoalphaviridae (Supplementary Fig. 9). These results imply a close relationship between nematodes and some fungal mycoalphaviruses.

(c) Unclassified Martellivirales

We identified three new viruses belonging to unclassified taxa within Martellivirales, comprising sugar beet cyst nematode-associated virga-like virus 1 (SBCNVV1), soybean cyst nematode-associated virga-like virus 1 (SCNVV1), and Bursaphelenchus mucronatus-associated virga-like virus (BMVV1) (Supplementary Fig. 9 and Table 1). The genome of BMVV1 is 9,651 nt in length and is predicted to encode a single large ORF exceeding 3,000 aa, featuring an Mtr-S1H-RdRP domain arrangement but lacking any ORF encoding capsid proteins. Regarding SBCNVV1 and SCNVV1, although the assembled genome of SBCNVV1 contains two gap regions, the RdRP domains of these two viruses share approximately 87% aa identity. Additionally, a methyltransferase domain (FtsJ-Mtr), which is usually found in the class Flasuviricetes, was identified upstream of the S1H domain in SCNVV1. The second ORF of SBCNVV1 and SCNVV1 encodes a putative capsid protein, sharing low similarity (82% coverage and 24% identity) with a hypothetical protein of animal-parasitic nematodes (Trichinella spp.). SBCNVV1 and SCNVV1 share a close evolutionary relationship, but their clustering with other known virga-like viruses is poorly supported. In contrast, BMVV1 forms a well-supported clade (UFBoot support value > 98%) with the plant-associated tobamo-like virus 1 (PaToLV1) and two animal nematode viruses: Brugia malayi RNA virus 1 (BmRV1) and Haemonchus contortus RNA virus 2. However, BMVV1 shares low sequence similarity (27% coverage and 37% aa identity) with PaToLV1 (Supplementary Fig. 9 and Table 1). The genome of BMVV1 (9.65 kb), which is longer than those of PaToLV1 (6.55 kb) and BmRV1 (7.83 kb), differs markedly from these viruses by encoding only a large polyprotein28,32. These differences suggest that BMVV1 is distinct from PaToLV1 at least at the genus level.

In the SRA runs where SCNVV1 was detected, the reads of SCNVV1 constitute 0.1% and 0.04% of the total reads in SRR9647096 and SRR9647098, respectively, with contaminant reads representing less than 1% of the total reads in both runs (Supplementary Note 1). Based on these analyses, we hypothesized that the host of SCNVV1 is likely SCN. Given the close phylogenetic relationships between SBCNVV1 and SCNVV1, SBCNVV1 may be hosted by SBCN. Among the SRA runs where BMVV1 was detected, the reads of BMVV1 are most abundant in SRR7062723 and account for 0.002% of the total reads. In this library, 80% of the reads are classified as B. mucronatus, 13% as Bacteria, and 4% remain unclassified. Additionally, 0.8% of the reads are associated with plants belonging to the Magnoliopsida, including species of Trifolium, Medicago, and Citrus. Furthermore, BMVV1 was only detected in SRA runs derived from egg and L2 stages, but not in L3 & L4 or adult stages (Supplementary Note 1). Based on these results, we cannot exclude the possibility that BMVV1 may be hosted by plants, similar to cycas necrotic stunt virus, which was also identified in the SRR7062723 run.

(ii) Tymovirales

We identified eleven tymovirals from PPN-associated SRA datasets, including eight new viruses: potato cyst nematode-associated tymovirus 1 (PCNTV1), soybean cyst nematode-associated gammaflexivirus (SCNGFV1), three soybean cyst nematode-associated deltaflexiviruses (SCNDFV1–3), and three pinewood nematode-associated deltaflexiviruses (PWNDFV1–3) (Table 1 and Supplementary Table 2). The remaining three have been reported previously: two characterized plant viruses, potato virus S (PotVS) and Citrus yellow vein clearing virus (CYVCV), and Agrostis stolonifera deltaflexivirus 1 (AsDFV1), which has been reported in fungi. PotVS was found in two SRA runs (ERR202422 and SRR3162514), where no plant-derived reads were detected, implying that potato cyst nematodes may serve as vectors for PotVS transmission. Phylogenetic analysis reveals that PCNTV1 clusters with plant viruses within Tymoviridae, whereas SCNGFV1 clusters with mycoviruses in Gammaflexiviridae, and SCNDFV1–3 and PWNDFV1–3 cluster with mycoviruses in Deltaflexiviridae (Supplementary Fig. 10). In the ERR202430 run, where PCNTV1 is more abundant than in other SRA runs, no plant-derived reads were identified, except perhaps among the 3% unclassified reads (Supplementary Note 1). This suggests that PCNTV1 may infect or be transmitted by PCNs. As for SCNGFV1, SCNDFV1–3, and PWNDFV1–3, whether SCNs or pinewood nematodes serve as hosts for these viruses remains undetermined.

(iii) Amarillovirales. (a) Flaviviridae

Eight flaviviruses were discovered in PPN-associated public RNA-seq datasets. Among these, one was soybean cyst nematode virus 5 (a previously reported large genome flavivirus), while the remaining seven were new Jingmen-like viruses, including sugar beet cyst nematode Jingmen virus 1 (SBCNJMV1), two potato cyst nematode Jingmen viruses (PCNJMV1–2), and four cereal cyst nematode Jingmen viruses (CCNJMV1–4) (Table 1 and Fig. 7). Jingmen viruses are characterized by their segmented genomes, which were thought to have evolved from an ancestral, unsegmented flavivirus33. For the seven newly identified Jingmen viruses, two RNA segments are well-conserved among all known Jingmen viruses: one encoding the non-structural protein 5 (NS5) with a methyltransferase and RdRP domain, and another encoding NS3 proteins with a serine protease and DEAD-like helicase domain. The third segment encodes a putative glycoprotein that likely functions as a viral envelope protein, characterized by a cysteine-rich domain and multiple transmembrane domains (Supplementary Fig. 11). The fourth segment encodes a protein of approximately 400 aa, which includes a conserved motif and features either a signal peptide or a transmembrane domain at its N-terminus, suggesting that the protein may function as a capsid protein (Fig. 7 and Supplementary Fig. 12). However, the presence of multiple Jingmen virus infections (CCNJMV1–4) within SRR5942326 run complicates definitive assignment of genome segments to the corresponding viruses. Based on transcript abundance and phylogenetic analysis, CCNJMV2 is the most abundant, followed by CCNJMV3, which clusters closely with CCNJMV4 (Supplementary Fig. 13). Furthermore, seven distinct putative capsid segments were identified in SRR5942326, adding further complexity to the assignment of these segments to a specific CCNJMV. Ultimately, only three RNA segments could be confidently assigned to CCNJMV1–4.

Fig. 7: Nematode-associated viruses in Flaviviridae.

Genome diagrams of PPN-associated flaviviruses and a phylogenetic tree of members of Flaviviridae based on the RdRP domain in the NS5 protein. In the midpoint-rooted phylogenetic tree, all nematode-associated viruses identified in this study are highlighted in red, and previously reported nematode-associated viruses are followed by a nematode illustration. Q.pfam+F+R7 was selected as the best-fit model for phylogenetic analysis. SP, signal peptide; TM, transmembrane domain; Pro, protease; S2H, superfamily 2 helicase; E, putative envelope protein (glycoprotein); MTase, methyltransferase; UnCP, unassigned capsid segment; FS, putative ribosomal frameshift; LGF, large-genome flavivirus.

The NS5 and NS3 proteins of PPN-infecting Jingmen viruses exhibit 26–40% aa identity with previously identified Jingmen viruses. In contrast, glycoproteins and putative capsid proteins of these PPN-infecting Jingmen viruses show no detectable similarity to other known Jingmen viruses (Table 1). A phylogenetic tree of the RdRP domain in the NS5 segment reveals that the seven PPN-infecting Jingmen viruses form a single clade with three putative Jingmen viruses associated with animal-parasitic nematodes: Cooperia oncophora, Nippostrongylus brasiliensis, and Toxocara canis. This phylogenetic evidence supports the classification of this nematode-associated Jingmen virus clade as a novel third lineage closely related to the canonical tick-borne Jingmen tick viruses.

Based on the taxonomic classification of reads from the SRA runs of PCNJMV1 (ERR1173511 and ERR1173512), PCNJMV2 (ERR202480), and SBCNJMV1 (SRR16675965 and SRR16675966), the low proportion of unclassified reads and the absence of contaminants corresponding to known hosts of Jingmen viruses suggest that PPNs likely serve as the hosts of these PPN-specific Jingmen viruses (Supplementary Note 1). Although CCNJMV1–4 were only identified in the SRR5942326 run, which contains a large proportion (38%) of unclassified reads (Supplementary Note 1), phylogenetic analysis supports that these four viruses may be hosted by cereal cyst nematodes or other nematodes.

(iv) Tolivirales. (a) Tombusviridae

We identified two new tombus-like viruses, potato cyst nematode associated tombus-like virus 1 (PCNTomV1) and soybean cyst nematode associated tombus-like virus 1 (SCNTomV1). Phylogenetic analysis shows that PCNTomV1 and SCNTomV1 are placed in two different unclassified clades relative to Tombusviridae (Supplementary Fig. 14). SCNTomV1 accounts for approximately 0.03% of the reads in the SRR6269844 run, which include contaminants such as Bacteria (4%), Phytophthora spp. (0.9%), Streptophyta (0.4%), and Sporadotrichida (0.3%) (Supplementary Note 1). PCNTomV1 accounts for 0.001% of the reads in the SRR1873823 run, which contain contaminants including Mammalia (8%), unclassified reads (7%) and Acanthamoeba spp. (0.4%) (Supplementary Note 1). Based on the taxonomic classification of reads and phylogenetic analysis, whether SCNTomV1 and PCNTomV1 are hosted by PPNs remains unknown.

(v) Nodamuvirales. (a) Nodaviridae

We identified two nodaviruses with bi-segmented RNA genomes in nematode populations isolated from rotting sweet potato. One shares 90–92% identity and over 98% query coverage with RNA1 and RNA2 of Santeuil virus found in C. briggsae, and was considered a new variant of the Santeuil virus, hence named Santeuil virus isolate hongshu (Table 1 and Supplementary Fig. 15A). The second nodavirus, named Lulong nodavirus, has a bi-segmented RNA genome with segments of 3452 and 2476 nt in length, respectively. RNA1 was predicted to encode two ORFs: the large one contains a methyltransferase domain and an RdRP domain that shares 32.15% identity with a mantis virus (Sanya nodavirus 1, MZ209970.1), and the small one, which has no predicted domain, is located near the 3′ terminus of RNA1 and may be translated from a subgenomic RNA, as reported in Alpha- and Betanodavirus. RNA2 of Lulong nodavirus contains two ORFs: ORF1 encodes a capsid protein sharing homology with Capsid-VNN (viral nervous necrosis) of betanodaviruses, and ORF2 is predicted to be translated via ribosomal frameshifting to express a fusion protein that merges with the capsid of ORF1 (Table 1 and Supplementary Fig. 15A–B). This is similar to the strategy used by the capsid-delta protein encoded by Santeuil virus, but the hypothetical protein of ORF2 does not show homology to the delta protein. In the phylogenetic tree of the RdRP domain of nodaviruses, Lulong nodavirus is grouped with unclassified nodaviruses with low UFBoot support (53%) (Supplementary Fig. 15C). However, the Capsid-VNN domain of Lulong nodavirus seems to form a sister clade to nematode-associated viruses, but with low bootstrap support (Supplementary Fig. 15D). Considering the uncertainty in phylogeny and the low abundance of Lulong nodavirus, we speculate that its natural hosts may not be nematodes but rather other contaminating organisms in the SRR28892574 run, as mentioned previously.
Additionally, due to the presence of approximately 2% reads assigned to C. briggsae, it is difficult to ascertain whether Santeuil virus isolate hongshu is from C. briggsae or D. destructor. Unexpectedly, we found RNA2 of Santeuil virus may be present in a small amount in the circular form, which was confirmed by RT-PCR and read mapping (Supplementary Fig. 16). However, whether the circularity is derived from template-switching or mis-priming during RT-PCR remains to be further determined.

(vi) Picornavirales. (a) Marnaviridae

Marnavirids typically have mono- or dicistronic genome organizations, with non-structural proteins preceding the structural proteins34. Our study identified seven new viruses within the Marnaviridae family: rice root-knot nematode associated marnavirus 1 (RRKNMV1), cereal cyst nematode associated marnavirus 1 (CCNMV1), root-knot nematode marnavirus 1 (RKNMV1), two variants of root-knot nematode associated marnavirus 2 (RKNMV2), and potato cyst nematode marnavirus 1–2 (PCNMV1–2) (Table 1). Only a partial genome of PCNMV2, containing regions encoding the capsid protein, was assembled. RKNMV1 and CCNMV1 are monocistronic, while the other four nematode-associated marnaviruses are dicistronic, with the two ORFs either separated by an intergenic region or expressed through stop-codon readthrough or programmed frameshifting. Phylogenetically, all six nematode-associated marnaviruses are scattered on a branch containing Locarnavirus within the Marnaviridae family (Fig. 8A and Supplementary Fig. 17).

Fig. 8: Phylogenetic tree and genome organization of PPN-associated viruses belonging to Picornavirales.

A Phylogenetic tree based on the RdRP domain of PPN-associated viruses and their relatives in Picornavirales. Q.pfam+F+I+R9 was selected as the best-fit model according to the BIC score. The detailed tree is in Supplementary Fig. 17. B–D Genome organization of PPN-associated viruses. Hel, RNA helicase. 3Cpro, cysteine protease. RdRP, RNA-dependent RNA polymerase. CRPV, capsid domain (derived from cricket paralysis virus). Rhv, capsid domain (derived from rhinovirus). Ns, site with gaps. RT, stop-codon readthrough. E Enlarged view of the unclassified clade containing nematode-specific picorna-like viruses in Fig. 8A. Animal and soil nematode-associated viruses are followed by a nematode illustration. Newly identified viruses in this study are marked with a solid red star.

Most well-known hosts of marnaviruses are organisms in Stramenopiles. Because most marnavirids were discovered in diverse water or water-associated viromic studies, the hosts of these marnavirids remain uncertain. Among the SRA runs in which the six newly identified nematode-associated marnaviruses are abundant, most contain many unclassified reads. The hosts of these marnaviruses discovered in PPN-associated SRA runs remain uncertain (see Supplementary Note 1).

(b) Dicistroviridae

We identified nine dicistroviruses or dicistro-like viruses. Eight are new: root-knot nematode-associated dicistro-like virus 1–6 (RKNDicV1–6) and potato cyst nematode-associated dicistro-like virus 1–2 (PCNDicV1–2). The ninth, rice root-knot nematode-associated dicistrovirus 1 (RRKNDicV1), closely matches known viruses (Table 1 and Supplementary Table 2): it exhibits 100% coverage and over 96% identity to Dicistroviridae sp. and an aphid virus, Rhopalosiphum padi virus (Supplementary Table 2). Phylogenetic analysis shows that RKNDicV1–6 form a distinct cluster with a clade of unclassified dicistro-like viruses, while PCNDicV1 and PCNDicV2 are located in two separate clusters, each grouping with different unclassified dicistro-like viruses (Fig. 8A and Supplementary Fig. 17).

(c) Unclassified Picornavirales

We identified eight picorna-like viruses, including five new ones: potato cyst nematode picorna-like virus 2 (PCNPV2), three variants of cereal cyst nematode picorna-like virus 1 (CCNPV1a–c), and cereal cyst nematode picorna-like virus 2 (CCNPV2); as well as three previously reported ones: potato cyst nematode picorna-like virus (PCNPV)23, sugar beet cyst nematode virus 1 (SBCNV1)22, and root lesion nematode virus (RLNV)24 (Table 1). PCNPV2 shares 82% nucleotide identity and 92% aa identity with the previously reported PCNPV. Phylogenetic analysis based on the RdRP domain shows that PCNPV2, together with all previously identified PPN-infecting picorna-like viruses, forms a monophyletic clade with over 99% UFBoot support. The three variants of CCNPV1 share 88–90% nucleotide sequence identity with each other and 41% nucleotide identity with CCNPV2. Both CCNPV1 and CCNPV2 are positioned at the base of PPN-specific picorna-like virus clades (Fig. 8D). Although CCNPV1 and CCNPV2 were only identified in the SRR5942326 run, where 38% of the reads remain unclassified, their close evolutionary relationship to known PPN-infecting picorna-like viruses suggests that their hosts are likely CCNs.

(vii) Stellavirales. (a) Astroviridae

We identified three new astro-like viruses: soybean cyst nematode-associated astro-like virus 1 (SCNAstV1), potato cyst nematode astro-like virus 1 (PCNAstV1), and potato cyst nematode astro-like virus 2 (PCNAstV2). Based on phylogenetic analysis, SCNAstV1 clusters with a Ripishyf virus isolated from riverbank sediment, while PCNAstV1 and PCNAstV2 each form a distinct clade with astro-like viruses, albeit with low UFBoot support values (67% and 71%, respectively) (Supplementary Fig. 18). Whether PPNs are the hosts for SCNAstV1, PCNAstV1, and PCNAstV2 remains undetermined (Supplementary Note 1).

(viii) Bormycovirales

We identified twelve new ormycoviruses in five types of PPN datasets, including rice root-knot nematode, potato rot nematode, soybean cyst nematode, potato cyst nematode, and reniform nematode. These viruses were designated rice root-knot nematode-associated ormycovirus 1 (RRKNOrmV1), potato rot nematode-associated ormycovirus 1–4 (PRNOrmV1–4), soybean cyst nematode-associated ormycovirus 1–4 (SCNOrmV1–4), potato cyst nematode-associated ormycovirus 1 (PCNOrmV1), and reniform nematode-associated ormycovirus 1–2 (RNOrmV1–2) (Table 1). Phylogenetic analysis based on RNA1 reveals six distinct evolutionary relationships among these viruses. SCNOrmV4 clusters within the Deltaormycoviridae family, while SCNOrmV2, PCNOrmV1, and PRNOrmV2 are positioned within the Alphaormycoviridae family. The phylogenetic placement of RRKNOrmV1 and PRNOrmV4 remains uncertain due to low UFBoot support (<90%). RNOrmV1–2 and SCNOrmV1 form a clade, whereas PRNOrmV1 and SCNOrmV3 cluster with downy mildew lesion-associated ormycovirus 6 (Fig. 9).

Fig. 9: Phylogenetic tree and genome organization of PPN-associated viruses in Bormycovirales.

Phylogenetic tree based on the RdRP domain in RNA1 of viruses in Bormycovirales. Q.pfam+F+I+R5 was selected as the best-fit model according to the BIC score. RT, stop-codon readthrough.

Given that these PPN-associated ormycoviruses are distributed among diverse clades of known fungal or oomycete-associated ormycoviruses, it is possible that some of the identified ormycoviruses originate from contaminants in the library. To address this, we thoroughly scrutinized the taxonomic composition of the libraries, as detailed in the Supplementary Note 1. Our analysis suggests that PRNOrmV1, RNOrmV1, and SCNOrmV1 are highly likely to be associated with their corresponding nematode hosts.

Novel double-stranded RNA viruses

(i) Birnaviridae

Birnaviridae contains bi-segmented dsRNA viruses infecting both vertebrates and invertebrates and displaying a permuted motif (C-A-B motif) in the RdRP domain. We discovered a new birnavirus, tentatively named potato rot nematode associated birnavirus 1 (PRNBiV1), isolated from potato rot nematode populations collected in the field. Segment A of PRNBiV1 contains a large ORF encoding a polyprotein consisting of the capsid protein precursor VP2, endopeptidase VP4, and ribonucleoprotein VP3. Segment B encodes VP1 with the RdRP domain at the N terminus (Supplementary Fig. 19). A BLASTP search shows that VP1 of PRNBiV1 has the strongest match with Lates calcarifer birnavirus (YP_010086266.1), exhibiting 36.4% identity and 77% query coverage. In the VP1-based phylogenetic tree, PRNBiV1 is positioned at the base of all established genera within the family Birnaviridae (Supplementary Fig. 19). The host of PRNBiV1 is likely PRNs, as described in the Supplementary Note 1.

(ii) Durnavirales. (a) Amalgaviridae

We identified two new amalga-like viruses, potato rot nematode-associated amalga-like virus 1 (PRNAmaV1) and soybean cyst nematode-associated amalga-like virus 1 (SCNAmaV1). Phylogenetic analysis shows that PRNAmaV1 and SCNAmaV1 are closely related and cluster with Physcomitrium patens amalgavirus 1 (Supplementary Fig. 20). The hosts of PRNAmaV1 and SCNAmaV1 are likely the plant species co-contaminated in PPN-associated libraries.

Discussion

Through analysis of public SRA data from 25 PPN species and RNA-seq data from field PRN populations, we identified 94 PPN-associated RNA viruses—representing a seven-fold increase over the previously known 12 PPN-viruses. Our findings confirm that PPN-specific viruses form three distinct clades among Rhabdoviridae, Nyamiviridae, and Picornavirales, consistent with earlier research. Notably, we report the first discovery of viruses from the Jingmenvirus group, Orthomyxoviridae, Yueviridae, and Bormycovirales in PPNs, providing new insights into the evolution of these viral lineages. The substantial proportion of viruses identified in field-collected D. destructor populations suggests that PPN viral diversity in natural environments likely exceeds current observations.

While viruses in animal-parasitic nematodes and soil nematodes have been previously characterized17,18,27,28, their evolutionary relationships with PPN-associated viruses remained unclear. Our study reveals several evolutionary connections: (1) Three unclassified PPN-associated martellivirals, including BMVV1 which clusters with Brugia malayi RNA virus 1 (from human-parasitic nematodes) in Martellivirales; (2) PRNQV1 and PCNQV1 grouping with snake-associated Xinzhou nematode virus 3 in Qinviridae; and (3) Seven PPN Jingmen viruses forming a clade with those from animal-parasitic nematodes (C. oncophora, N. brasiliensis28, and T. canis33). These findings suggest two possible evolutionary scenarios: ancestral infection prior to nematode diversification, or horizontal virus transmission among animal-parasitic, plant-parasitic nematodes, and other organisms.

We identified two novel orthomyxo-like viruses (PCNOV1 and BMOV1) from Globodera spp. and B. mucronatus, respectively. These viruses possess at least five genomic segments encoding PB1, PB2, GP, PA, and NP, with potential additional segments undetected due to low sequence similarity. Based on viral abundance and SRA taxonomic composition, we propose these nematodes as likely true hosts, thereby expanding the known host range of Orthomyxoviridae35. Phylogenetically, these viruses form a distinct clade with Bombus-associated virus Orth1, Wenling orthomyxo-like virus 230, and Guangxi sediment orthomyxo-like virus36—suggesting these may also originate from overlooked nematodes in their respective samples.

RNA splicing, a crucial post-transcriptional process for mRNA maturation, is rarely observed in RNA viruses, with known examples limited to influenza viruses37, Culex tritaeniorhynchus rhabdovirus38, and Borna disease virus39. We discovered a novel clade of SCN yue-like viruses (SCNYV1a, SCNYV1b, SCNYV2, and SCNYV3) harboring 2–10 introns in their RNA1 and RNA2 segments. The differential intron distribution patterns between RNA1 of SCNYV1a and SCNYV1b suggest rapid evolutionary dynamics. The presence of the GU/AU motif at the splice sites suggests that these viruses may utilize the host’s RNA splicing machinery for viral protein production (e.g., RdRP).

A key limitation of this study lies in the uncertainty of host assignment for some novel viruses. Defining the hosts of RNA viruses identified from metatranscriptomic data or public SRA RNA-seq data with mixed organisms remains challenging40. Although host inference based on sequence similarity and phylogenetic analysis is feasible for characterized viruses, the scarcity of nematode virus-host reference data and the low similarity of most identified viruses complicate confident host determination41. As indicated by taxonomic analysis and previous literature10,42, contaminating organisms (e.g., plants, fungi, and bacteria) sometimes co-occur with PPNs, highlighting the necessity of systematic taxonomic/contamination analysis of the SRA runs. Additionally, some viruses may be transmitted by PPNs43,44. For instance, we detected PotVS in two Globodera spp., as well as Citrus yellow vein clearing virus and Duck astrovirus in Bursaphelenchus xylophilus, suggesting these PPNs might play a role in the transmission of these viruses. Moreover, to confirm the host specificity of the PPN-associated viruses identified in this study, further experimental validation, such as in situ hybridization and infectivity assays, will be necessary in future work.

More than one-quarter (25/94) of the newly identified PPN viruses or PPN-associated viruses in this study were detected in field-collected PRN populations. Most novel bunyaviruses identified in field samples form distinct clades from known viruses (SCNUUKV and SCNRSV), suggesting PPN viral diversity is substantially underestimated, as previous studies were largely confined to laboratory-maintained PPN populations. This finding is consistent with a soil nematode virome study, which reported that up to 93% (139/150) of the identified viruses are novel, although some may infect other soil-inhabiting organisms27. However, several limitations are present in both field-collected PRN samples and the public RNA-seq data associated with PPNs. Currently available PPN-related RNA-seq data in SRA databases are heavily biased and homogenized, with much of the data originating from a few species such as B. xylophilus and G. pallida. This imbalance undermines meaningful comparative analysis of viral diversity across different nematode species. Although our limited field sampling led to the discovery of numerous previously unknown PRN-associated RNA viruses, broader spatiotemporal sampling and more sequencing efforts will be necessary to fully elucidate the virome of plant-parasitic nematodes.

In summary, our study highlights gaps in our understanding of the viruses infecting one of the most abundant animal groups on Earth. Our findings provide a basis for exploring molecular interactions between viruses and PPNs. Expanded investigation of viral diversity will further illuminate evolutionary and ecological dynamics in PPN-virus systems.

Methods

Potato rot nematode sample collection

For virus mining, 11 populations of potato rot nematode were used, including 10 field populations collected from diseased sweet potatoes in four distinct locations in Lulong County, Qinhuangdao City, Hebei Province, China (Supplementary Fig. 1 and Supplementary Table 5), as well as one laboratory-cultured population. For each field population, nematodes were purified via sucrose centrifugation. First, diseased sweet potatoes were cut into pieces and placed into 90-mm-diameter glass dishes. Then, approximately 80 mL of ddH2O was added to cover the sweet potato pieces. After soaking for 3–4 hours, the pieces were removed, and 0.8 mL of 1% Triton-X 100 solution was added to each dish to prevent the nematodes from adhering to the glass walls. The solution was poured into a 100 mL centrifuge tube and centrifuged at 4000 × g for 5 minutes. The resulting nematode pellet was resuspended in ddH2O and transferred to a 1.5 mL centrifuge tube. Following another centrifugation at 4000 × g for 5 minutes, the supernatant was discarded, and the pellet was resuspended in 1 mL of 35% sucrose solution. This mixture was then centrifuged at 2500 × g for 3 minutes. Immediately after this last centrifugation, the supernatant containing the nematodes was transferred into 1000 mL of distilled water to mitigate osmotic pressure-induced mortality. Subsequently, the diluted nematode solution was filtered through a 3-μm membrane. The nematodes retained on the membrane were washed off using ddH2O. The purified nematodes were washed twice with ddH2O, flash-frozen in liquid nitrogen, and stored at ‒80 °C until RNA extraction was performed.

Library preparation and sequencing

Total RNA was extracted from frozen potato rot nematodes using the TransZol Up Plus RNA Kit (TransGen Biotech, China) following the manufacturer’s instructions. Total RNA was quantified using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA) and stored at ‒80 °C until further processing. To enrich for viral sequences, particularly those lacking poly(A) tails, ribosomal RNA was depleted from the RNA samples using the Globin-Zero Gold rRNA Removal Kit (Illumina, CA, USA). Sequencing libraries were prepared following the manufacturer’s protocols. Paired-end sequencing (150 bp) was conducted on an Illumina HiSeq 4000 platform (Illumina, CA, USA). The raw data has been deposited under the NCBI BioProject PRJNA1107688.

Identification of viral sequences

To further investigate the diversity of RNA viruses associated with PPNs, we conducted virus mining of the publicly available transcriptome data from the Sequence Read Archive (SRA) database. Initially, the species names of 25 PPNs were used as keywords to search the SRA database, and only RNA-seq data were selected for analysis. A total of 536 SRA runs (up to March 30, 2022) were downloaded using the prefetch tool and converted to FASTQ format using the fastq-dump tool. Low-quality reads were removed using the Trimmomatic program (version 0.36) with default parameter settings45. The clean reads were then assembled de novo using MEGAHIT (v1.2.9)46. Contigs generated from the same nematode species were combined and translated into protein sequences using TransDecoder.LongOrfs with the parameters “-m 100 -G Universal” in TransDecoder (v5.7.1). The resulting protein sequences were annotated using MMseqs2 (version 13.45111) against the clustered NR database. Viral contigs larger than 1 kb were filtered and manually verified. To ensure the absence of host contamination, the viral contigs were aligned against the NT database using megablast within the BLAST program. The library information and identified viruses of all SRA runs can be found in Supplementary Table 3.
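The length filter applied to candidate viral contigs can be illustrated with a minimal Python sketch. It is not the authors' script: the function names `read_fasta` and `filter_by_length` are hypothetical, and reading "larger than 1 kb" as strictly more than 1000 nt is an assumption.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Tuple[str, str]],
                     min_len: int = 1000) -> Dict[str, str]:
    """Keep contigs strictly longer than min_len nucleotides.

    Assumption: "larger than 1 kb" means length > 1000 nt.
    """
    return {header: seq for header, seq in records if len(seq) > min_len}
```

In practice the input would be an open assembly file, for example `filter_by_length(read_fasta(open(path)))`, where `path` is a placeholder for the assembled contig FASTA.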

Viral genome extension and annotation

To recover longer viral genomes, viral contigs sharing the same match in the previous MMseqs2 annotation step were searched against all assembled contigs in the same SRA runs using megablast within the BLAST program, and all the matches were further assembled using the cap3 program47. For cloning of full-length viral sequences, a PC3-T7 loop adapter was ligated to the RNA termini using T4 RNA ligase (TaKaRa) at 16 °C. The ligated products were reverse-transcribed into cDNA using M-MLV reverse transcriptase, and the terminal regions were amplified using Taq DNA polymerase (Yeasen Biotechnology Co., Ltd.) with the PC2 primer and virus-specific primers. The resulting PCR products were purified, cloned, and subjected to Sanger sequencing. All the primers used in this study are listed in Supplementary Table 6.

Open reading frames (ORFs) within the viral genomes were predicted using TransDecoder.LongOrfs in TransDecoder (v5.7.1), and further verified using SnapGene (version 6.0). Conserved domains in the predicted proteins were identified using Motif search or HHpred48 against the Pfam-A_v37 database. The signal peptide and transmembrane regions of viral proteins were predicted using SignalP-6.0 and CCTOP server, respectively49,50. Finally, the viral genomes were visualized using the R package gggenes (version 0.5.1).

Viral abundance estimation

Viral contig read counts were estimated using CoverM v0.7.0 (https://github.com/wwood/CoverM) in contig mode, with alignments performed using BWA (version 0.7.18). The viral read counts were normalized by the library size and genome length according to the following formula:

$$\frac{\text{mapped reads}}{\left(\frac{\text{total reads}}{10^{6}}\right)\times\left(\frac{\text{length}}{10^{3}}\right)}=\frac{\text{mapped reads}}{\text{total reads}}\times\frac{10^{9}}{\text{length}}\tag{1}$$
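As a worked illustration of Eq. (1), the normalization can be written as a short Python function. This is a sketch rather than the code used in the study; the name `normalized_abundance` and its argument names are hypothetical.

```python
def normalized_abundance(mapped_reads: int, total_reads: int, length_nt: int) -> float:
    """Reads mapped to a contig, per million library reads, per kilobase of contig.

    Implements the left-hand side of Eq. (1); the right-hand side,
    mapped/total * 1e9/length, is algebraically identical.
    """
    if total_reads <= 0 or length_nt <= 0:
        raise ValueError("total_reads and length_nt must be positive")
    per_million = total_reads / 1e6   # library size in millions of reads
    per_kilobase = length_nt / 1e3    # contig length in kb
    return mapped_reads / (per_million * per_kilobase)
```

For instance, 500 reads mapped to a 5000-nt contig in a library of 10 million reads gives 500 / (10 × 5) = 10.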

Because of the inherent heterogeneity of SRA datasets, which complicates direct comparisons of viral abundance across different datasets51, viral abundance estimators were only used as supporting evidence to distinguish potential contaminant viruses. As an alternative approach, the k-mer coverage calculated by MEGAHIT was used as a preliminary viral abundance estimator during the initial manual viral filtering step. HISAT2 (v2.2.1) was used to map reads to the genomes of the four intron-bearing yue-like viruses52.
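The preliminary k-mer coverage estimate can be read directly from assembly output. The sketch below assumes contig headers in MEGAHIT's usual form, `>k141_12 flag=1 multi=35.5000 len=6120`, where `multi` is taken to be the k-mer coverage mentioned above. The helper names are hypothetical, and the header layout should be checked against the MEGAHIT version actually used.

```python
import re
from typing import Iterable, List, Tuple

# Assumed MEGAHIT header layout: name, flag, multi (k-mer coverage), len.
_HEADER = re.compile(
    r"^>?(?P<name>\S+)\s+flag=(?P<flag>\d+)\s+multi=(?P<multi>[\d.]+)\s+len=(?P<len>\d+)"
)


def parse_megahit_header(header: str) -> Tuple[str, float, int]:
    """Return (contig name, k-mer coverage, contig length) from one header line."""
    match = _HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a recognised MEGAHIT header: {header!r}")
    return match["name"], float(match["multi"]), int(match["len"])


def rank_by_coverage(headers: Iterable[str]) -> List[Tuple[str, float, int]]:
    """Sort contigs from highest to lowest k-mer coverage."""
    return sorted((parse_megahit_header(h) for h in headers),
                  key=lambda rec: rec[1], reverse=True)
```

Such a ranking is only a rough guide for manual filtering; it does not replace the read-mapping-based estimate of Eq. (1).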

Taxonomic analysis of the downloaded SRA data

Since the downloaded SRA runs originated from different bioprojects with diverse research purposes, the identified viral contigs could potentially have derived from viruses of contaminating organisms such as plants, fungi, or protozoa. To identify the putative hosts of the assembled RNA viruses detected in an SRA run, the taxonomic composition of the organisms within the SRA run was analyzed through taxonomic classification of the raw reads using Centrifuger (v1.0.4-r153)53. The analysis proceeded as follows. First, the preliminary taxonomic composition at the genus level for specific SRA runs exhibiting high candidate virus abundance was inferred based on the taxonomy of matching sequences (with an 85% amino acid identity threshold) of assembled contigs. This step followed the methodology described in the “Identification of viral sequences” section, using translated contigs as queries in an MMseqs2 search against the clustered NR database. Subsequently, the representative genomes of species belonging to these genera were retrieved from GenBank and used to construct a custom index using the centrifuger-build tool within Centrifuger53. The raw reads were then classified using Centrifuger, and the classification results were summarized into reports using the centrifuger-kreport tool within Centrifuger53. These reports were converted into a Krona-compatible format using the kreport2krona.py script54, and further visualized using KronaTools (version 2.8.1)55. A virus was classified as a contaminant only if it exhibited significant similarity to a known virus, and the host sequences of that known virus were also present in the SRA runs.
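The two-part contaminant criterion stated at the end of this section can be expressed as a small decision function. The sketch below is illustrative only. The source does not give a numerical cut-off for "significant similarity", so `min_identity` is a caller-supplied, hypothetical parameter, and the `BestHit` record and `run_composition` mapping (taxon name to percentage of reads) are assumed data structures, not outputs of any specific tool.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class BestHit:
    """Closest known virus for a candidate contig (hypothetical record)."""
    known_virus: str
    aa_identity: float                # percent amino acid identity
    known_host_taxon: Optional[str]   # host taxon of the known virus, if documented


def is_contaminant(hit: Optional[BestHit],
                   run_composition: Mapping[str, float],
                   *,
                   min_identity: float,
                   min_host_percent: float = 0.0) -> bool:
    """Apply both conditions: close match to a known virus AND its host present in the run."""
    if hit is None or hit.known_host_taxon is None:
        return False                  # no known relative or no documented host
    if hit.aa_identity < min_identity:
        return False                  # too divergent to count as a known virus
    host_percent = run_composition.get(hit.known_host_taxon, 0.0)
    return host_percent > min_host_percent
```

Requiring both conditions keeps divergent viruses from being discarded merely because some non-nematode reads are present in the library.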

Phylogenetic analysis

The RdRP domains of the newly identified viruses were used for phylogenetic analysis. To expedite the multiple sequence alignment of a large number of sequences, the sequences were initially aligned using FAMSA (version 2.2.3-5efa514)56. Subsequently, the RdRP domains were extracted and further aligned using Muscle5 (version 5.1) with the -super5 setting57. The alignments were manually inspected and refined in Jalview58, followed by trimming the ambiguously aligned regions using ClipKIT (version 2.3.0)59. The trimmed alignments were subjected to phylogenetic analysis using IQ-TREE (version 2.3.0) with 10,000 ultrafast bootstrap replicates with default settings60. The resulting phylogenetic trees were visualized using ggtree (version 3.10.0)61 and further refined for publication using Adobe Illustrator.
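The order of the alignment and tree-building steps can be sketched as a list of command lines assembled in Python. This is an assumption-laden outline, not the authors' pipeline. The file names are placeholders, the command-line options are those we believe the respective tools accept and should be checked against the installed versions, and the RdRP extraction and manual Jalview refinement are not automated here.

```python
from typing import List


def phylogeny_commands(sequences: str, prefix: str, bootstrap: int = 10000) -> List[List[str]]:
    """Return the command lines for the alignment-to-tree steps, in order.

    Nothing is executed; each inner list can be passed to subprocess.run.
    """
    draft = f"{prefix}.famsa.aln"      # fast initial alignment of full sequences
    rdrp = f"{prefix}.rdrp.fasta"      # RdRP domains extracted from the draft (manual step)
    aligned = f"{prefix}.muscle.aln"   # refined alignment of the RdRP domains
    trimmed = f"{prefix}.clipkit.aln"  # alignment after trimming ambiguous regions
    return [
        ["famsa", sequences, draft],
        ["muscle", "-super5", rdrp, "-output", aligned],
        ["clipkit", aligned, "-o", trimmed],
        # "-m MFP" requests model selection, "-B" sets the ultrafast bootstrap replicates.
        ["iqtree2", "-s", trimmed, "-m", "MFP", "-B", str(bootstrap), "--prefix", prefix],
    ]
```

Building the commands as data keeps the step order explicit and makes it straightforward to log exactly what was run for each virus group.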