Abstract
Common bean (Phaseolus vulgaris L.) is a crop rich in protein, minerals, and starch. Viruses are a significant limiting factor in increasing the production of legumes, particularly common beans. Accurate and timely detection of plant viruses is essential for minimizing disease damage and ensuring food security. To investigate common bean field viruses in Iran, 300 samples of common bean plants showing virus-like symptoms were collected over three years (2020, 2021, and 2022). This study is the first to use total RNA-seq for a virome analysis of common beans in Iran. The total RNA-seq results indicated that the common bean samples were infected with sesame curly top virus (SeCTV), beet curly top Iran virus (BCTIV), tomato leaf curl Palampur virus (ToLCPalV), cucumber mosaic virus (CMV), bean common mosaic virus (BCMV), phaseolus vulgaris endornavirus 1 (PvEV1) and phaseolus vulgaris endornavirus 2 (PvEV2). This is the first report of PvEV1 and PvEV2 in Iran. Moreover, these findings revealed the presence of SeCTV and ToLCPalV for the first time in the western part of the country. Furthermore, comparison of the nucleotide sequences and the phylogenetic tree obtained from the complete genomes of the two BCTIV isolates in this study with those of other isolates indicated the presence of a new strain of BCTIV in the common bean fields. During the three-year study, the detection rates indicated that BCTIV and BCTV were more prevalent in Lorestan province than in Markazi province. Overall, common bean fields in the central and western regions of the country were infected with seven viruses, with DNA viruses being more prevalent in Lorestan province than in Markazi province. This information should be taken into account when developing management strategies and breeding plans.
Introduction
Common bean (Phaseolus vulgaris L.) is an important annual legume in the Fabaceae family. It was domesticated approximately 700 years ago in Mesoamerica and the Andes1. Among the species of Phaseolus spp., the common bean is the most important crop due to its high cultivation area, accounting for 90% of the cultivated area of bean species2. It is a primary source of nutrition for millions of people because of its high protein content. The highest production of dry beans in the world is in Asia (43%), followed by America and Africa with 29% and 26%, respectively3. In Iran, the area under cultivation of legumes is more than 613,000 hectares, and the total production is about 500,000 tons annually4.
Viruses are a serious threat to global food security and sustainability5. Common bean is a natural host to more than 70 viruses6, with potyviruses being particularly noteworthy due to their prevalence and the damage they cause7. Additionally, begomoviruses have been recognized as a major threat to common bean cultivation in South/Central America and the Caribbean region8.
Recent advances in high-throughput sequencing (HTS) technology have enabled researchers to characterize the common bean virome in several countries. In India, a viral metatranscriptomic approach detected bean common mosaic virus (BCMV), bean common mosaic necrosis virus (BCMNV), and clover yellow vein virus9. In Italy, MinION sequencing detected BCMV, cucumber mosaic virus (CMV), peanut stunt virus, and bean yellow mosaic virus in Lamon Bean10. In Mesoamerica, small RNA sequencing detected nine viral species from five genera in common bean11. In Tanzania, HTS identified 15 virus species belonging to 11 genera in common bean fields12. In addition, two cryptic double-stranded RNA viruses classified in the genus Alphaendornavirus were found in seeds obtained from farmers in three agricultural research zones in Tanzania13.
Common bean is cultivated in various regions of Iran, resulting in genetically distinct known and unknown viruses. Therefore, molecular and biological documentation is essential to assess the genetic diversity of plant viruses. This will help to improve diagnostic and predictive methods. To date, no comprehensive study using HTS has been conducted on common bean viruses, and notably, there has been no multiyear survey. For this reason, the present study employed the metatranscriptomic method to profile the common bean viruses in Iran. Additionally, the detection rate of some important viruses was monitored during the years 2020 to 2022.
Results
Sampling, RNA extraction and library synthesis
Samples were collected from plants exhibiting virus-like symptoms, including leaf deformation, mottling, curling, mosaic, and necrosis (Fig. 1). In total, 300 samples were collected from symptomatic plants from 2020 to 2022 and used for the multiyear survey of detection rates. One pooled sample comprising 20 samples with varying symptoms was used for HTS, yielding 12 GB of sequence data with a total of 89,548,789 clean reads from Illumina sequencing.
Metagenomic analysis for the detection of viruses
The results of BLAST analysis using contigs as queries against viral RefSeq (NCBI database) indicated that contigs related to some RNA and DNA viruses are present in the sample (Fig. 2).
The analysis of the raw data showed that the transcriptome data contained three complete virus genomes and one 200 bp contig belonging to the genera Becurtovirus, Turncurtovirus, and Begomovirus in the family Geminiviridae. An assembled 2900 bp contig showed 99.6% identity with sesame curly top virus (SeCTV; accession no. MH595448). Two other contigs assembled from this group had lengths of 2842 and 2848 bp. BLAST analysis showed that the 2848 bp contig had 92.9% identity with beet curly top Iran virus (BCTIV; accession no. JX945569) and the 2842 bp contig had 97.7% identity with BCTIV (JQ707948). Notably, the identity between the two BCTIV-related contigs was 89.9%, indicating that they represent different isolates. The 2848 bp contig might represent a new strain of BCTIV. A 200 bp contig related to the DNA-A segment of tomato leaf curl Palampur virus (ToLCPalV) was also assembled. Based on the number of virus-associated contigs (10 contigs) and reads (25%), BCTIV (2842 bp contig) had the largest population (Fig. 2).
In addition, contigs associated with RNA viruses belonging to the Bromoviridae, Potyviridae, and Endornaviridae families were recovered. For Bromoviridae, a nearly complete genome of CMV (genus Cucumovirus) was recovered as three contigs of 2379 bp, 1504 bp, and 2145 bp, which had 99.9%, 99.5%, and 98.5% identity with LC066468 (RNA1), LC066466 (RNA2), and LC066503 (RNA3), respectively. A 1491 bp contig showed 99.2% identity with BCMV (MF498887) from the genus Potyvirus in the Potyviridae family. Two long contigs of 13,898 bp and 14,839 bp matched phaseolus vulgaris endornavirus 1 (PvEV1; MK948542, full genome) and phaseolus vulgaris endornavirus 2 (PvEV2; OM112199, full genome) from the Endornaviridae family with 98.9% and 99.5% identity, respectively. The transcript reads were aligned to the reference genomes of the identified viruses to determine the level of virus genome expression. The results showed that CMV had a higher expression level (20 contigs and 30% of mapped reads) than PvEV1 and PvEV2 (Fig. 2).
Genome annotation and phylogenetic analysis of identified viral genomes
The identified SeCTV isolate (PP573856) contained a genomic DNA of 2900 nt with six open reading frames (ORFs) (C1, C2, C3, and C4 on the complementary strand; V1 and V2 on the viral strand) (Fig. 3a, Table 1). Phylogenetic analysis indicated that the SeCTV isolate recovered in the current study formed a distinct branch between MT041693 (isolated from Solanum melongena) and MH595444 (isolated from Neoaliturus haematoceps) (Fig. 4). Similarly, the potential evolutionary relationship between the two BCTIV contigs (BCTIV1, contig 2842 bp, and BCTIV2, contig 2848 bp, with accession numbers PP573857 and PP789055, respectively) and other known isolates was analyzed based on the complete viral genome. BCTIV1 and BCTIV2 each contained five ORFs (C1 and C2 on the complementary strand; V1, V2, and V3 on the viral strand) (Fig. 3b,c, Table 1). Phylogenetic analysis showed that 44 BCTIV isolates clustered into two subgroups. BCTIV1 and 21 other isolates belonged to subgroup I, whereas BCTIV2 formed a distinct, well-supported branch within the subgroup I isolates (Fig. 4), suggesting that BCTIV2 likely represents a new strain of BCTIV.
Annotated viral genomes. Schematic representation of the genomic organization of (a) SeCTV, sesame curly top virus (2900 bp in length, with two ORFs on the viral strand and four ORFs on the complementary strand), (b) BCTIV1, beet curly top Iran virus 1 (2842 bp in length, with three ORFs on the viral strand and two ORFs on the complementary strand), (c) BCTIV2, beet curly top Iran virus 2 (2848 bp in length, with three ORFs on the viral strand and two ORFs on the complementary strand).
Bayesian consensus tree of the full genome of detected viruses under the GTR + G + I model. The burn-in phase was set at 25% of the converged runs. (a) Phylogenetic tree of sesame curly top virus, (b) beet curly top Iran virus, (c) cucumber mosaic virus (RNA3-CP gene). Virus genomes derived from this study are indicated by green color. Bootstrap support values for Bayesian analysis are presented in Roman font, while the values for Maximum Likelihood analysis are displayed in italics.
In this study, the complete genome of RNA3 (2145 bp, PP573858), which includes sequences coding for movement protein (from nt 88 to 930) and coat protein (CP) (from nt 1225 to 1881), was assembled for CMV. The CP gene was used for phylogenetic analysis, and the results revealed that all CP sequences were divided into two main groups (I and II). The isolate was clustered with IB subgroup strains, as shown in Fig. 4. Particularly, three Iranian CMV isolates, including LC066464, KU976468, and KT279567, were found to have a close relationship with our isolate.
For the first time, the present study reports PvEV1 (PP573859) and PvEV2 (PP573860) of the Alphaendornavirus genus in the Endornaviridae family in Iran. Using the ORF finder tool and analyzing the complete genome sequences, both PvEV1 and PvEV2 sequences were identified to encode a single putative polyprotein of 4619 and 4920 amino acids (aa), respectively (Fig. 5). Conserved domain analysis at the aa level revealed that PvEV1 contains a helicase domain (1491 to 1734aa), an AAA domain (1670 to 1740aa), a domain family with a P-loop motif characteristic of the AAA superfamily, a capsular polysaccharide synthase (2774 to 2933aa), a glycosyltransferase domain (3102 to 3443aa) and an RdRp domain (4246 to 4482aa) (Fig. 5). In the case of PvEV2, the genome-encoded protein contained a methyltransferase domain (354 to 458aa), a helicase domain (1411 to 1651aa), a glycosyltransferase domain (3108 to 3201aa), and an RdRp domain (4519–4820aa) (Fig. 5). The phylogenetic analyses were carried out using complete genome sequences of PvEV1, PvEV2 and related endornaviruses. As can be seen in the tree (Fig. 5), PvEV 1 showed a close relationship with the Brazilian isolate (NC-039217), and also formed a clade with the Spanish (MF375892) and Slovakian (OQ750684 and OQ750683) isolates. Also, PvEV2 formed a subclade with isolates from Mexico (MN832720), Canada (OM112199), and Kenya (MF281671).
Bayesian consensus tree of the full genome of detected viruses under the GTR + G + I model and conserved domains in the polyprotein encoded by PvEV-1 and PvEV-2. The burn-in phase was set at 25% of the converged runs. Phylogenetic tree of (a) Phaseolus vulgaris endornavirus 1 and (b) Phaseolus vulgaris endornavirus 2. Virus genomes derived from this study are indicated by green color. Numbers indicate the residues defining the conserved domains. Hel-1 Helicase, AAA domain, CPS capsular polysaccharide synthase, GTF glycosyltransferase, MET methyltransferase, RdRp RNA-dependent RNA polymerase.
Recombination analysis and SNP calling
Two of the 19 characterized SeCTV isolates were recombinant. The SeCTV isolate from the current study was identified as the minor parent of MH595443.1, which was isolated from Sesamum indicum plants (Table 2). Among the 45 BCTIV isolates, the two common bean isolates from the present study were recombinant, and four other recombinant isolates were also detected (data not shown). The major recombinant isolate associated with this event (BCTIV-contig 2848 bp) was reported from Iran, with ON014774 (BCTIV-Capsicum annuum) and JQ707946 (BCTIV-Beta vulgaris) being the major and minor parents, respectively. Another major recombinant isolate was BCTIV-contig 2842 bp, with JQ707940 (Beta vulgaris) from Iran as the major parent; the minor parent of this recombinant isolate was unknown. The predicted beginning and ending breakpoints for BCTIV-contig 2848 bp and BCTIV-contig 2842 bp were located at 1877–2349 nt and 1975–2013 nt of the genomic DNA, respectively, covering partial sequences of the C1 ORF (Table 2, Fig. 6).
The most frequent recombination events in the beet curly top Iran virus genome (indicated by pink outline). (a) Representing the recombination events detected in BCTIV1 and BCTIV2 isolates. The x-axis indicates the position in alignment, while the y-axis denotes the percentage value for bootstrap support. (b) Represent the probabilities of recombination breakpoints best supported by p-values (displayed by the color key). Dark red peaks marked with white arrows indicate the statistically optimal position of recombination breakpoint pairs.
To characterize the viral quasispecies in the naturally infected plants, SNPs were analyzed for the detected viruses. In SeCTV, a total of 49 SNVs were identified, which were evenly distributed along the genome (Fig. 7). For BCTIV (contig 2848 bp) and BCTIV (contig 2842 bp), 329 and 261 SNVs were identified, respectively, scattered over the genome in coding and non-coding regions. For CMV, only the RNA3 segment was obtained in full; the 268 polymorphisms identified in it were sufficient to indicate a highly diverse, natural CMV population in common bean (Fig. 7). The SNPs in these viruses were distributed genome-wide without any clear hot spots or cold spots. No SNPs were observed in PvEV1 and PvEV2.
Validation of RNA-seq data and Sanger sequencing
The results of polymerase chain reaction (PCR) and reverse transcription polymerase chain reaction (RT-PCR) were consistent with the RNA-seq results. PCR yielded fragments of 1300 bp (nucleotides 451 to 1710), 700 bp (partial sequence of C1/C2), and 1000 bp (partial sequence of DNA-B) for SeCTV, BCTIV, and ToLCPalV, respectively. RT-PCR showed specific bands for CMV (540 bp, partial coat protein sequence of the RNA3 segment), PvEV1 (374 bp, helicase-encoding region), PvEV2 (766 bp, helicase-encoding region), and BCMV (200 bp, coat protein region) (Fig. 8). For further confirmation, the PCR product of each virus was sequenced by Sanger sequencing. The RT-PCR and PCR results indicated that all the viruses identified by HTS were indeed present in the pooled common bean samples submitted for HTS (Suppl. Table S1).
Confirmation, using virus-specific primers, of the seven major viruses infecting common bean that were detected by HTS. Fragments of the viral genomes amplified by PCR and RT-PCR. (a) Lanes 1 to 5: PvEV2 (766 bp), CMV (540 bp), BCMV (200 bp), PvEV1 (374 bp), and negative control. (b) Lanes 1 to 4: ToLCPalV (1000 bp), BCTIV (700 bp), SeCTV (1300 bp), and negative control. M is the molecular marker.
Annual virus prevalence survey
The results showed that the virus detection rates and virus population structures in the common bean fields of this area varied from year to year. ToLCPalV was not detected in Markazi province, while SeCTV was not found in Lorestan province. However, BCTV, BCTIV, CMV, and BCMV were detected in both provinces (Fig. 9).
Virus population structures in Lorestan and Markazi common bean field. (a) The viruses identified in the Lorestan samples are shown in the green region, the viruses identified in the Markazi samples are shown in the red region, and the viruses identified in both provinces are shown in the overlapped area. (b) Percentage of the detection rates of viruses from 2020 to 2022, each year represented by a defined color legend bar.
In Lorestan, the detection rates of BCTV and BCTIV were higher than in Markazi province. In contrast, the RNA viruses BCMV and CMV had higher detection rates in Markazi than in Lorestan (Fig. 9); in general, the detection rate of RNA viruses was higher in Markazi than in Lorestan. In Lorestan, CMV and BCTIV had the highest detection rates among all viruses, whereas in Markazi the highest rates were observed for CMV and BCMV. In Markazi, the detection rate of CMV in samples collected in 2020, 2021, and 2022 was 32.7, 39.8, and 40.2%, respectively, and that of SeCTV was 10, 9, and 11.8%, respectively. In Lorestan, CMV and BCTIV showed an increasing trend, whereas ToLCPalV showed a decreasing trend (Fig. 9). BCMV followed the same trend in both provinces over the three years, but its detection rate was higher in Markazi than in Lorestan (Fig. 9).
Some of the infected common bean samples were co-infected with two or three different viruses (i.e., BCTV + BCTIV, BCTIV + CMV, BCTIV + SeCTV, BCMV + CMV, SeCTV + CMV, ToLCPalV + CMV, BCTV + BCTIV + CMV). Among these mixed infections, BCTIV + CMV was the most common combination in most of the surveyed locations. CMV was also present in most of the mixed infections.
Discussion
In this study, seven plant virus species from six genera (Begomovirus, Becurtovirus, Turncurtovirus, Potyvirus, Cucumovirus, and Alphaendornavirus) were detected in the main common bean production area in Iran. This study is the first to report PvEV1 and PvEV2 infections in common bean fields in Iran and the presence of SeCTV in the west of Iran.
SeCTV is a species of the genus Turncurtovirus in the family Geminiviridae. It was first reported in sesame plants in Iran14. Although SeCTV has merely been reported in the eastern regions of Iran14,15, the current study demonstrated that SeCTV has a relatively high distribution in the western and central areas of the country. According to the results of the annual survey, this virus is expanding and could be considered a risk to common bean production. Moreover, in the previous study, we detected ToLCPalV, another geminivirus, for which a 200 bp contig was recovered in the HTS data16.
Two complete genomes of BCTIV were identified and designated as BCTIV1 and BCTIV2. BCTIV is known to damage economically important crops, including sugar beet, tomato, pepper, and common bean17,18,19. Multiple sequence alignment, nucleotide identity calculations, and phylogenetic analyses demonstrated that BCTIV1 and BCTIV2 are distinct isolates. DNA sequence analysis indicated that BCTIV 2 shared 92.9% identity with the previously reported isolate from Iran. The phylogenetic analysis of the complete genome sequences of all BCTIV isolates revealed that BCTIV2 forms a clade with an independent branch of isolates from Khorasan Razavi province, while BCTIV1 was grouped with isolates reported from Kerman province. This suggests that a new strain of BCTIV may be emerging in western Iran. According to the current taxonomic demarcation criteria (≤ 89% identity for species and ≥ 91% identity for isolates)20, the BCTIV1 and BCTIV2 genomes represent two distinct isolates of BCTIV that infect common beans in the Lorestan and Markazi provinces.
Genetic variation is a significant factor in the emergence of new virus strains, and one of the primary sources of this variation is recombination21,22. In BCTIV1 and BCTIV2, the recombination event was traced with high probability. Studies have shown that recombination plays a crucial role in the evolution of DNA viruses23,24. For instance, the recombinant tomato yellow leaf curl virus (TYLCV) observed in southern Morocco causes more severe symptoms than its parental virus strains23. Consequently, the recombinant viruses identified in this study may be more virulent than BCTIV strains or isolates from previous reports.
The identification of individual virus variants within a mixed population through variant calling is crucial for the interpretation of viral evolution and genetic diversity25. Consistent with these findings, SNP analysis revealed the presence of quasispecies in BCTIV and SeCTV. Several SNPs were detected throughout the genome of BCTIV1, BCTIV2, and SeCTV. This study, along with our previous study16, provides further evidence of the importance of geminiviruses in common bean fields in western Iran.
Among RNA viruses, CMV and BCMV are major viruses infecting common bean26,27. CMV is the type species of the Cucumovirus genus and is one of the most common plant viruses found in agricultural ecosystems28. BCMV is a species in the genus Potyvirus in the Potyviridae family. Both CMV and BCMV are aphid-transmitted viruses that can also be seed-transmitted in common beans with an efficiency of up to 80%29,30. In the transcriptome data, three RNA segments of CMV were identified, along with a 1491 bp sequence related to BCMV. RNA3 of CMV, which contains the MP and CP genes, was recovered as a complete segment. Phylogenetic analysis of the CP gene compared with other isolates demonstrated that the CMV strain of the present study belongs to the IB subgroup and shares a close relationship with three Iranian isolates (LC066464, KU976468, and KT279567). Previous studies have suggested that CMV subgroup IB strains are spreading from Asia and may emerge more widely in other regions31,32. These findings suggest that subgroup IB is prevalent in Iran and its neighboring countries31 and is on the rise. Since CMV is usually the primary concern in mixed infections, it is important to focus on this virus in breeding programs. Although the complete genome of BCMV could not be reconstructed, RT-PCR results confirmed the presence of this virus in symptomatic samples collected from Lorestan and Markazi provinces.
In the present study, PvEV1 and PvEV2 from the Alphaendornavirus genus of the Endornaviridae family are reported for the first time in Iran. Members of the Endornaviridae family have a linear, single-stranded RNA genome of 9.8–17.6 kb in length33. Plant endornaviruses are transmitted through seeds and pollen; they are sometimes referred to as ‘persistent’ viruses and do not cause visible symptoms34,35. The full genomes of PvEV1 and PvEV2 were assembled from the HTS data. Bioinformatics and phylogenetic analyses of the alphaendornaviruses revealed that the obtained contigs clustered together with PvEV1 and PvEV2 and formed a clade consistent with the Alphaendornavirus genus. Consequently, the analysis indicated that the genomes of PvEV1 and PvEV2 from the current study are consistent with those of previously characterized endornaviruses (NC-039217 and OM112199, respectively)13,36.
RT-PCR analysis also confirmed that PvEV1 and PvEV2 are present in common bean fields located in Lorestan and Markazi provinces. Both viruses were found in single and mixed infections, consistent with previous reports of mixed infections of PvEV1 and PvEV2 in common bean by Refs.31,37. Furthermore, a comprehensive study investigated the incidence of PvEV1 and PvEV2 in 68 breeding lines/cultivars, revealing that 63 of these were co-infected with both viruses in regions of Mesoamerica and the Andes38. The study conducted by Ref.39 revealed that PvEV3 was detected in cultivated and wild common bean genotypes as single infections and as mixed infections with PvEV1 and PvEV2. Although it is possible that PvEV3 exists in Iranian common bean cultivars, it was not detected in this study. It is important to investigate the potential interactions of alphaendornaviruses in mixed infections with other viruses, and their secondary effects on host plants, vectors, and viruses in general.
Results from the multiyear survey of virus incidence revealed that the virus population structures in common bean fields in Lorestan and Markazi provinces are complex. The survey conducted from 2020 to 2022 showed that six viruses (CMV, BCTIV, BCTV, SeCTV, ToLCPalV, and BCMV) occurred in both mixed and single infections. Different combinations of mixed infections involving CMV, BCTV, and BCTIV have been found to increase the severity of symptoms and damage in common beans40. Although SeCTV was not detected in the fields of Lorestan, the presence of its vector in this region suggests that the virus may still be present. Since Lorestan and Markazi are bordering provinces, there is a potential for the virus to spread from one province to the other. The incidence of CMV and BCTIV gradually increased in both provinces, whereas BCTV remained relatively stable from 2020 to 2022. It is worth noting that there has previously been no comprehensive annual survey of common bean viruses in Iran. These findings offer critical insights for designing virus disease management strategies for common bean cultivation in Iran.
Conclusions
This study demonstrates that HTS is a powerful tool in viromics research. To our knowledge, this research provides the first comprehensive understanding of the viral landscape affecting common bean in Iran. One of the interesting outcomes of this study is the simultaneous detection of both RNA and DNA viruses that infect common beans. The data from this study not only provide a list of common bean viruses but also allow their possible relationships and multiyear incidence to be assessed. Moreover, for the first time, PvEV1 and PvEV2 are reported from common bean fields in Iran. Comprehensive knowledge of viruses affecting common beans will help to improve management strategies, including the use of virus-free material, the development of resistant cultivars, and the reduction of chemical control for insect vectors. Given the significance of the common bean for food security in Iran and the world, and taking into account the annual occurrence of mixed infections, it is recommended to avoid monoculture practices in infected areas and to use virus-free seeds and resistant cultivars.
Methods
Field survey and sample collection
During the growing season of common bean, leaf samples exhibiting symptoms of virus-like diseases were collected from 5 to 10 common bean fields in each region of Markazi and Lorestan provinces. Based on the leaf symptoms and the sampling region, in 2023 a total of 20 samples were mixed and subjected to deep sequencing analysis (Fig. 1). In addition, because the samples in the current research were taken from the main areas of bean cultivation, sampling was also performed from 2020 to 2022 to determine the virus detection rate in a multiyear survey of virus incidence. The detection rate was determined by categorizing samples from each year based on symptoms, resulting in 35 samples for each year. These samples were then tested using PCR and RT-PCR with specific primers. The ratio of positive cases to the total number of samples checked for each virus was reported as the detection rate for that virus in each year. The virus incidence was calculated as the percentage of plant samples infected with each virus. After the viruses had been identified through HTS analysis, their detection rates were checked with specific primers for each virus in both provinces.
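The detection rate described above is a simple ratio of positive samples to tested samples per virus and year. A minimal sketch of that arithmetic follows; the function name and the toy records are hypothetical and are not the study's data.

```python
from collections import defaultdict


def detection_rates(results):
    """Return {(year, virus): percent of tested samples that were positive}.

    `results` is an iterable of (year, virus, is_positive) tuples,
    one tuple per sample tested by PCR or RT-PCR for that virus.
    """
    tested = defaultdict(int)
    positive = defaultdict(int)
    for year, virus, is_positive in results:
        key = (year, virus)
        tested[key] += 1
        if is_positive:
            positive[key] += 1
    return {key: 100.0 * positive[key] / tested[key] for key in tested}


# Hypothetical toy records (illustration only, not survey results).
toy_records = [
    (2020, "CMV", True),
    (2020, "CMV", False),
    (2020, "CMV", True),
    (2020, "CMV", False),
    (2021, "CMV", True),
]
rates = detection_rates(toy_records)
print(rates)
```

Keeping the year and the virus together in the key makes it straightforward to compare the same virus across years or across provinces if a province field is added.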
RNA isolation, library preparation and sequencing
Leaf samples were ground into a fine tissue powder using a mortar and pestle with liquid nitrogen. The resulting powder was immediately transferred into a 1.5 ml Eppendorf tube and stored in liquid nitrogen. Total RNA was extracted from the leaf samples according to the method described by Ref.41. The TruSeq Stranded Total RNA with Ribo-Zero™ Gold Kit (Illumina, Italy) was used to remove ribosomal RNA and construct the RNA-Seq library according to the manufacturer’s instructions. The quality and quantity of the extracted RNA were assessed using an Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA) and gel electrophoresis.
Bioinformatics analysis
The raw sequence reads were trimmed and adaptors were removed. High-quality reads were collected using Fastp (an ultra-fast FASTQ preprocessor)42. The reads related to the host genome were removed by mapping the reads to the reference genome of Phaseolus vulgaris43, and then collecting the unmapped reads. Contigs were primarily assembled from the unmapped reads using CLC Genomics Workbench v. 20 (Qiagen, USA) and Geneious Prime 2019.1.3 software. The assembled contigs were subjected to a BLASTx search against the viral protein database at the National Center for Biotechnology Information (NCBI) with E-value 1e-10 as a cutoff. Contigs with high sequence identity to similar virus species were compared to the NCBI non-redundant virus nucleotide database using BLASTn to identify virus isolates. Finally, virus-associated contigs were retained for further analyses.
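The E-value filtering step can be illustrated with a short sketch. It assumes the BLAST results were exported in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score); the study does not state which output format was used, and the contig and subject names below are hypothetical.

```python
def best_viral_hits(tabular_lines, max_evalue=1e-10):
    """Keep, for each contig, the highest-scoring hit passing the E-value cutoff."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        if qseqid not in best or bitscore > best[qseqid]["bitscore"]:
            best[qseqid] = {
                "sseqid": sseqid,
                "pident": pident,
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best


# Hypothetical toy rows (illustration only).
rows = [
    ["contig_1", "virus_A", "99.6", "950", "4", "0", "1", "2850", "1", "950", "0.0", "1800"],
    ["contig_1", "virus_B", "80.0", "300", "60", "0", "1", "900", "1", "300", "1e-40", "400"],
    ["contig_2", "virus_C", "35.0", "60", "39", "0", "1", "180", "1", "60", "1e-3", "45"],
]
hits = best_viral_hits("\t".join(r) for r in rows)
print(hits)
```

Here contig_2 is dropped because its only hit has an E-value above 1e-10, and contig_1 is assigned to the subject with the higher bit score.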
Genome annotation and classification of viruses
To annotate and generate complete or nearly complete genomes of detected viruses, each virus-associated contig was aligned to the reference viral genomes by the ClustalW algorithm implemented in the Geneious Prime software. The ORF finder program (https://www.ncbi.nlm.nih.gov/orffinder/) was used to predict ORFs of each virus, then checked based on the known genome organization and open reading frame composition reported on the International Committee on Taxonomy of Viruses (ICTV) database. Annotated viruses were taxonomically classified based on the percent pairwise identity, following the standard genomic criteria reported by the ICTV44.
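Percent pairwise identity, on which the classification above relies, can be computed from two rows of an alignment. The sketch below is one possible definition, stated as an assumption: columns where both sequences have a gap are ignored, and a gap opposite a base counts as a mismatch. Alignment software may use a different convention, so values can differ slightly from those reported by Geneious Prime.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy sequences (illustration only).
identity_no_gap = percent_identity("ACGT", "ACGA")
identity_with_gap = percent_identity("ACGT-A", "ACGTTA")
print(identity_no_gap, identity_with_gap)
```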
Recombination analysis and SNP calling
In this section, only viruses that contained complete genomes were examined. Genomes of different isolates retrieved from GenBank and viral RefSeq database were aligned and checked for possible recombinants. The RDP4 program was used to explore the occurrence of recombination events in full-length viral genome sequences. Various methods implemented in the Recombination Detection Program (RDP) V.445 including RDP, SisterScan, Bootscan, Chimaera, GeneConv, MaxChi, and 3Seq algorithms, with a Bonferroni-corrected P-value of < 0.05, were used. Signals detected by seven algorithms were considered reliable recombinant events. For single nucleotide polymorphisms (SNPs) analysis, we used the SAMtools kit to identify SNPs of identified viruses46.
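The acceptance rule for recombination events (a Bonferroni-corrected P-value below 0.05 and support from all seven methods) can be sketched as follows. This is not RDP4 code. It assumes raw per-method P-values and a known number of comparisons, and the example values are hypothetical.

```python
METHODS = ("RDP", "SisterScan", "Bootscan", "Chimaera", "GeneConv", "MaxChi", "3Seq")


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def supporting_methods(raw_pvalues, n_tests, alpha=0.05):
    """Methods whose corrected P-value for this event is below alpha."""
    return [
        m for m in METHODS
        if m in raw_pvalues and bonferroni(raw_pvalues[m], n_tests) < alpha
    ]


def is_reliable_event(raw_pvalues, n_tests, min_methods=7, alpha=0.05):
    """True if at least `min_methods` methods support the event."""
    return len(supporting_methods(raw_pvalues, n_tests, alpha)) >= min_methods


# Hypothetical events (illustration only).
strong_event = {m: 1e-6 for m in METHODS}
weak_event = dict(strong_event, MaxChi=1e-3)
strong_ok = is_reliable_event(strong_event, n_tests=100)
weak_ok = is_reliable_event(weak_event, n_tests=100)
print(strong_ok, weak_ok)
```

Lowering `min_methods` would give a more permissive rule; the value of seven mirrors the criterion stated in the text.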
HTS data validation and Sanger sequencing
The presence of the DNA and RNA viruses identified by HTS was confirmed by PCR and RT-PCR after extraction of RNA47 and DNA48 from the samples used for HTS library preparation. Complementary DNA (cDNA) was synthesized using the First Strand cDNA Synthesis Kit (SinaClon cDNA Synthesis Kit) following the manufacturer’s instructions. For RT-PCR reactions the following steps were used: 50 °C for 30 min, 95 °C for 15 min, followed by 30 cycles of 95 °C for 20 s, 50–60 °C for 40 s (the annealing temperature varies depending on the Tm values of the primers; Table 3), and 72 °C for 1 min, with a final extension at 72 °C for 5 min. For PCR reactions the following conditions were used: an initial denaturation step of 95 °C for 2 min, followed by 35 amplification cycles of 95 °C for 30 s, 48–59 °C for 30 s, and 72 °C for 1 min, with a final extension at 72 °C for 10 min. The sequences of the specific primers for each virus, with their annealing temperatures, are listed in Table 3. Positive RT-PCR and PCR products of each virus were sequenced for reconfirmation using Sanger sequencing (Topaz gene kavosh Company-Iran).
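As a quick check of the cycling conditions given above, the total programmed hold time of each protocol can be added up. The sketch below uses only the step durations stated in the text and ignores temperature ramping between steps, so real run times on a thermocycler will be longer; the helper name is hypothetical.

```python
def program_minutes(initial_steps_s, cycle_steps_s, n_cycles, final_steps_s):
    """Total hold time in minutes for a thermocycler program (ramp time excluded)."""
    total_s = sum(initial_steps_s) + n_cycles * sum(cycle_steps_s) + sum(final_steps_s)
    return total_s / 60


# RT-PCR: 50 °C 30 min, 95 °C 15 min, 30 x (20 s, 40 s, 60 s), final 5 min.
rt_pcr_minutes = program_minutes([30 * 60, 15 * 60], [20, 40, 60], 30, [5 * 60])

# PCR: 95 °C 2 min, 35 x (30 s, 30 s, 60 s), final 10 min.
pcr_minutes = program_minutes([2 * 60], [30, 30, 60], 35, [10 * 60])

print(rt_pcr_minutes, pcr_minutes)  # 110.0 82.0
```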
Bayesian phylogenetic analysis
The closest relative sequences of globally reported viruses that were detected in the current study were retrieved from GenBank based on BLASTn analysis (http://www.ncbi.nlm.nih.gov). The dataset for each virus sequence separately was aligned using the Q-INS-i algorithm of the online version of MAFFT (https://mafft.cbrc.jp/alignment/server)49. The Gblocks program with all three less stringent algorithms (with default parameters) (http://phylogeny.lirmm.fr/phylo_cgi/one_task.cgi?task_type=gblocks), was used for post-editing of both alignments50. The model of base substitution was selected using MrModeltest 251. The Akaike-supported model, a general time reversible model, including among-site rate heterogeneity and estimates of invariant sites (GTR + G + I) was used. Bayesian analyses were carried out using MrBayes v3.1.252 and a random starting tree, running the chains for 1,000,000 generations for each dataset. The burn-in phase was set at 25% of the converged runs. The Markov chain Monte Carlo (MCMC) method within a Bayesian framework was used to estimate the posterior probabilities of the phylogenetic tree53 using the 50% majority rule.
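The burn-in and 50% majority-rule steps can be illustrated with a small sketch. It is not MrBayes code. It assumes each sampled tree has already been reduced to a set of clades (each clade a frozenset of taxon names) and that the samples are in sampling order; the toy taxa are hypothetical.

```python
from collections import Counter


def majority_rule_support(sampled_trees, burnin_fraction=0.25, threshold=0.5):
    """Posterior support of clades found in more than `threshold` of post-burn-in trees."""
    n_burnin = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[n_burnin:]  # discard the burn-in samples
    if not kept:
        raise ValueError("no trees left after burn-in")
    # Each tree is a set, so a clade is counted at most once per tree.
    counts = Counter(clade for tree in kept for clade in tree)
    return {
        clade: count / len(kept)
        for clade, count in counts.items()
        if count / len(kept) > threshold
    }


# Hypothetical toy sample of eight trees (illustration only).
AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
toy_trees = [
    {CD}, {CD},                      # first 25%: discarded as burn-in
    {AB, CD}, {AB, CD}, {AB, CD},
    {AB}, {AB}, {AC},
]
support = majority_rule_support(toy_trees)
print(support)
```

In this toy sample the clade AB appears in five of the six retained trees and is kept, while CD appears in exactly half of them and is therefore not part of the majority-rule consensus.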
Data availability
Raw data that support the findings of this study have been deposited in the SRA database with BioProject Accession Number PRJNA1147914. Derived data supporting the findings of this study are available from the corresponding author M.S.B. on request.
References
Chacon, S. et al. Domestication patterns in common bean (Phaseolus vulgaris L.) and the origin of the Mesoamerican and Andean cultivated races. Theor. Appl. Genet. 110, 432–444 (2005).
Morales, F. Common beans. In Loebenstein, Natural Resistance Mechanisms of Plants to Viruses 367–382 (Springer, 2006).
Uebersax, M. A. et al. Dry beans (Phaseolus vulgaris L.) as a vital component of sustainable agriculture and food security. Legum. Sci. 5, e155 (2023).
Anonymous. Annual Report of Agricultural Products of Iran (Statistics of Agricultural Products of Iran, 2023).
Rojas, M. R. & Gilbertson, R. L. Emerging Plant Viruses: A Diversity of Mechanisms and Opportunities 27–51 (Springer, 2008).
Morales, F. Virus diseases of beans in the tropics. Trop. Plant Pathol. 1, 1 (1986).
Singh, S. P. & Schwartz, H. F. Breeding common bean for resistance to diseases: A review. Crop Sci. 50, 2199–2223 (2010).
Morales, F. J. & Anderson, P. K. The emergence and dissemination of whitefly-transmitted geminiviruses in Latin America. Arch. Virol. 146, 415–441 (2001).
Rashid, S. et al. Viral metatranscriptomic approach to study the diversity of viruses associated with common bean (Phaseolus vulgaris L.) in the North-Western Himalayan region of India. Front. Microbiol. 13, 943382 (2022).
Tarquini, G. et al. The virome of ‘Lamon Bean’: Application of MinION sequencing to investigate the virus population associated with symptomatic beans in the Lamon Area, Italy. Plants 11, 779 (2022).
Chiquito-Almanza, E. et al. Diversity and distribution of viruses infecting wild and domesticated Phaseolus spp. in the Mesoamerican center of domestication. Viruses 13, 1153 (2021).
Mwaipopo, B. et al. Comprehensive surveys of bean common mosaic virus and bean common mosaic necrosis virus and molecular evidence for occurrence of other Phaseolus vulgaris viruses in Tanzania. Plant Dis. 102, 2361–2370 (2018).
Nordenstedt, N. et al. Pathogenic seedborne viruses are rare but Phaseolus vulgaris endornaviruses are common in bean varieties grown in Nicaragua and Tanzania. PLoS ONE 12, e0178242 (2017).
Hasanvand, V. et al. Identification of a new turncurtovirus in the leafhopper Circulifer haematoceps and the host plant species Sesamum indicum. Virus Genes 54, 840–845 (2018).
Hasanvand, V. & Heydarnejad, J. First report of sesame curly top virus infecting vegetables and ornamental plants in Iran. J. Plant Pathol. 102, 1381 (2020).
Astaraki, S. & Shams-Bakhsh, M. Screening for tomato leaf curl Palampur virus resistance in common bean (Phaseolus vulgaris L.) cultivars through phytochemical characterization and enzyme activity analysis. Physiol. Mol. Plant Pathol. 126, 102043 (2023).
Gharouni Kardani, S. et al. Diversity of beet curly top Iran virus isolated from different hosts in Iran. Virus Genes 46, 571–575 (2013).
Heydarnejad, J., Keyvani, N., Razavinejad, S., Massumi, H. & Varsani, A. Fulfilling Koch’s postulates for beet curly top Iran virus and proposal for consideration of new genus in the family Geminiviridae. Arch. Virol. 158, 435–443 (2013).
Tahan, V. et al. Characterization of beet curly top Iran virus infecting eggplant and pepper in north-eastern Iran. Indian Phytopathol. 73, 577–581 (2020).
Brown, J. K. et al. Revision of begomovirus taxonomy based on pairwise sequence comparisons. Arch. Virol. 160, 1593–1619 (2015).
Worrall, E. A. et al. Bean common mosaic virus and bean common mosaic necrosis virus: Relationships, biology, and prospects for control. Adv. Virus Res. 93, 1–46 (2015).
Zhou, G.-C. et al. A genomic survey of thirty soybean-infecting bean common mosaic virus (BCMV) isolates from China pointed BCMV as a potential threat to soybean production. Virus Res. 191, 125–133 (2014).
Belabess, Z. et al. A Recombinant Tomato Yellow Leaf Curl Virus has Replaced Its Parental Viruses in Southern Morocco 151 (INRA, 2016).
Garcia-Andres, S. et al. Frequent occurrence of recombinants in mixed infections of tomato yellow leaf curl disease-associated begomoviruses. Virology 365, 210–219 (2007).
Rollin, J. et al. Detection of single nucleotide polymorphisms in virus genomes assembled from high-throughput sequencing data: Large-scale performance testing of sequence analysis strategies. PeerJ 11, e15816 (2023).
Grogan, R. The relation of common mosaic to black root of bean. J. Agric. Res. 77, 315 (1948).
Tang, M. & Feng, X. J. A. Bean common mosaic disease: Etiology, resistance resource, and future prospects. Agronomy 13, 58 (2022).
Palukaitis, P. & Garcia-Arenal, F. Cucumoviruses. Adv. Virus Res. 62, 241–323 (2003).
Hord, M. et al. Field survey of cucumber mosaic virus subgroups I and II in crop plants in Costa Rica. Plant Dis. 85, 952–954 (2001).
Davis, R. F. & Hampton, R. O. Cucumber mosaic virus isolates seedborne in Phaseolus vulgaris: Serology, host–pathogen relationships, and seed transmission. Phytopathology 76, 999–1004 (1986).
Mutuku, J. M. et al. Metagenomic analysis of plant virus occurrence in common bean (Phaseolus vulgaris) in Central Kenya. Front. Microbiol. 9, 2939 (2018).
Rabie, M. et al. Phylogeny of Egyptian isolates of cucumber mosaic virus (CMV) and tomato mosaic virus (ToMV) infecting Solanum lycopersicum. Eur. J. Plant Pathol. 149, 219–225 (2017).
Valverde, R. A. et al. ICTV virus taxonomy profile: Endornaviridae. J. Gen. Virol. 100, 1204–1205 (2019).
Roossinck, M. J. Lifestyles of plant viruses. Philos. Trans. R. Soc. B Biol. Sci. 365, 1899–1905 (2010).
Fukuhara, T. & Gibbs, M. Nomenclature of Viruses: Ninth Report of the International Committee on Taxonomy of Viruses. Family Endornaviridae 855–880 (Elsevier, 2012).
Mrkvova, M. et al. Phaseolus vulgaris alphaendornavirus-1 is frequent in bean germplasm in Slovakia and shows low molecular variability. Acta Virol. 67, 11484 (2023).
Okada, R. et al. Molecular characterization of two evolutionarily distinct endornaviruses co-infecting common bean (Phaseolus vulgaris). J. Gen. Virol. 94, 220–229 (2013).
Khankhum, S. et al. Two endornaviruses show differential infection patterns between gene pools of Phaseolus vulgaris. Arch. Virol. 160, 1131–1137 (2015).
Okada, R. et al. Genomic sequence of a novel endornavirus from Phaseolus vulgaris and occurrence in mixed infections with two other endornaviruses. Virus Res. 257, 63–67 (2018).
Astaraki, S. et al. Reaction of sugar beet, pepper and bean plants to co-infection with cucumber mosaic virus and beet curly top viruses. Iran. J. Plant Pathol. 56, 221–236 (2021).
Cordeiro, M. C. R. et al. Optimization of a method of total RNA extraction from Brazilian native plants rich in polyphenols and polysaccharides. In Proceedings of the Simposio Internacional Savanas Tropicais ParlaMundi, Brasilia, Brazil 12–17 (2008).
Chen, S. et al. Fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Schmutz, J. et al. A reference genome for common bean and genome-wide analysis of dual domestications. Nat. Genet. 46, 707–713 (2014).
Lefkowitz, E. J. et al. Virus taxonomy: The database of the International Committee on Taxonomy of Viruses (ICTV). Nucleic Acids Res. 46, D708–D717 (2018).
Martin, D. P. et al. RDP4: Detection and analysis of recombination patterns in virus genomes. Virus Evol. 1, 003 (2015).
Rosario, K. et al. Pepper mild mottle virus as an indicator of fecal pollution. Appl. Environ. Microbiol. 75, 7261–7267 (2009).
Yu, D. et al. Comparison and improvement of different methods of RNA isolation from strawberry (Fragaria × ananassa). J. Agric. Sci. 4, 51 (2012).
Doyle, J. J. & Doyle, J. L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 19, 11–15 (1987).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).
Nylander, J. MrModeltest v2 (Uppsala University, 2004).
Ronquist, F. & Huelsenbeck, J. P. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574 (2003).
Larget, B. & Simon, D. L. Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Mol. Biol. Evol. 16, 750–759 (1999).
De Blas, C. et al. Broad spectrum detection of cucumber mosaic virus (CMV) using the polymerase chain reaction. J. Phytopathol. 141, 323–329 (1994).
Heydarnejad, J. et al. Incidence and natural hosts of tomato leaf curl Palampur virus in Iran. Australas. Plant Pathol. 42, 195–203 (2013).
Azizi, A. et al. Efficient silencing gene construct for resistance to multiple common bean (Phaseolus vulgaris L.) viruses. 3 Biotech 10, 1–10 (2020).
Ebadzad, S. G. et al. Infectivity of the cloned genome of Iranian isolate of beet severe curly top virus in experimental hosts. Plant Pathol. J. 44, 176–183 (2008).
Segundo, E. et al. Occurrence and incidence of viruses infecting green beans in south-eastern Spain. Eur. J. Plant Pathol. 122, 579–591 (2008).
Acknowledgements
We gratefully acknowledge financial support from Tarbiat Modares University, Tehran, Iran.
Author information
Contributions
S.A. carried out the experiments and wrote the draft of the manuscript. M.R.A. advised the project, and M.S.B. supervised the project and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Astaraki, S., Atighi, M.R. & Shams-bakhsh, M. High-throughput sequencing revealed the symptomatic common bean (Phaseolus vulgaris L.) virome in Iran. Sci Rep 15, 1621 (2025). https://doi.org/10.1038/s41598-025-85281-y
DOI: https://doi.org/10.1038/s41598-025-85281-y