Introduction

Evolutionary processes that drive speciation in ecologically and genetically divergent populations have drawn wide attention1,2 and have become one of the most fascinating topics in evolutionary biology, motivating the study of speciation in phytophagous insect pests3,4. One view treats speciation as a continuous process in which polymorphic populations evolve through ecological races into distinct species5. Populations of morphologically conserved lineages that are genetically divergent, and perhaps even reproductively isolated, are referred to as cryptic or sibling species because they were previously classified as a single taxon on the basis of similar morphology. Cryptic species are more common than previously anticipated and are now known to occur across major metazoan taxa and biogeographic ranges6. With recent advances in molecular and genetic tools, the identification and description of cryptic species have increased exponentially over the last two decades7.

Studies of multiple co-distributed cryptic lineages that incorporate phylogeographic and population genetic perspectives provide an excellent framework for understanding cryptic biodiversity8. Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae) is one such cryptic complex; the taxon was first described in 1889 as Aleyrodes tabaci9. This species complex comprises an unprecedented number of cryptic lineages globally10,11, some of which currently overlap in geographic range12. Morphologically indistinguishable lineages of B. tabaci were typically categorised as biotypes, many of which have recently been described as mitochondrial cytochrome oxidase I (mtCOI) haplotypes, and they vary in biological and ecological traits such as the efficiency of plant virus transmission, insecticide tolerance, dispersal, mating behaviour, and fecundity12,13. This destructive sap-sucking pest is distributed throughout tropical and subtropical latitudes and includes the contemporary invasive haplotypes termed B and Q14. Several well-studied B. tabaci cryptic species exhibit ecological and biological diversity; however, most members are poorly studied or completely uncharacterized.

Considerable effort has already been made to resolve the systematics of this species complex. The previous classification of B. tabaci populations into biotypes and host races, based on different biological and biochemical markers, has been superseded by molecular investigations based on the partial sequence of the mtCOI gene15,16,17. In addition, a genetic distance threshold of 3.5% was defined from the distribution of pairwise sequence divergence among the unique partial mtCOI sequences of B. tabaci16. In subsequent analyses, putative species clusters were delineated by sequence divergence equal to or higher than 3.5%18.
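The threshold rule described above can be illustrated with a minimal sketch. It assumes the two sequences are already aligned and uses the uncorrected proportion of differing sites (p-distance); the function names and the toy sequences are hypothetical, and the published threshold may have been derived with a different distance model.

```python
SPECIES_THRESHOLD = 0.035  # 3.5% divergence, the cut-off quoted in the text


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def exceeds_species_threshold(seq_a: str, seq_b: str,
                              threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when divergence is equal to or higher than the threshold."""
    return p_distance(seq_a, seq_b) >= threshold


# Toy 100-site alignment: 3 differences stay below 3.5%, 4 differences do not.
reference = "A" * 100
print(exceeds_species_threshold(reference, "A" * 97 + "CCC"))
print(exceeds_species_threshold(reference, "A" * 96 + "CCCC"))
```

In this reading, two sequences that differ at 3.5% or more of their sites would be assigned to different putative species clusters.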

The distribution of B. tabaci stretches across the continents, with the Indian subcontinent postulated as its centre of origin19. It is an agriculturally and economically important insect pest because of its polyphagous nature and its super-vectoring of begomoviral diseases20. Presently, B. tabaci is regarded as a complex comprising at least 46 genetic groups21,22, and 13 genetic groups have been recorded so far from India, viz., Asia I, Asia I-India, Asia II 1, Asia II 5, Asia II 6, Asia II 7, Asia II 8, Asia II 11, Asia II 13, Middle East Asia Minor (MEAM)-1, MEAM-K, China 3 and China 721,23,24. The eastern and southern parts of India showed greater diversity than northern and north-western India, with the prevalence of Asia I and Asia II genetic groups. The pattern of dispersion of B. tabaci is driven by a number of factors, including geographical area, host plants, and export-related anthropogenic activities25,27. The Asia II 1 genetic group was found to be predominant, with a widespread distribution across the tropical, subtropical, and temperate zones of India, and it exhibited the highest haplotype diversity28. Asia II 1 and Asia II 7 are the leading genetic groups occurring in Delhi27. Asia II 1 in particular is the most abundant and widely distributed genetic group in the cotton leaf curl disease (CLCuD)-prone cotton-growing states of north India29,30; in Punjab, the incidence of CLCuD is closely associated with Asia II 1, and disease incidence was much lower in areas where Asia II 1 is absent29. Asia II 1 has also been reported as a major genetic group in the major cotton-growing areas of Pakistan31.

To reveal the prevalence and dominance of the constantly evolving B. tabaci genetic group complex, one needs a better understanding of its phylogeographical patterns of genetic variation, species composition, haplotype diversity, migration histories, and demographic records. In the recent past, Asia II 1 has spread across the country and is now prevalent in all the agroclimatic zones of India. This paper aims to reveal the haplotype diversity, phylogenetic status, and genetic differentiation of the Asia II 1 genetic group of B. tabaci across Asia and India, as well as its genetic variation on different host plants in India.

Materials and methods

B. tabaci sample collection

Specimens of B. tabaci were collected from farmers’ fields in various locations, as listed in Suppl. File 1; Table S1. Permission was obtained from the farmers to collect samples on their land. Adult whiteflies were collected with a hand-held aspirator into 1.5 ml sample collection tubes (Abdos Life Sciences P10204) and stored in 70% ethanol at 20 °C until further analysis.

Extraction of DNA

Individual whiteflies were used for DNA extraction. The whiteflies stored in ethanol were washed twice with sterile water before extraction. Genomic DNA of each whitefly was extracted using the DNASure Mini Tissue Kit (Nucleopore, Genetix NP61305) as per the manufacturer’s protocol and stored at -20 °C until further use.

mtCOI PCR amplification and sequencing

A partial mtCOI gene fragment of about 820 bp was amplified using the universal primers C1-J-2195 (5′-TTGATTTTTTGGTCATCCAGAAGT-3′) and TL2-N-3014 (5′-TCCAATGCACTAATCTGCCATATTA-3′)32. PCR amplification was carried out in a 25 µl reaction mixture containing 12.5 µl of ready-to-use PCR master mix (Promega M750A), 5.5 µl of nuclease-free water, 1 µl each of forward and reverse primers, and 5 µl of insect DNA. Thermal cycling was performed on a 96-well thermal cycler (Applied Biosystems, Thermo Fisher Scientific) under the following conditions: initial denaturation for 10 min at 94 °C, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 48 °C for 30 s, and extension at 72 °C for 40 s, with a final extension step for 5 min at 72 °C. A negative control containing no DNA template was included in every run to detect potential contamination, especially in the PCR reagents. A 3 µl aliquot of the amplified PCR product was run on a 1.2% agarose gel in 1X TAE at 100 V (Jordan Scientific) for 50 min. The gel image was captured under UV light using a gel documentation unit (ProteinSimple, AlphaImager). The confirmed PCR products were sent to AgriGenome (Kochi, India) for purification and sequencing6,26.
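The reaction set-up above lends itself to a quick arithmetic check. The sketch below uses only the volumes and cycling times stated in the text; the idea of preparing a shared premix for several reactions is an assumption added for illustration, as are all function names, and the run-time estimate ignores ramping between temperatures.

```python
# Per-reaction volumes in microlitres, as listed in the text.
REACTION_UL = {
    "master_mix": 12.5,
    "nuclease_free_water": 5.5,
    "forward_primer": 1.0,
    "reverse_primer": 1.0,
    "template_dna": 5.0,
}


def total_volume(reaction=REACTION_UL):
    """Sum of all component volumes for one reaction."""
    return sum(reaction.values())


def bulk_mix(n_reactions, reaction=REACTION_UL):
    """Volumes for a shared premix (hypothetical workflow).

    Template DNA is left out because it differs for every tube.
    """
    return {name: volume * n_reactions
            for name, volume in reaction.items() if name != "template_dna"}


def cycling_minutes(cycles=35):
    """Programmed hold time: 10 min start, 30 s + 30 s + 40 s per cycle, 5 min end."""
    return 10 + cycles * (30 + 30 + 40) / 60 + 5


print(total_volume())               # component volumes add up to the 25 µl reaction
print(bulk_mix(10))                 # premix for ten reactions, template excluded
print(round(cycling_minutes(), 1))  # programmed minutes, ramp times not included
```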

Genetic group determination

The mtCOI sequences obtained from our own collections were compared with the global Bemisia genetic group datasets21,22. The sequences were manually inspected to remove putative pseudogenes and ambiguous sites; gap adjustment and trimming of overhangs were carried out using BioEdit v7.2. The aligned sequences were subjected to a basic local alignment search (BLASTn) to confirm species identity23. Newly characterised haplotype sequences from our study were submitted to NCBI, and accession numbers were obtained. The sequences were aligned with the Bemisia genetic group database using the ClustalW program with default parameters in MEGA X (https://www.megasoftware.net/dload_win_gui). A phylogenetic tree was constructed using the maximum likelihood approach, and the sequences that clustered with the Asia II 1 genetic group were selected for further study.

Preparation of datasets

The accessions grouped under Asia II 1 were selected and screened for sequence length, and retrieved Asia II 1 sequences shorter than 658 base pairs were discarded. A total of 676 Asia II 1 sequences from Asia, viz., India (n = 190), Pakistan (n = 396), China (n = 30), Nepal (n = 18), Bangladesh (n = 17), Thailand (n = 15), Taiwan (n = 3), and Vietnam (n = 7), were used in the present investigation and are available in Suppl. File 2. For India, 2906 B. tabaci mtCOI accessions were available in GenBank up to 1 September 2024, and all of them were analysed for their phylogeny. Of these, 626 Indian sequences were categorised as Asia II 1, but more than 50% of them were discarded because their fragment length was below 658 base pairs (the standard cut-off length for mtCOI sequences). The remaining 168 good-length Indian Asia II 1 sequences were supplemented with the 22 haplotype sequences generated in our own laboratory (Suppl. File 1; Table S1), and the pooled 190 Indian Asia II 1 sequences used in this investigation are available in Suppl. File 2. These 190 sequences were used to generate a phylogenetic tree along with ingroup reference sequences of Asia I (GQ281714.1, EU192044.1), Asia II 1 (HM137326, EU192047, GU585369, FJ802389, DQ174519, GU585372), Asia II 5 (AJ748376, AF418666), Asia II 7 (AY686064.1, DQ116660.1), and Asia II 8 (AJ748358.1, GQ281733.1), and with the outgroup species Bemisia afer (GQ139515.1), Bemisia atriplex (GU086362.1), Bemisia subdecipiens (GU220056.1), and Trialeurodes vaporariorum (AF418672.2).
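The length screen and the dataset totals described above can be sketched as follows. The 658 bp cut-off and the per-country counts come from the text; the record format (accession, sequence) and the toy accession names are hypothetical, and counting only non-gap characters is an assumption of this sketch.

```python
MIN_LENGTH_BP = 658  # sequences shorter than this were discarded


def filter_by_length(records, min_length=MIN_LENGTH_BP):
    """Keep (accession, sequence) pairs whose ungapped length meets the cut-off."""
    kept = []
    for accession, sequence in records:
        ungapped = sequence.replace("-", "")
        if len(ungapped) >= min_length:
            kept.append((accession, sequence))
    return kept


# Per-country sequence counts as reported in the text.
COUNTRY_COUNTS = {
    "India": 190, "Pakistan": 396, "China": 30, "Nepal": 18,
    "Bangladesh": 17, "Thailand": 15, "Taiwan": 3, "Vietnam": 7,
}

toy_records = [("toy_long", "A" * 658), ("toy_short", "A" * 657)]
print([accession for accession, _ in filter_by_length(toy_records)])
print(sum(COUNTRY_COUNTS.values()))  # should match the stated total of 676
```

The same check confirms the Indian subset: 168 retained GenBank sequences plus 22 newly generated sequences give the 190 sequences used.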

The Indian Asia II 1 data set was subdivided into subsample groups like Central India, East India, West India, North India, South India, and Others India (Asia II 1 species of India where collection region/zone has not been mentioned in GenBank) according to geographical region of the country (Supplementary File 3) for analysing comparative genetic variability and haplotype diversity.

Similarly, the Indian Asia II 1 data set from our study was again subdivided into subsample groups according to hostplant families from which B. tabaci is collected, such as Malvaceae, Fabaceae, Solanaceae, Asteraceae, Convolvulaceae, Cucurbitaceae, Euphorbiaceae, and Moraceae, and the accessions that have not been mentioned with hostplant sources in GenBank were denoted as NA (Supplementary File 3) for analysing comparative genetic variability and haplotype diversity.

Phylogenetic structure and haplotype analysis of B. tabaci Asia II 1 sequences

The phylogenetic structure and haplotype analysis of the Asia II 1 genetic group of B. tabaci, prevalent in Asia and India, was investigated. GenBank accessions corresponding to the Asia II 1 genetic group from Asian countries, including India, were retrieved from the NCBI database up until September 1, 2024. These sequences were analysed to confirm phylogenetic relationships using the maximum likelihood method in MEGA X. Genetic distances were calculated using both the P-distance and Kimura 2-parameter (K2P) models in MEGA X34.
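For readers unfamiliar with the Kimura 2-parameter model named above, the following minimal sketch computes the K2P distance for one pair of aligned sequences using the standard formula d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. It is not the MEGA X implementation; the function name is hypothetical, and skipping sites with gaps or ambiguous bases is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: slightly larger than the raw p-distance of 0.1.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

At the low divergences typical within a single genetic group, K2P and p-distance values are nearly identical; the correction matters more as divergence grows.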

Estimation of haplotype diversity

For the haplotype analysis of Asia II 1, the number of haplotypes (H), haplotype diversity (Hd), nucleotide diversity (pi), average number of nucleotide differences (k), total number of mutations (Eta), and G + C content were calculated using the software package DnaSP v6.10.01 (http://www.ub.edu/dnasp/)35. Neutrality tests, namely Tajima’s D, Fu and Li’s F, and Fu and Li’s D, were performed in DnaSP v6.10.01 to detect deviation from the neutral model of evolution. The minimum spanning network of haplotypes was constructed using the TCS 1.21 method as implemented in the PopART v1.7 package (https://popart.maths.otago.ac.nz/download/)36. To assess differences between and among population groups, analysis of molecular variance (AMOVA) was also carried out in PopART v1.7.
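The diversity indices listed above have simple closed forms, sketched here under the assumption of equal-length aligned sequences without gaps. Haplotype diversity follows Nei's estimator Hd = n/(n - 1) (1 - sum of squared haplotype frequencies), k is the mean number of pairwise differences, and pi is k divided by the alignment length. The function names are hypothetical, and DnaSP may treat gaps and missing data differently.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's haplotype diversity Hd for a list of aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are needed")
    frequencies = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1 - sum(f * f for f in frequencies))


def mean_pairwise_differences(sequences):
    """Average number of nucleotide differences (k) over all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(sequences):
    """Nucleotide diversity (pi): k expressed per site."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


toy = ["AAAA", "AAAT", "AATA", "ATAA"]  # four distinct toy haplotypes
print(haplotype_diversity(toy), mean_pairwise_differences(toy), nucleotide_diversity(toy))
```

A sample made of identical sequences gives zero for both Hd and pi, which is the pattern reported for the three Taiwanese sequences in the Results.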

Results

Genetic structure and haplotype analysis of B. tabaci genetic group Asia II 1 from Asia

A total of 676 curated mtCOI sequences from the B. tabaci Asia II 1 genetic group, originating from Asia, were used for haplotype analysis. This analysis generated 241 distinct haplotypes, of which 195 were singletons. Among the singletons, 117 were found in Pakistan and 48 in India. Other singletons were distributed across China (10), Bangladesh (7), Thailand (6), Nepal (5), and Vietnam (2) (see Supplementary File 3). In addition, 46 haplotypes were shared by at least two sequences. Haplotype H1 was the most dominant, with a frequency of 116 sequences. The second most prevalent was Haplotype H17, shared by 103 sequences, followed by Haplotype H4 with 52 sequences and Haplotype H2 with 45 sequences.
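The haplotype tallies above can be reproduced in principle by collapsing identical sequences and counting how many variants occur once. The sketch below shows that bookkeeping on toy data and checks that the reported figures are internally consistent; the function name and toy sequences are hypothetical, and the per-country singleton counts are copied from the text.

```python
from collections import Counter


def summarise_haplotypes(sequences):
    """Collapse identical aligned sequences and tally singleton and shared haplotypes."""
    counts = Counter(sequences)
    singletons = sum(1 for count in counts.values() if count == 1)
    return {
        "haplotypes": len(counts),
        "singletons": singletons,
        "shared": len(counts) - singletons,
        "largest_haplotype": max(counts.values()),
    }


# Singleton haplotypes per country, as reported in the text.
REPORTED_SINGLETONS = {
    "Pakistan": 117, "India": 48, "China": 10, "Bangladesh": 7,
    "Thailand": 6, "Nepal": 5, "Vietnam": 2,
}

print(summarise_haplotypes(["AAA", "AAA", "AAT", "ATA"]))
print(sum(REPORTED_SINGLETONS.values()))  # reported total of singletons: 195
print(195 + 46)                           # singletons plus shared haplotypes: 241
```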

Haplotype H1 formed a network connecting 98 sequences from Pakistan, 14 from India, 3 from Bangladesh, and 1 from Nepal. Haplotype H17, with 103 sequences, formed a network comprising 45 sequences from India, 29 from Pakistan, 11 from China, 9 from Thailand, 3 each from Bangladesh and Taiwan, 2 from Vietnam, and 1 from Nepal. The minimum spanning network identified Haplotype H1 as the most widely distributed and ancestral haplotype cluster for the Asia II 1 population of B. tabaci in Asia, followed by Haplotypes H17, H4, and H2 (Figs. 1 and 2A). Based on the numbers of unique and shared haplotypes in Pakistan and India, the data suggest that Asia II 1 is most prevalent in these two countries, with frequent occurrences of genetic group and haplotype outbreaks.

Fig. 1

Geographical distribution of B. tabaci Asia II 1 species across the world and India. The presence of haplotype diversity among these populations across different geographical locations is indicated in multiple colors (The map was generated using online software tool MapChart [https://www.mapchart.net/]).

Fig. 2

(A) Minimum spanning network of mtCOI sequences of B. tabaci Asia II 1 haplotypes from Asian countries, constructed using PopART (Population Analysis with Reticulate Trees) software. (B) Minimum spanning network of mtCOI sequences of B. tabaci Asia II 1 haplotypes from India, constructed using PopART. (C) Minimum spanning network of mtCOI sequences of B. tabaci Asia II 1 haplotypes from different hostplant families, constructed using PopART.

Among the Asian countries, the haplotype diversity of genetic group Asia II 1 was highest in Bangladesh (Hd: 0.949), followed by India (Hd: 0.926), Pakistan (Hd: 0.915), Vietnam (Hd: 0.905), Nepal (Hd: 0.856), China (Hd: 0.848), and Thailand (Hd: 0.657). Nucleotide diversity was highest for the Asia II 1 sequences from Pakistan (pi: 0.03270), followed by India (pi: 0.00763), Bangladesh (pi: 0.00694), Nepal (pi: 0.00585), China (pi: 0.00533), Thailand (pi: 0.00394), and Vietnam (pi: 0.00101) (Table 1). The sequences from Taiwan showed zero values for both nucleotide diversity and haplotype diversity, indicating that there is no haplotype or nucleotide variation among them and that they are 100% identical. The number of haplotypes (H), average number of nucleotide differences (k), and total number of mutations (Eta) are presented in Table 1.

Table 1 Haplotype diversity (Hd), Nucleotide diversity (pi) and Neutrality tests of Asia II 1 from Asian countries.

The AMOVA revealed that, of the total genetic variation in the mtCOI gene of the Asia II 1 populations of B. tabaci in Asia, 1.93% occurred among populations and 98.06% within populations, with an FST value of 0.01937 (Table 2). The neutrality tests (Tajima’s D, Fu and Li’s F, and Fu and Li’s D) gave negative values, indicating probable recent population expansion in all of the analysed Asian countries (Table 1). A negative Tajima’s D implies that the population of interest, i.e., the Asia II 1 genetic group in Asia, has probably experienced a recent selective sweep or a population expansion following a bottleneck. Negative values of Fu and Li’s F and Fu and Li’s D, on the other hand, indicate an excess of rare alleles, most probably driven by recent population growth.
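The link between the AMOVA percentages and the reported FST can be checked with a short sketch. It assumes the usual two-level AMOVA definition, in which the fixation index is the among-population share of the total variance; the function name is hypothetical, and small differences from the reported value are expected because the percentages are rounded.

```python
def fst_from_percentages(among_pct: float, within_pct: float) -> float:
    """Fixation index as the among-population fraction of total variance."""
    return among_pct / (among_pct + within_pct)


# Percentages reported for the Asian data set (Table 2).
fst_asia = fst_from_percentages(1.93, 98.06)
print(round(fst_asia, 4))  # close to the reported FST of 0.01937
```

A value this close to zero means that almost all mtCOI variation lies within countries rather than between them.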

Table 2 Analysis of molecular variance (AMOVA) for the mtCOI sequences of Asia II 1 from Asia, India and different host families from India.

Genetic structure and haplotype analysis of B. tabaci Asia II 1 from India

A total of 22 mtCOI gene sequences obtained from our own collection (Suppl. File 1; Table S1) were combined with 168 mtCOI sequences from GenBank to give a final alignment of 190 sequences, each 658 bp in length. The phylogenetic tree for all the mtCOI sequences of B. tabaci genetic group Asia II 1 was constructed with outgroups and other cryptic species included. In the topology of the phylogenetic tree, sequences within the Asia II 1 genetic group showed a high percentage of sequence similarity and branched in a single clade, whereas outgroups and other genetic groups branched separately (Fig. 3). Colours in the phylogenetic tree indicate populations from different geographical regions: red, North India; brown, West India; dark green, Central India; light green, other parts of India (collection region/zone information not available); sky blue, South India; dark blue, Asia II 1 global; amber, East India; light blue, genetic groups other than Asia II 1; and purple, outgroups.

Fig. 3

Phylogenetic tree of mtCOI gene haplotypes of the Asia II 1 genetic group of B. tabaci from India, including some outgroups and other genetic groups of B. tabaci, constructed using the maximum likelihood approach in MEGA X. Colours in the phylogenetic tree indicate populations from different geographical regions: red, North India; brown, West India; dark green, Central India; light green, Others India (Asia II 1 sequences from India for which the collection region/zone is not mentioned); sky blue, South India; dark blue, Asia II 1 global; amber, East India; light blue, genetic groups other than Asia II 1; and purple, outgroups.

A colour-coded pairwise identity matrix of similarity scores for haplotypes of the Asia II 1 genetic group of B. tabaci from India, together with some outgroups and other genetic groups of B. tabaci, was generated using the Sequence Demarcation Tool (SDTv1.2 [http://web.cbio.uct.ac.za/SDT]) and is presented in Fig. 4. Each coloured cell represents the percentage identity score between two sequences (one indicated horizontally to the left and the other vertically at the bottom). A coloured key indicates the correspondence between pairwise identities and the colours displayed in the matrix. The majority of the Asia II 1 sequences shared between 90 and 100% pairwise identity, and sequence pairs sharing lower identity were clearly distinguishable in the coloured matrix.

Fig. 4

Colour-coded pairwise identity matrix of similarity scores for haplotypes of the Asia II 1 genetic group of B. tabaci from India, together with some outgroups and other genetic groups of B. tabaci, generated using the Sequence Demarcation Tool version 1.2 (SDTv1.2).

A total of 77 haplotypes (sequence variants) were identified in the Indian Asia II 1 dataset. There were 23 haplotypes shared by 2 to 44 sequences, and the remaining 33 haplotypes were unique. Haplotype H5 was dominant, with a frequency of 44 sequences. Supplementary File 3 gives the collection localities and other details for Haplotype H5. The two other dominant haplotypes, Haplotype H8 and Haplotype H1, were represented by 20 and 14 sequences, respectively. Of the 22 sequences obtained from our own collection trips, 13 grouped with Haplotype H5, two with H25, and one with H8, whereas one sequence from Banswara, Punjab (MN830429.1) was unique and formed H36 (Fig. 2B).

Circles in the TCS haplotype network for the mtCOI haplotypes of B. tabaci depict the haplotypes identified. The size of each circle is proportional to the frequency of the haplotype. The lines between haplotypes represent mutations, with each single line counted as a single mutation. Outgroup sequences and other cryptic groups were not included in the minimum spanning network because their sequence divergence exceeded 3.5%. The minimum spanning network shows that Haplotype H5 is the most common as well as the most ancestral haplotype cluster for the Asia II 1 population of B. tabaci in India, followed by H8 and H1 (Suppl. File 1; Fig. S2). The dominant cluster H5 includes the majority of the sequences from North India, Central India, and Others (location not specified in NCBI), where cotton covers the major area under cultivation. The highest haplotype diversity was observed in North India (Hd: 0.997), followed by South India (Hd: 0.977), Central India (Hd: 0.927), West India (Hd: 0.889), and East India (Hd: 0.700), whereas nucleotide diversity was highest in South India (pi: 0.0099), followed by West India (pi: 0.00851), Central India (pi: 0.00797), North India (pi: 0.00748), and East India (pi: 0.00121) (Table 3; Fig. 1).

Table 3 Haplotype diversity (Hd), Nucleotide diversity (pi) and Neutrality tests of Asia II 1 India.

The AMOVA revealed that, of the total genetic variation in the mtCOI gene of the Asia II 1 populations of B. tabaci from different regions of India, 12.02% occurred among populations and 87.97% within populations, with an FST value of 0.12028 (Table 2). Tajima’s D was significant and positive for the Central India population and non-significant and negative for East, North, South, West, and Others (Table 3). Tajima’s D is a measure of deviation from neutral evolution; a value less than −2 or greater than +2 is usually seen as a strong indicator that a gene is not evolving neutrally. A selective sweep or positive selection is indicated by an abundance of rare alleles in genes with a Tajima’s D value less than −2, whereas an excess of common alleles, indicative of balancing selection, prevails in genes with a Tajima’s D value greater than +2. Fu and Li’s D and Fu and Li’s F tests gave non-significant negative values for the populations of all collection regions (Table 3). A substantially negative Fu and Li’s value indicates an excess of rare or singleton mutations relative to the neutral expectation, which might reflect population expansion or positive selection at linked sites.
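To make the interpretation of Tajima's D above concrete, the following minimal sketch computes the statistic from aligned sequences using Tajima's (1989) standard formula, which contrasts the mean number of pairwise differences (k) with the number of segregating sites (S) scaled by a1. It is not the DnaSP implementation (which can also use Eta, the total number of mutations); the function name, the toy sequences, and the choice to require at least four sequences are assumptions of this sketch.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences without gaps."""
    n = len(sequences)
    if n < 4:
        raise ValueError("this sketch requires at least four sequences")
    length = len(sequences[0])
    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for i in range(length)
                    if len({seq[i] for seq in sequences}) > 1)
    if seg_sites == 0:
        return math.nan  # D is undefined without polymorphism
    # k: mean number of pairwise nucleotide differences.
    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / math.sqrt(variance)


rare_variants = ["AAAA", "AAAT", "AATA", "ATAA"]   # every variant is a singleton
balanced_variants = ["AA", "AA", "TT", "TT"]       # variants at intermediate frequency
print(tajimas_d(rare_variants) < 0, tajimas_d(balanced_variants) > 0)
```

The two toy samples show the sign convention used in the text: an excess of rare variants pulls D below zero, whereas variants at intermediate frequency push it above zero.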

Genetic structure and haplotype analysis of B. tabaci Asia II 1 on different hostplant families

When the topology of the phylogenetic tree was explored, a higher percentage of sequence similarities were observed within the Asia II 1 genetic group from different hostplant families and were branched in a single clade, whereas outgroups branched separately. The distinct colours in the phylogenetic tree show distinct hostplant families from which B. tabaci is collected for haplotype analysis from India (Suppl. File 1; Fig. S2).

When the Asia II 1 sequences from India were grouped by respective host plant families, a total of 189 sequences were analyzed, yielding 79 haplotypes (Supplementary Information 3). Haplotype H10 was the most predominant, clustering with 44 sequences, followed by H19 and H34 with 19 and 11 sequences, respectively. Nineteen haplotypes were shared by at least two and up to 44 sequence variants, while the remaining 60 were unique singletons. A minimum spanning network of the host plant families revealed that Haplotype H10 was the most common and widespread cluster within the Asia II 1 genetic group in India, with 17 sequences from the Malvaceae family. Haplotype H34 followed, with 9 Malvaceae sequences, and Haplotype H19, with 6 Malvaceae sequences. Additionally, there were 24 singleton Malvaceae haplotypes (Fig. 2C). This network demonstrated that the Malvaceae family induces greater haplotype variation among the host plant families of B. tabaci, followed by the Fabaceae family. The higher number of haplotypes associated with the Malvaceae family supports the notion that host plants have a significant influence on haplotype and genetic group development.

Although the highest number of haplotypes were observed in Malvaceae, the haplotype diversity was observed to be the highest in Cucurbitaceae (Hd: 0.964), followed by Fabaceae (Hd: 0.919), Malvaceae (Hd: 0.915), and Solanaceae (Hd: 0.881). Nucleotide diversity of Cucurbitaceae is highest (pi: 0.00939), followed by Fabaceae (pi: 0.00812), Malvaceae (pi: 0.00736), and Solanaceae (pi: 0.00525). The number of haplotypes (H), average number of nucleotide differences (k), total number of mutations (Eta), and G + C content are represented in Table 4. The AMOVA for the total genetic variation in the mtCOI gene of the Asia II 1 genetic group of B. tabaci populations of India from various hostplant families shows a total of 13.52% variation among the populations, whereas the genetic variation within the populations observed was 86.47% with the FST value of 0.1352 (Table 2). The values of neutrality tests, viz., Tajima’s D, Fu and Li’s F, and Fu and Li’s D, were represented in Table 4. All the neutrality tests showed negative values, indicating plausible recent population expansion in all of the analysed groups.

Table 4 Haplotype diversity (Hd), Nucleotide diversity (pi) and Neutrality tests of Asia II 1 from different hostplant families of B. tabaci.

Discussion

The genetic diversity and population structure of insect pests inhabiting farmscapes can be influenced by various factors, particularly host plants and local agricultural practices37,38,39. The ecology of B. tabaci relies heavily on dispersal, which not only accounts for host finding and colonisation in constantly changing land cover but also aids the spread of traits such as insecticide resistance among populations40,41. The identification of new species within B. tabaci was facilitated by sequencing a 657-bp fragment of the mtCOI gene, leading to the introduction of the term "cryptic species complex"16,17. Dinsdale et al.16 proposed a 3.5% genetic boundary for distinguishing species within the B. tabaci complex. However, Lee et al.42 suggested that a 4.0% genetic boundary was more appropriate than 3.5%. Reclassification based on this 4% genetic divergence revealed the presence of 42 distinct genetic groups. Further global analysis of B. tabaci mtCOI sequences by Kanakala and Ghanim22 reported 44 genetic groups worldwide. Subsequently, Rehman et al.21 updated the number of genetic groups to 46 based on additional analysis.

The current list of genetic groups includes Africa, Asia I, Asia I-India, Asia II 1 to 13, Spain 1, Asia III, Asia IV, Asia V, Australia, Australia/Indonesia, China 1 to 5, Indian Ocean, Ru, Middle East Asia Minor I and II (MEAM), Mediterranean (MED), MEAM K, New World 1 and 2, Japan 1 and 2, Uganda, Italy 1 and Sub-Saharan Africa 1 to 515,17,22,43,44,45. Thirteen genetic groups have been reported from India, viz., Asia I, Asia I-India, Asia II 1, Asia II 5, Asia II 6, Asia II 7, Asia II 8, Asia II 11, Asia II 13, Middle East Asia Minor (MEAM)-1, MEAM-K, China 3 and China 721,23.

Among the genetic groups widely distributed across the country, Asia II 1 is predominant, with a nationwide distribution and the highest haplotype diversity. This genetic group is closely associated with cotton leaf curl viruses in specific locations. A key concern with the Asia II 1 group is its potential to expand its distribution and replace previously established genetic groups in cotton and other agro-ecosystems23,29. This pattern aligns with the findings of Mahmood et al.46, who reported that Asia II 1 is dominant in Pakistan and the neighbouring northern zone of India. Interestingly, Asia II 1 is also dominant in the central zone of India, further highlighting its significant presence in these regions.

The expansion of the Asia II 1 population on cotton in the southern and central regions of India may have occurred through the displacement of the Asia I genetic group, as seen in Pakistan47. This situation requires close monitoring to prevent a repeat of what occurred with tomato crops, where the Asia I and Asia II 7 genetic groups were displaced by the more dangerous MEAM-1 genetic group48. Consequently, it is crucial to monitor and assess specific genetic groups of whiteflies for key factors such as fecundity, survival potential, insecticide resistance, and susceptibility to biocontrol agents. This would help predict population growth and the likelihood of genetic group displacement28,29,47,48,49,50,51. Despite the significance of these developments, research on Asian genetic groups, particularly B. tabaci Asia II 1, remains limited. This gap underscores the need for further studies to better understand these populations and their potential impact.

The present study reveals significant differences in genetic diversity among B. tabaci Asia II 1 sequences from Asian countries, with Pakistan and India showing higher haplotype and nucleotide diversity than most others. AMOVA and neutrality tests suggest recent selective sweeps or population expansion within the Asia II 1 group. A recent study identified 31 native and invasive B. tabaci species across 16 Asian countries, with Asia II 1 reported in 10, including India, Pakistan, Bangladesh, Nepal, China, and Japan22,52. Asia II 1 has progressively replaced Asia I in Pakistan’s Punjab and Sindh provinces and now dominates in northern and central India as well46. Additionally, Asia II 1 has been found in Southeast Asian nations such as Vietnam, Thailand, and Cambodia but is absent from Malaysia46. That study also notes that Asia I has disappeared from Pakistan since 2012, highlighting the dominance of Asia II 146.

An earlier study from India observed that Asia II 1 was predominantly distributed in northern India53. However, the current investigation revealed the dominance of Asia II 1 in the northern and central zones of India and its expansion to southern India as well. The dominant haplotype, H5, also comprises the majority of the sequences from North and Central India. Haplotype diversity was highest in North India, whereas nucleotide diversity was highest in South India. Asia I, Asia II 5, and Asia II 8 occurred predominantly in southern India, whereas Asia II 5, Asia I, and Asia II 1 were more prevalent in north-eastern India21. In a study of 73 populations collected from different parts of north-eastern India, 27 belonged to Asia II 5, 26 to Asia I, 15 to Asia II 1, and 4 to Asia II 721. Asia II 1 and Asia I species of B. tabaci have rapidly acquired resistance to organophosphates (chlorpyrifos, etc.) and pyrethroids (deltamethrin, etc.) as a result of the extensive application of insecticides54. This capacity for insecticide resistance might have been the driving force for the expansion of Asia II 1 in India54.

Recent findings from a larger survey in India indicate that Asia II 1 predominates in the northern region of the country, mirroring the situation in Pakistan. In other South Asian countries such as Bangladesh and Nepal, the situation resembles that of India, with Asia I and Asia II 5 as the dominant species, followed by Asia II 1. Asia II 1 thus appears to be becoming dominant in other regions of the world as well46,55,56. The present study highlights significant genetic diversity within the B. tabaci Asia II 1 genetic group across various Asian countries, with Pakistan and India emerging as key regions for its spread and prevalence. Haplotype H1, dominant in Pakistan, and H17, prevalent in India, underscore the central role of these two nations in shaping the population dynamics of Asia II 1. Other countries, such as Nepal and Bangladesh, also exhibited considerable genetic variation, while Taiwan showed no diversity, indicating complete sequence similarity. The majority of genetic variation was found within populations rather than between them, suggesting extensive gene flow across the region. Additionally, the negative neutrality test values across all countries point to recent population expansion or selective sweeps in the Asia II 1 group (Suppl. File 1; Fig. S2). These findings underscore the dominance of Asia II 1 in Pakistan and India and highlight the need for continued monitoring to understand its evolutionary trajectory and impact on agriculture throughout Asia. The higher haplotype diversity within the Asia II 1 genetic group might be due to several agro-ecological factors, including the predominant use of different insecticides in different locations and the different resistance mechanisms induced in these populations46,57,58,59.
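
To show why negative neutrality test values are read as a signal of expansion or a sweep, the following is a minimal sketch that assumes Tajima's D as the neutrality statistic (this section does not name the test that was used). D compares the mean number of pairwise differences with the number of segregating sites S scaled by a1 = Σ(1/i); an excess of rare variants makes D negative. The function name and the toy sequences are hypothetical, and the sketch returns NaN when there is no variation, since D is undefined in that case.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (NaN if undefined)."""
    n = len(seqs)
    if n < 4:
        return math.nan  # variance term is zero or undefined for tiny samples
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one character state.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return math.nan  # no polymorphism, D is undefined

    # k: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    mean_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return math.nan
    return (mean_diffs - seg_sites / a1) / math.sqrt(variance)


# Invented toy alignments (NOT the study's data).
# Star-like sample: one common sequence plus singleton variants.
star_like = ["AAAAAA", "CAAAAA", "ACAAAA", "AACAAA", "AAACAA"]
# Two divergent haplotypes at equal frequency.
balanced = ["AAAA", "AAAA", "CCCC", "CCCC"]
# No variation at all, as reported above for one country.
identical = ["ACGT", "ACGT", "ACGT", "ACGT"]

print(tajimas_d(star_like), tajimas_d(balanced), tajimas_d(identical))
```

The star-like sample, dominated by singletons, gives a negative D, whereas the sample with two intermediate-frequency haplotypes gives a positive D. A negative value on its own does not distinguish demographic expansion from a selective sweep, which is why the text keeps both interpretations.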

The present investigation also revealed that the Malvaceae family accounts for the most haplotype variation among host plant families, followed by the Fabaceae family, even though nucleotide and haplotype diversity were high in Cucurbitaceae. This study also outlines the influence of geographic location on the dominance of haplotypes. Asia II 1 is an indigenous species that can be found throughout Southeast Asia and the Indian subcontinent60. According to a comparative study on host suitability, Asia II 1 and MEAM1 performed similarly well on all hosts, while Asia II 1 thrived best on cotton plants and had a longer lifespan and higher fecundity on tomato plants61. Sequencing of the nuclear genome of Asia II 1 revealed about 1294 genes with high-impact variants. Functional analysis showed that some of these genes are involved in virus transmission and pesticide resistance: 4 genes in the transmission of the tomato yellow leaf curl virus (TYLCV), 96 genes in the transmission of the tomato chlorosis virus (ToCV), a crinivirus, and 14 genes in pesticide resistance60. One particular study correlated the incidence of B. tabaci Asia II 1 with the incidence of CLCuD29. A direct assessment of the cotton leaf curl Multan virus (CLCuMuV) transmission efficiency in four cryptic species, viz., two native (Asia I and Asia II 1) and two invasive (MEAM1 and MED) species, reported that Asia II 1 transmits CLCuMuV the most efficiently61,62.

Strengths and limitations

The study’s strengths lie in its comprehensive genetic analysis of B. tabaci Asia II 1, providing valuable insights into its haplotype diversity, geographic influence, and host plant interactions, which are crucial for pest management. However, limitations include incomplete regional data, particularly for Southern and Central India, insufficient exploration of host plant interactions, and a narrow focus on the Asia II 1 genetic group without considering its interactions with other genetic groups. Additionally, the study lacks a quantitative assessment of the environmental factors affecting pest dominance.

Conclusion

The Asia II 1 genetic group of B. tabaci is increasingly dominant across Asia owing to its extensive presence, damage potential, and ability to outcompete other species. This rapid expansion, particularly in countries like India and Pakistan, can be linked to dynamic changes in haplotype and nucleotide diversity, high gene flow, adaptability to various host plants, swift development of insecticide resistance, and competitive advantages. Its nuclear genome, enriched with genes for virus transmission and pesticide resistance, together with virus-induced behavioral changes and co-evolution within specific host-virus-vector interactions, further contributes to its spread. Environmental factors such as temperature and humidity, along with local agricultural practices and mono-cropping of cotton, also play a role in its dominance, posing a significant threat to existing genetic groups in agro-ecosystems.

The study identified 241 haplotypes from 676 mtCOI sequences of Asia II 1 from Asia, with haplotype H1 (116 sequences) being the most prevalent. In India, 77 haplotypes were found from 190 sequences, with haplotype H5 (44 sequences) dominant. From 189 host plant sequences in India, 79 haplotypes were identified, with haplotype H10 (44 accessions) predominant. These haplotypes, especially H1, H5, and H10, provide new insights into the genetic diversity of Asia II 1, suggesting multiple genetic lineages and evolutionary paths that are crucial for pest management and for understanding its evolution.