Abstract
Biological control is considered the backbone of any integrated pest management programme. Exploring native ranges to identify new biological control agents is technically challenging and time-consuming; however, incorporating biogeographical insights into the selection process enhances the likelihood of success in biological control efforts. The zigzag beetle, Cheilomenes sexmaculata (F.), the most important and abundant of all ladybird beetles, feeds on a diverse range of prey. To study the population structure of C. sexmaculata, individuals were collected from five different zones, consisting of 25 subpopulations (subsets from their respective main zones) that uniformly represented India. We identified 109,585 contigs, 304,784 SNPs, and 46,583 INDELs, among which 5 INDELs and 9 SSRs were polymorphic. We observed high genetic diversity parameters for populations collected from the West Zone in all SNPs, SSRs, and INDELs markers. However, gene flow (Nm) was restricted, and FST was high in the South Zone and East Population (FST=0.45) populations, while the populations from the Central, North and West zones recorded zero FST. The structure and dendrogram construction analysis revealed individuals from Coorg and Mudigere of the South Zone and Barapani of the East Zone populations formed separate clusters. Mantel correlation revealed genetic distances decreased with the increasing geographical distance SNPs (Rxy= − 0.184 and P = 0.07); INDELs (Rxy= − 0.204 and P = 0.05) and SSRs (Rxy= − 0.339 and P = 0.06) indicated moderate genetic differentiation in the South and East zones populations due to geographic barriers, genetic drift and local environmental conditions. Populations in the North, West, and Central zones showed high gene flow, likely due to modern domestication and human activities. Additionally, the findings indicated that the populations from the West, North, and Central zones are better strains due to their high genetic fitness, so these populations can be used for classical and augmentative biological control.
Similar content being viewed by others
Introduction
The distribution of a predator is a key factor in shaping the structure and dynamics of an ecosystem. It determines where and how strongly prey populations are affected, influencing the local abundance and diversity of prey species. Most importantly, it plays a vital role in maintaining the overall balance of the food web through a process called top-down control1,2. Exploring native ranges to identify new biological control agents is technically challenging and time-consuming, yet it forms the foundation of a successful biological research program1. Agents exhibiting high host specificity and damage efficacy during field surveys are prioritized. However, logistical and technical constraints often limit the collection of potentially beneficial natural enemies to a small number of source populations. Selecting the appropriate natural enemy population for host range testing and eventual release is crucial, as insect species with broad geographical distributions are often genetically differentiated2,3. Understanding these biogeographical population differences can help predict variations in an agent’s climatic adaptations or host-related traits4, thereby guiding the selection of a more effective and safe biological control agent.
Coccinellid beetles are an important group of predators in agri-horticultural ecosystems, holding significant economic importance due to their practical use in augmentative and classical biological control programs5. Coccinellidae includes approximately 6000 described species across 360 genera and 42 tribes6. The zigzag beetle, Cheilomenes sexmaculata (F.), (Coleoptera: Coccinellidae) commonly known as the six-spotted ladybird beetle is the most important, abundant, and common ladybird beetle native to the oriental region, with a wide distribution in India, Southwestern Asia, Indonesia, the Philippines, Australia, South Africa, and Pakistan7. The C. sexmaculata is a generalist predator known to prey on 70 different species of aphids, as well as various species of mealybugs, scale insects, thrips, and whiteflies. Both its adult and larval stages contribute significantly to suppressing pest populations through active predation. Because of its commercial importance in augmentative biocontrol programme, the biological parameters were widely studied and well known. However, the genetic variability and population variations were poorly studied in India. The population genetics, a subfield of genetics addresses genetic differences within and between populations. It can assess population structure, estimate gene flow, genetic diversity, and genetic variance, and analyze dispersal patterns and migration behavior8,9. It also determines whether a population is in Hardy-Weinberg equilibrium or not10. A powerful application of population genetics in biocontrol is monitoring of agents released into the field, allowing for the monitoring of their genetic variation, field performance and ecological risk.
Phylogeography studies the historical processes that shape the contemporary geographic distributions of individuals11. Coccinellids have served as genetic models since the early 20th century for investigating genetic bottlenecks in biodiversity12. The genome size (C- value) of coccinellids ranges from 0.19 to 1.71 pg, with an average of 0.53 pg13. Despite recent advances in population genetics, phylogeographic studies of coccinellid beetles remain in infancy. In India, there has yet to be any research on the distribution patterns, migratory behavior, genetic diversity, genetic variance, and phylogeography of coccinellids. There is little direct evidence to test the phylogeography of this species at the ecological or molecular level. Among DNA markers, Single Nucleotide Polymorphisms (SNPs) and Simple Sequence Repeat (SSR) markers are particularly valuable for diversity analysis due to their co-dominant nature, high level of polymorphism, and suitability for high-throughput analysis14. In this study, we attempted to analyze the genetic structure and migration patterns of different populations. Further molecular investigations were needed to illuminate the evolutionary pathways of coccinellid beetles for practical use in bio-intensive management programs. With these objectives in mind, we intended to develop specific INDEL and SSR markers for C. sexmaculata and assessed the population structure and genetic diversity using SSR markers.
Results
Sequencing and development of INDELs and SSRs markers to distinguish individuals from different subpopulations
In this study, we sequenced 25 individuals from 5 zones, including 25 subpopulations (a specific individuals of C. sexmaculata collected from different locality or zones), resulting in 109,585 contigs extracted from 1.28 GB of whole genome data. We identified 304,784 SNPs and 46,583 INDELs using a Minor Allele Frequency (MAF) threshold of 0.05 and Polymorphic Information Content (PIC), then reduced the marker load to 30,000 SNPs. Using DINDEL software version 2.0, we selected the best 3453 informative INDELs, and with MISA software version 2.1, we identified 933 informative SSRs from the contigs. Primer 3 web-based program was used to design 8 INDELs and 13 SSRs markers. Of these, 7 INDELs and 10 SSRs were successfully amplified for validation on agarose gel, with 5 INDELs and 9 SSRs found to be polymorphic. Scoring was based on the presence or absence of bands, which was further used for diversity and structure analysis (Figs. 1 and 2) (Tables 1 and 2) (Supplementary Fig. 1).
Genetic diversity parameters using different markers
The number of alleles (Na) varied from 1.5 to 1.95; the highest was recorded for the North zone population (1.95), and the lowest was recorded for the South (1.5) and East (1.5) zone populations. Similarly, several effective alleles (Ne), which mainly contributed to heterozygosity, ranged from 1.4 to 1.8, and Shannon’s Information Index (I) ranged from 0.40 to 0.63; the maximum was recorded for the West zone population and the minimum for South and East zone populations. Genetic variability is also measured as the amount of actual or potential heterozygosity. Observed heterozygosity (Ho) was higher than the expected heterozygosity (He) at North (0.27 > 0.22), West (0.34 > 0.32) and Central zones (0.27 > 0.25) populations whereas for South (0.26 > 0.42) and East zone (0.28 > 0.41) populations observed heterozygosity (Ho) was lower than the expected heterozygosity (He) using SNPs (Table 3).
The genetic diversity of INDELs and SSRs was found to be high. Out of seven, five markers and 9 out of 10 markers were polymorphic, respectively. A total of 20 alleles were detected across the 5 loci in both markers (Table 3). In INDELs Marker CsSTS77210, the highest number of observed alleles per locus (3.8) was found, while CsSTS73540 and CsSTS72451 showed the lowest (1.6) (Table 4). The Na amplified per SSR locus with an average of 1.98; the Ne with an average of 1.67; The PIC value ranged from 0 (CsM29435) to 0.48 (CsM37073) (Table 3). In INDELs, an average value of 0.20% (Table 4) was observed, while SSRs showed high information about codominant in nature so that easily distinguish between homozygous and heterozygous (PIC > 0.30%). This once again indicated higher genetic diversity in these populations (Table 3).
Departure/variations from hardy–Weinberg equilibrium (HWE) in INDELs and SSRs
In the Central, West, and Central zones, two of the 25 subpopulations loci combinations showed no significant departures from Hardy-Weinberg equilibrium (HWE). In the East Zone population, observed heterozygosity was less than expected for only 3 of 7 loci, which significantly deviated from HWE, while the remaining 4 loci were monomorphic. In the South Zone population, a further reduction in heterozygosity (Ho < He; 0.4 < 0.2) was observed, with only CsSTS49665 marginally deviating from HWE and all other loci being monomorphic (Table 5). In the SSR analysis, for the Central, North, and West zone populations, 9 out of the 10 analyzed microsatellite loci showed a significant deviation from HWE, except for the monomorphic CsM29435 marker. In the East zone, five loci met the criteria for deviation from HWE, with four markers being highly significant (P < 0.001) and CsM43755 being slightly significant (P < 0.05), while the remaining five markers were monomorphic (Table 5). In the South zone populations, 5 out of 10 loci showed levels of departure from HWE, with CsM48497 being moderately significant (P < 0.01) and four markers (CsM12479, CsM92789, CsM30177, and CsM37073) being least significant (P < 0.005). The remaining five markers (CsM43755, CsM29435, CsM45955, CsM81952, and CsM38056) were monomorphic (Table 5).
Genetic differentiation using F-statistics, analysis of molecular variance (AMOVA) and population structure analysis for different populations using different markers
In SNPs INDELs and SSRs, gene flow (Nm) ranged from 0.963 (West Zone) to 0.304 (East Zone). In North, West and Central populations had negative FIS, showed excess outbreeding with high gene flow (Nm). While, the remaining East and South zone populations had positive FIS, leading to inbreeding among individuals within populations and corresponding to low gene flow (Nm) (Table 6). Similar results were obtained in INDELs and SSRs (Table 6). AMOVA indicated significant genetic differentiation at all population levels. For SNPs, the variation between populations was low at 17%, while the variation within populations was high at 83%. Similarly, for INDELs and SSRs markers, 26% and 38% of the variation was due to differences between populations, respectively, while 74% and 62% were due to differences among individuals within populations (Table 7). We previously suspected that the South population influenced genetic differentiation for INDELs and SSRs based on the number of alleles (Na), effective number of alleles (Ne), observed heterozygosity (Ho), and the significance level of deviation from Hardy-Weinberg equilibrium (HWE). To address this, we excluded the South population from our analysis and reanalyzed the data. Interestingly, the within-population variation reduced to 60% and 47%, with 14% and 15% of the variation primarily contributed by the South Zone population alone (Table 7).
We calculated the most appropriate K value for 100 individuals from 25 subpopulations and obtained a peak ΔK = 3, forming three clusters. The first cluster included two populations (Coorg and Mudigere) from the South Zone and one population (Barapani) from the East Zone. The second cluster comprised a single population from the Central Zone. The third and largest group consisted of four populations from the North Zone, three from the South Zone, and four from the Central Zone (Fig. 3a). The clustering pattern was quite different for INDELs at ΔK = 3 (Fig. 3b) and SSRs at ΔK = 4 (Fig. 3c) since in SSRs more prominent grouping was observed for south and east zones subpopulations at ΔK = 4.
Dendrogram, Nei’s genetic distance and PCoA analysis for different populations using different markers
The NJ tree results indicated no geographical clustering, with the grouping being random and primarily forming two genetically distinct groups. Mudigere and Coorg in the South, along with Barapani in the East zones, formed one group, while populations from the North, Central, and West zones clustered separately (Fig. 4a). This finding was further supported by INDEL and SSR data, which showed a bootstrap value of 99. The dataset for SNP analysis differed because we selected five subpopulations from each zone/population for ddRAD sequencing, resulting in a total of 25 subpopulations. To validate the results, we included four individuals from each subpopulation, bringing the total to 100 individuals across all 25 subpopulations. This comprehensive approach provided strong support for several branches of the phylogenetic tree (Fig. 4b and c).
Dendrogram and genetic structure of five different populations with 25 subpopulations and branch colors indicate bootstrap support. (a). SNPs markers blue color indicating North zone subpopulations (DL1, DL3, DL4, DL5, DL6) with south zone subpopulations (KA5). Black colour indicating grouping of central zone subpopulations (NA3, NA4, NA5, MA2) with south (KA1) zone subpopulation. Red colour indicating south zone subpopulations (KA2, KA4, KA6) is grouping with East zone sub populations (WB1, WB2, WB3). Pink colour completely grouping of west zone populations (J12, J21, J31, J32, GN2) and Grey colour contains one subpopulation from west (J11) and one subpopulation from central (NA6) boot strap value = 100 (b). INDELs marker boot strap value = 99 (c). SSRs marker boot strap value = 99.
The largest Nei’s genetic distance in SNPs between two conspecific populations was 0.511, occurring between the North and West zone populations. The shortest genetic distance was observed between the South and East zone populations (0.171), indicated that populations with many similar alleles had small genetic distances. The pairwise population matrix of Nei’s genetic identity was highest between the South and East zones (0.868), showed a high proportion of identical genes in these populations. In contrast, it was lowest between the West and Central zones (0.600), indicated a higher proportion of dissimilar genes (Table 8), and similar patterns were observed in INDELs and SSRs markers (Table 9). Based on Nei’s genetic distance values, we constructed a Principal Coordinate Analysis (PCoA) for the three different markers, and the cumulative variance explained by the first two axes confirmed the similarities between the South and East zone populations despite their geographical distance (Fig. 5; Table 9).
Isolation by distance (IBD) for different populations using different markers
Isolation By Distance (IBD) analysis using SNPs, seven INDELs, and ten SSR markers revealed a positive correlation between geographic and genetic distances, with R² values of 0.317 (Rxy = 0.564, P = 0.35), 0.58 (Rxy = 0.763, P = 0.40), and 0.529 (Rxy = 0.728, P = 0.17), respectively (Fig. 6; Table 10). This indicates that genetic distance increases with geographic distance. Due to restricted gene flow, most genetic differentiation occurred in the South and East Zone populations. From the above analysis like genetic diversity parametetrs, AMOVA, Structure, Nei’s genetic distance, PCoA and Dendrogram construction are clearly showing major genetic differentiation present in the subpopulation of south and east zone so that when the South and East Zone populations were excluded, the analysis revealed negative correlations between geographic and genetic distances in the North, West, and Central zones. The results were as follows: SNPs (R² = 0.339, Rxy = − 0.184, P = 0.07), INDELs (R² = 0.326, Rxy = − 0.204, P = 0.05), and SSRs (R² = 0.530, Rxy = − 0.339, P = 0.06) (Fig. 7; Table 10).
Isolation by distance (IBD) for measuring geographical distance with genetic distance were calculated along with sum of squared deviations from the mean of X (SSx), sum of squared deviations from the mean of Y (SSy), sum of products of X and Y (SPxy), correlation coefficient between X and Y (Rxy) and probability (p) using different markers from 25 subpopulations.
Isolation by distance (IBD) using different markers excluding south and east zone populations for measuring geographical distance with genetic distance were calculated along with sum of squared deviations from the mean of X (SSx), sum of squared deviations from the mean of Y (SSy), sum of products of X and Y (SPxy), correlation coefficient between X and Y (Rxy) and probability (p) using different markers from 25 subpopulations.
Discussion
The results provided significant insights into the genetic diversity, structure, and gene flow of C. sexmaculata across different geographic zones in India. In this study, we sequenced 25 individuals from 5 zones of C. sexmaculata, identifying 304,784 SNPs and 46,583 INDELs from 1.28 GB of whole genome data, then reduced the marker load to 30,000 SNPs. These results aligned with reports by Yan et al.15,16,17, who assessed the genetic diversity of the Brown Marmorated Stink Bug (BMSB), Halyomorpha halys (Pentatomidae: Hemiptera), using a ddRAD sequencing approach across 12 countries on four continents. They generated 2 GB of raw sequencing data per individual, identifying 387,629 high-confidence SNPs from 389 adult BMSB individuals.
The number of alleles (Na) ranged from 1.5 in the South and East zones to 1.95 in the North zone, with effective alleles (Ne) ranging from 1.4 to 1.8 and Shannon’s Information Index (I) from 0.40 to 0.63. Observed heterozygosity (Ho) was higher than expected heterozygosity (He) in the North, West, and Central zones, but lower in the South and East zones. The presence of higher observed alleles indicated more diverse populations discriminated by INDELs and SSRs18. A higher Shannon information index showed a higher diversity level from genes and populations to whole species and ecosystems in INDELs and SSRs. Markers where observed heterozygosity was less than expected (Ho<He) indicated reduced heterozygotes, mainly from Wahlund’s effect on populations19. The Polymorphic Information Content (PIC) of a marker is a crucial factor in diversity analysis. Following the criteria of Botstein et al.20, none of the markers were observed to be highly informative (PIC > 0.5); 20% were found to be reasonably informative (0.25 < PIC < 0.5) and only 80% were slightly informative (PIC < 0.25) across the populations. Overall, in the North, Central, and East zone populations, higher observed heterozygosity compared to expected heterozygosity indicated excess outbreeding, leading to significant departures from HWE. In contrast, the East and South zone populations showed minimum deviation from HWE, primarily due to inbreeding and a higher number of homozygotes21.
Gene flow (Nm) for SNPs, INDELs, and SSRs ranged from 0.963 (West Zone) to 0.304 (East Zone). North, West, and Central populations had negative FIS values, which indicated excess outbreeding and high gene flow, while East and South zones had positive FIS values, due to the geographical barriers between subpopulations which suggested continuous inbreeding and restricted gene flow, enhances fixation of rare alleles in the event and led to high genetic differentiations among subpopulations. In all three markers (SNPs, INDELs, and SSRs), the East and South Zone populations shared similar alleles, while the remaining populations were fixed for different alleles. The FIS values were small but consistently negative, indicated an excess of heterozygotes. Gene flow between different geographical populations was limited, resulted in limited gene exchange and considerable genetic differentiation22. AMOVA showed significant genetic differentiation, with SNPs having 17% variation between populations and 83% within, and INDELs and SSRs showed 26% and 38% variation between populations, respectively. Excluding the South Zone population, reduced within-population variation to 60% and 47% for INDELs and SSRs, with the South Zone contributing 14% and 15% of the variation. These findings partially aligned with Leite et al.23, indicated that the high within-population diversity was due to the genetically diverse subpopulations in the South and East zones. The most appropriate K value for 100 individuals was ΔK = 3, formed three clusters. The first cluster included two populations from the South Zone and one from the East Zone, the second cluster had one population from the Central Zone, and the third cluster included populations from the North, South, and Central Zones. Similar clustering patterns were observed for INDELs at ΔK = 3 and SSRs at ΔK = 4. Our results partially corroborate with Palma et al.24, which suggested high levels of gene flow and low genetic differentiation between different populations, possibly indicated recent re-colonization. The similar topography across various regions and anthropological activities likely contributed to the high gene flow levels.
The NJ tree showed no geographical clustering, formed two genetically distinct groups: one with Mudigere and Coorg from the South and Barapani from the East, and the other with populations from the North, Central, and West zones. This grouping was supported by INDEL and SSR data. These results suggested that the genetic structure of C. sexmaculata in the South and East zones resulted from the complex interactions between various regional forces, such as differences in pest eradication programs, agricultural practices, climate, and geographical structure25,26. Due to the rarity of long-distance migration events, inbreeding among subpopulations in Mudigere, Coorg, and Barapani was prevalent27. The largest Nei’s genetic distance in SNPs was 0.511 between the North and West zone populations, while the shortest was 0.171 between the South and East zones, indicating many similar alleles. Principal Coordinate Analysis (PCoA) confirmed the similarities between the South and East zone populations despite their geographical distance. The North, West, and Central zone populations clustered homogeneously due to similar environments, which facilitated intraspecific gene flow and maintained genetic diversity among these populations18. IBD analysis using SNPs, seven INDELs, and ten SSR markers showed a positive correlation between geographic and genetic distances, with R² values of 0.317, 0.58, and 0.529, respectively, which indicated that the genetic distance increased with geographic distance. This indicated low genetic differentiation and high gene flow in these zones, with weak evidence for isolation by distance, as there was no significant correlation between geographic distance and genetic relatedness among populations19.
Conclusion
The present study on the population genetics and phylogeography of the six-spotted ladybird beetle, C. sexmaculata, provided significant insights into the genetic diversity, structure, and gene flow of this species across different geographic zones in India. The results revealed substantial genetic diversity within the populations, particularly in the South and East zones, which exhibited high levels of genetic differentiation and restricted gene flow. In contrast, the North, West, and Central zones demonstrated lower genetic differentiation and higher gene flow, which indicated a more homogeneous genetic structure. Analysis of molecular variance (AMOVA) highlighted significant genetic differentiation at both the inter- and intra-population levels, with a notable contribution from the South Zone population to overall genetic variation. STRUCTURE analysis identified three distinct genetic clusters, which supported the observed patterns of genetic differentiation and gene flow. The findings underscored the importance of considering genetic variability and population structure in the selection and implementation of biological control agents. The identified genetic clusters and the observed patterns of gene flow provided a foundation for future studies on the evolutionary pathways and ecological adaptations of C. sexmaculata. These insights would aid in the development of more effective and sustainable biological control strategies, contributing to the broader goals of bio-intensive pest management programs. This study represents the first report on the phylogeographic patterns, distribution, and migratory behavior of C. sexmaculata in India. The markers developed in this research would facilitate the analysis of population structure and genetic diversity in both C. sexmaculata and related Coccinellidae species.
Materials and methods
Sampling, DNA extraction, sequencing and PCR amplification
The adult C. sexmaculata were collected from five different localities belonging to various zones of India viz., Delhi (North), Nagpur (Central), Jorhat (East), Anand (West) and Bengaluru (South) from May to June 2019 (Table 11). Each adult was collected at least 3 m from the neighboring one to prevent over-representation of siblings from each locality. The samples were stored in 100% ethanol and frozen at − 70 °C. Genomic DNA was isolated from each individual separately by grinding in liquid nitrogen, using the method described earlier28,29.
Our study employed a restriction-site associated DNA sequencing (RADseq) due to no reference genome is required for sequencing and the degree of genome coverage can be adjusted by selecting various restriction enzymes, which uses restriction endonucleases (REs) to randomly sample the genome at locations adjacent to restriction-enzyme recognition sites. This, when coupled with Illumina sequencing, produces high coverage of homologous SNP (Single Nucleotide Polymorphism) loci. For samples 1–25, 100–1000ng of high-quality genomic DNA was completely digested using Sphl and Mluc1 in a 20–50 µl reaction volume (2.4 U per enzyme, New England Biolabs (NEB), Ipswich, MA, USA) and incubated at 37 °C for 90 min. Adapter ligation was done with Adapters A and B by changing their sticky ends for SphI and Mluc1, respectively. The dual-indexed primers designed were used for the reactions. After adding the indexed primers in PCR, the obtained libraries were pooled based on concentration (Qubit 2.0 fluorometer analysis) and concentrated in a SpeedVac (Eppendorf, Hamburg, Germany). A manual size selection was applied (a range between 450 and 550 bp, which corresponds to DNA fragment size of interest between 310 and 410 bp) in low-melting 1.5% agarose gel electrophoresis (Bio-Rad Laboratories, Hercules, CA, USA). The final libraries were quantified by Qubit 2.0 fluorometer (dsDNA kit, Thermo Fisher Scientific) and their quality was checked on a Fragment Analyzer system (DNA High Sensitivity kit, Agilent).
The sequencing quality of each sample was checked using FastQC v0.11.3 (Babrahm Institute Bioinformatics, Cambridgeshire, UK) in that we have used BAM (Binary Alignment Map) file with ‘Per base sequence quality’ module parameter. The dDocent version 2.2.10 (Babrahm Institute Bioinformatics, Cambridgeshire, UK) was used for quality filtering of sequence data for denovo assembly using the program Trim Galore! Version 0.6.5 (Babrahm Institute Bioinformatics, Cambridgeshire, UK) to find low quality bases (below quality score of 20) are removed from the beginning and end of reads and an additional sliding 5 bp window that will trim bases when the average quality score drops below 10. FreeBayes version 1.1 (University of Florida, Florida) used as variant loci calling tool to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms) and complex events, INDELs (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment with BAM files to separate SNPs from INDELs and developed Variant Call Format (VCF) file that contains all SNPs, INDELs, MNPs and complex events that are called in 90% of all individuals with a minimum quality score of 3030. Minor allele frequency is the frequency of rare alleles and their association with chromosomes leads to erroneous results so that by appliying MAF < 0.05% parameter in TASSEL version 5.0 (Cornell University, Ithaca, New York, USA.) we have reduced marker load. From Dindel, a web based program (Sanger institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK) we called small indels from ddRAD short sequence data. We extended MISA version 2.0 (Leibniz Institute of Plant Biochemistry, Germany) a computational tool assisting the development of microsatellite markers from this we have identified single sequence repeats (SSRs) from ddRAD sequence data. We utilized Primer3 version 4.1.0 (Whitehead Institute for Biomedical Research, Cambridge, UK) to design and select optimal primers for INDELs and SSRs. The selection was based on criteria such as GC content, melting temperature (Tm), primer size, and specified range parameters.
Polymerase chain reactions (PCR) were carried out in a 10 µl reaction mixture containing INDELs and SSRs amplification reaction were carried out in tube using 10 µl volume consisting of 25 ng ofDNA, 10x PCR buffer (10 mMTris, pH 8.4, 50 mMKCl, 1.8 mM MgCl2), 2 mMdNTPs (MBI, Fermentas, Lithuania, USA), 5 µmol each of forward and reverse primer and 3 U of Taq DNA polymerase (Bangalore Genei Pvt. Ltd., Bengaluru, India). The Polymerase chain reaction (PCR) was performed using thermal cycler (Applied Biosystem, Somerset, UK). Initial denaturation at the temperature of 94 °C for 5 min. Thirty five cycles of denaturation at temperature of 94 °C for 0.5 min, primer annealing at respective annealing temperature for 0.5 min and primer extension at 72 °C for 1 min. Final extension step was carried out at temperature of 72 °C for 8 min. To each PCR product of 10 µl was added with 1 µl 6x loading dye by using multichannel pipette (0.25% bromophenol blue; 40% sucrose;0.25% xylene cyanol FF). After adding the dye samples were loaded into individual wells with the help of 10 µl multichannel pipettes. In parallel, gel was loaded with 50 bp size reference ladders (Fermentas, Lithuania, USA). The electric supply was adjusted at 5 Volts/cm of run and the total duration of electrophoresis varied from 1.5 to 2.5 h. After optimum run of DNA fragments the gel slabs were visualized under UV transilluminator and documented in gel documentation system (Bio Rad Laboratories Inc., USA). Based on banding pattern of the gel, the lines were scored manually by visualizing as heterozygous and homozygous as their presence (1) and absence (0) on gel respectively31.
Data analysis
The level of polymorphism was tested with 10 SSRs and 7 INDELs markers by scoring all the individuals from five zones. Even though amount of INDELs and SSRs are smaller but enough to generate reliable results since SSRs and INDELs as DNA markers have advantages over many other markers as they are highly abundant and polymorphic, co-dominantly inherited, analytically simple and readily transferable. SSRs are reported to be more variable than other nuclear markers32. Therefore, these markers would be very useful to assess genetic variation among subpopulations of C. sexmaculata. Till date there has been no report on C. sexmaculata SSR markers. To isolate the SSRs from genomic DNA libraries, several approaches for creating SSR-enriched genomic libraries have been developed, with SSR selection either before or after genomic library construction33. For the SSRs and INDELs, mean number of alleles (Na), percentage of polymorphic loci (PPL), the mean unbiased estimates of genetic diversity (HE)34 and observed heterozygosity (Ho) were estimated using GenAlex 6 software35. The number of migrants (Nm), a measure of historical gene flow was estimated by the private allele method using GENEPOP version 4.7.5 with Hardy-Weinberg and Codominant parameter (Montpellier, France) (Raymond et al.. 1995)36. Nei’s minimum genetic distance was constructed with distance based parameter using GenAlex 6.5 (Australian National University, Canberra, Australia). To determine the genetic relationships among these populations, a NJ dendrogram based on Nei’s genetic distance by carrying out 1000 bootstrap replications using relatedness parameter from TASSEL version 5.0 (Cornell University, Ithaca, New York, USA.). Data were subjected to a hierarchical analysis of molecular variance (AMOVA) as described by (Excoffier et al.. 2005)37 using three hierarchical levels; individual, population and inter-population level. The analysis was performed using genetic structure parameter in Arlequin version 2.037.
Structure analysis
The population structure, a crucial aspect of our study, was examined using the Bayesian clustering algorithm STRUCTURE 2.3.138 to estimate the number of potential genetic clusters (K). The program STRUCTURE, a key tool in our research facilitates testing the veracity of the hypotheses about the number of populations by calculating the probability of the data for each hypothesis. The program assumes that the markers are not in linkage disequilibrium and that they exhibit co-dominance. We ran STRUCTURE for 100,000 steps after a burn-in period of 100,000 steps with 20 replicate runs for each value of K (1–15). Data set for structure analysis of SNPs were whole SNPs sequencing data and to validate further in case of INDELs and SSRs we have selected after DINDEL and MISA selection polymorphic markers to able better co-dominant in nature to resolve hetereozygous subpopulations efficiently. The population structure was analyzed assuming admixture in the population in a correlated allele frequency model. Taking results from the STRUCTURE output file, the number of true clusters in the data (K) was determined using Structure Harvester version 0.6.9339, which identifies the optimal K based on the posterior probability of the data for a given K and the ΔK40. Isolation by Distance (IBD) was calculated using GenAlex 6.5 (Australian National University, Canberra, Australia. For this, the genetic distance matrix was calculated using a computer program, Genetic data analysis and numerical tests GDA version 1.1 (Grosshansdorf, Germany)41 Geographic Distance Matrix Generator Version 1.2.3 (Central Park, New York) using Full NxN Matrix Parameter42.
Data availability
Data availability:“All data generated or analyzed during this study are included in this article. The data available in the online database with the following links- https://datadryad.org/stash/share/DTr4vs62OOFL4uoLjuPFUtKXr2nAzBfelwauphte-tc and https://doi.org/10.5061/dryad.ghx3ffc07”.
References
Goolsby, J. A., Van Klinken, R. D. & Palmer, W. A. Maximising the contribution of native-range studies towards the identification and prioritisation of weed biocontrol agents. Aust J. Entomol. 45(4), 276–286 (2006).
Madeira, P. et al. Whether to release Oxyops vitiosa from a second Australian site onto Florida’s mela- leuca? A molecular approach. Biocontrol 46, 511–528 (2001).
Rauth, S. J., Hinz, H. L., Gerber, E. & Hufbauer, R. A. The benefits of pre-release population genetics: A case study using Ceutorhynchus Scrobicollis, a candidate agent of garlic mustard, Alliaria petiolata. Biol. Control. 56, 67–75 (2011).
Paterson, I. D. et al. Two in one: Cryptic species discovered in biological control agent populations using molecular data and crossbreeding ex- periments. Ecol. Evol. 6, 6139–6150 (2016).
Keerthi, M. C., Sharma, R. K., Suroshe, S. S. & Sinha, S. R. Ecological engineering in cauliflower for aphid management. Indian J. Agric. Sci. 90 (7), 1356–1358 (2020).
Agarwala, B. K. & Ghosh, A. K. Prey records of aphidophagous Coccinellidae in India. A review and bibliography. 1–14 (1988).
Omkar & Srivastava, S. Influence of six aphid prey species on development and reproduction of a ladybird beetle, Coccinella septempunctata. BioControl 48, 379–393 (2003).
Cheng, J., Kao, H. & Dong, S. Population genetic structure and gene flow of rare and endangered tetraena mongolica Maxim. Revealed by reduced representation sequencing. BMC Plant Biol. 20, 1–13 (2020).
Kang, J. H. et al. Population genetic structure of a recent insect invasion: A gall midge, Asynapta groverae (Diptera: Cecidomyiidae) in South Korea since the first outbreak in 2008. Sci. Rep. 13(1), 2812 (2023).
Ruiz-Montoya, L. et al. Mass-Rearing conditions do not always reduce genetic diversity: The case of the Mexican fruit fly, Anastrepha ludens (Diptera: Tephritidae). Insects 15(1), 56 (2024).
Avise, J. C. et al. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 18(1), 489–522 (1987).
Krafsur, E. S., Obrycki, J. J. & Harwood, J. D. Comparative genetic studies of native and introduced Coccinellidae in North America. Eur. J. Entomol. 102, 469–474 (2005).
Gregory, T. R., Nedvěd, O. & Adamowicz, S. J. C-value estimates for 31 species of ladybird beetles (Coleoptera: Coccinellidae). Hereditas 139(2), 121–127 (2003).
Schlötterer, C. The evolution of molecular markers- just a matter of fashion? Nat. Rev. Genet. 5(1), 63–69 (2004).
Agarwala, B. K. & Yasuda, H. Overlapping oviposition and chemical defense of eggs in two co-occurring species of ladybird predators of aphids. J. Ethol. 19(1), 47–53 (2001).
Pervez, A. Predaceous coccinellids in India: Predator-prey catalogue (Coleoptera: Coccinellidae). Orient. Insects 38(1), 27–61 (2004).
Yan, J. et al. ddRAD sequencing: An emerging technology added to the biosecurity toolbox for tracing the origin of brown marmorated stink bug, Halyomorpha halys (Hemiptera: Pentatomidae). BMC Genom. 22(1), 1–15 (2018).
Meng, L., Wang, Y., Wei, W. H. & Zhang, H. Population genetic structure of Diaphorina citri Kuwayama (Hemiptera: Liviidae): Host-driven genetic differentiation in China. Sci. rep. 8(1), 1–15 (2018).
Cordeiro, E. M., Campbell, J. F., Phillips, T. & Akhunov, E. Isolation by distance, source-sink population dynamics and dispersal facilitation by trade routes: Impact on population genetic structure of a stored grain pest. G3 Genes Genomes Genet. 9(5), 1457–1468 (2019).
Botstein, D., White, R. L., Skolnick, M. & Davis, R. W. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32 (3), 314–331 (1980).
Wood, C. W., Donald, H. M., Formica, V. A. & Brodie III E. D. Surprisingly little population genetic structure in a fungus-associated beetle despite its exploitation of multiple hosts. Ecol. Evol. 3(6), 1484–1494 (2013).
Kronforst, M. R. & Fleming, T. H. Lack of genetic differentiation among widely spaced subpopulations of a butterfly with home range behaviour. Heredity 86(2), 243–250 (2001).
Leite, N. A. et al. Pan-american similarities in genetic structures of Helicoverpa armigera and Helicoverpa zea (Lepidoptera: Noctuidae) with implications for hybridization. Environ. Entomol. 46(4), 1024–1034 (2017).
Palma, J., Maebe, K., Guedes, J. V. C. & Smagghe, G. Molecular variability and genetic structure of Chrysodeixis includens (Lepidoptera: Noctuidae), an important soybean defoliator in Brazil. PLoS One 10(3), e0121260. (2015).
Kumar, R., Suroshe, S. S., Venkanna, Y., Keerthi, M.C., et al. Feeding potential and foraging behaviour of cheilomenes sexmaculata (F.) on cotton whitefly, Bemisia tabaci (Gennadius). Int. J. Trop. Insect Sci. 44, 1609–1623 (2024).
Reddy, P. P. Agro-ecological Approaches to pest Management for Sustainable Agriculture 29–41 (Springer Singapore, 2017).
Khatchikian, C. E., Foley, E. A., Barbu, C. M., Hwang, J., Ancca-Juárez, J., Borrini-Mayori,K. & Chagas Disease Working Group in Arequipa. Population structure of the Chagas disease vector Triatoma infestans in an urban environment. PLOS Negl. Trop. Dis. 9(2), e0003425 (2015).
Kim, S. H. et al. Evaluation of DNA extraction methods and their clinical application for direct detection of causative bacteria in continuous ambulatory peritoneal dialysis culture fluids from patients with peritonitis by using broad-range PCR. Ann. Lab. Med. 32(2), 119–125 (2012).
Rakshith, H. S., Suroshe, S. S., keerthi, M. C., chander, S. & Roy, D. DNA isolation from six spotted Ladybird Beetle, Cheilomenes sexmaculata (F). Indian J. Entomol. 0, 1–5. https://doi.org/10.55446/IJE.2024.2155 (2024).
Jain, A. et al. InDel markers: An extended marker resource for molecular breeding in chickpea. PLoS One 14(3), e0213999 (2019).
Arunkumar, K. P. et al. Genetic diversity and population structure of Indian golden silkmoth (Antheraea Assama). PLoS One 7(8), e43716. https://doi.org/10.1371/journal.pone.0043716 (2012).
Ma, K. H. et al. Isolation and characteristics of eight novel polymorphic microsatellite loci from the genome of garlic (Allium sativum L). Sci. Hort. 122, 355–361 (2009).
Li, X. et al. SSR markers development and their application in genetic diversity evaluation of garlic (Allium sativum) germplasm. Plant. Divers. 44, 481–491 (2022).
Nei, M. Analysis of gene diversity in subdivided populations. Proc. Natl. Acad. Sci. 70 (12), 3321–3323 (1973).
Peakall, R. O. D. & Smouse, P. E. GENALEX 6: Genetic analysis in Excel. Population genetic software for teaching and research. Mol. Ecol. Notes 6 (1), 288–295 (2006).
Raymond, M. GENEPOP (version 1.2): Population genetics software for exact tests and ecumenicism. J. Hered. 86, 248–249 (1995).
Excoffier, L., Laval, G. & Schneider, S. Arlequin (version 3.0): An integrated software package for population genetics data analysis. Evol. Bioinform. 1, 117693430500100003 (2005).
Pritchard, J. K., Stephens, M., Rosenberg, N. A. & Donnelly, P. Association mapping in structured populations. Am. J. Med. Genet. 67 (1), 170–181 (2000).
Earl, D. A. & VonHoldt, B. M. STRUCTURE HARVESTER: A website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv. Genet. Resour. 4, 359–361 (2012).
Evanno, G., Regnaut, S. & Goudet, J. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Mol. Ecol. 14 (8), 2611–2620 (2005). To test for significant isolation by distance, Mantel’s test (Mantel, 1967) Mantel, N. The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209–220 (1967).
Lewis, P. O. Genetic data analysis: Computer program for the analysis of allelic data. http://hydrodictyon.eeb.uconn.edu/people/plewis/software.php (1999).
Ersts, P. J. Geographic distance matrix generator (version 1.2. 3). American Museum of Natural History. Cent. Biodivers. Conserv. 1, 1–4 (2011).
Acknowledgements
The first author acknowledges the Senior Research Fellowship received from RGNF-NFST, MoTA, New Delhi and Authors also acknowledge facilities received from ICAR-IARI, New Delhi, Govt. of India.
Author information
Authors and Affiliations
Contributions
SSS, AUS, GS: Conceptualization, Methodology, Resources; HSR, SGS: Investigation, Writing—Original Draft; HSR, NB, HK, RKE: Data Curation; KMC & SSS: Data Curation, Writing—Review & Editing. All authors have approved the final version of the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Rakshith, H.S., Suroshe, S.S., Solanke, A.U. et al. Genetic diversity of the zigzag ladybird beetle, Cheilomenes sexmaculata (F.) (Coleoptera: Coccinellidae) with its distribution in India and implications for biological control. Sci Rep 15, 15629 (2025). https://doi.org/10.1038/s41598-025-90794-7
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-025-90794-7