Introduction

Genotyping-by-sequencing (GBS) is a powerful technique for efficiently identifying single-nucleotide polymorphisms (SNPs) across large numbers of samples, enabling genome-wide or tagged genetic analysis1,2. Restriction enzymes are used to cut DNA into specific fragments, which are then tagged with unique barcodes so that their origin can be identified after pooling for sequencing3,4. Next-generation sequencing is then used to sequence the DNA fragments, and the barcode information is used to identify and analyze specific SNP variations within the sequenced data5,6. GBS is a cost-effective method for identifying genes linked to disease resistance, yield, and complex traits, as well as for studying genetic variation within and between populations and assessing gene expression5. GBS employs restriction enzymes to fragment DNA, targeting specific regions while avoiding repetitive sequences, and attaches unique barcodes to the fragments for identification after pooling7,8. The workflow involves fragmenting DNA with restriction enzymes, adding barcodes, performing PCR amplification, sequencing, and analysis to identify SNPs, making GBS a valuable tool for plant and animal breeding, genetic studies, and genome-wide association studies9,10. The choice of restriction enzymes (ApeKI, PstI, and EcoRI) in cotton GBS studies has been influenced by their ability to target specific DNA regions and avoid repetitive sequences, with EcoRI being less commonly used owing to its less frequent cutting11,12. The selection of restriction enzymes in GBS, such as ApeKI with its recognition sequence GCWGC (W = A or T), is based on factors such as genome coverage, fragment size, and GC content to optimize the analysis in cotton genotyping studies2,13,14. ApeKI is commonly used in GBS because of its frequent cutting, suitable fragment size, and low cost, resulting in good genome coverage and informative marker distribution15.
GBS employs adaptors with specific sequences for different sequencing platforms, including RE overhang, spacer and sequencing primer sites, to facilitate DNA fragment ligation and sequencing initiation, while barcodes are used to identify and differentiate between samples during the workflow16,17,18,19,20. Barcodes are unique nucleotide sequences assigned to samples to prevent misidentification during GBS, whereas SNPs are single-nucleotide variations within a genome, such as a change from C to G in a DNA sequence, occurring at a frequency of approximately one per 1000 nucleotides1,21,22,23. Millions of SNPs exist in complex organisms such as humans, with some influencing disease susceptibility and traits, such as plant response to pesticides or disease resistance. Identifying these SNPs in polyploid cotton can help link them to diseases and yield traits, enabling the selection of resistant genotypes with specific SNP alleles24,25,26. SNPs offer more detailed genetic resolution than markers such as SSRs do, enabling the precise identification of genes linked to traits such as disease resistance and yield. High-throughput sequencing has made SNP analysis cost-effective, facilitating the discovery of whitefly resistance-related SNPs and the identification of nearby candidate genes involved in defense mechanisms27,28,29. Identifying whitefly resistance-related SNPs through GBS enables marker-assisted selection of resistant genotypes, reduces field exposure and pesticide dependency, and paves the way for developing genetically resilient cotton varieties30. QTLs are specific chromosomal regions containing multiple genes that contribute to traits such as whitefly resistance and are identified through statistical analysis of genetic markers and phenotypic data from crosses between resistant and susceptible parents, revealing the complex genetic architecture of resistance31,32,33. 
QTL mapping identifies chromosomal regions containing multiple genes involved in complex traits such as whitefly resistance, aiding in the development of marker-assisted selection programs for breeding resistant cotton varieties by utilizing DNA markers linked to these QTLs34,35. Integrating SNP-based association mapping, GBS, and QTL mapping provides a comprehensive understanding of whitefly resistance genetics in cotton, with QTL mapping being crucial for developing resistant varieties because of the absence of a complete SNP-mapped QTL reference map36. GBS can identify whitefly resistance genes by comparing resistant and susceptible genotypes, and SNP markers can help elucidate the function of genes such as Mi-1.2 in cotton defense. Studying gene expression in whitefly-infested cotton can further reveal resistance mechanisms37,38,39. Comparing upregulated genes in resistant and susceptible cotton genotypes during whitefly infestation, along with identifying resistance markers using GBS, allows candidate genes to be validated and additional resistance genes and pathways to be discovered, which can lead to new, highly resistant cotton varieties with reduced pesticide dependency30,40. Continued research on whitefly resistance in cotton aims to develop long-lasting, whitefly-immune varieties and accessible DNA markers for large-scale breeding programs, leading to a more sustainable cotton production system41.

Materials and methods

First, during the cropping season of 2021–22, two resistant parents (CA-12 and AGC-555) and two susceptible parents (GOMAL-105 and SLS-87/175) were screened for whitefly resistance and susceptibility42. In the next cropping season, 2022–23, these selected parents were crossed in two combinations43: CA-12 with GOMAL-105 and AGC-555 with SLS-87/175. After that, the F1 generation was grown in a glasshouse to obtain seeds for growing the F2 generation in the next cropping season44. In the 2023–24 cropping season, the F2 generation was grown in the field and evaluated for phenotypic and genotypic data. The F2 generation obtained from the cross between CA-12 and GOMAL-105 was subjected to further studies, whereas the F2 generation obtained from the cross between AGC-555 and SLS-87/175 was discarded because more than 50% of its plants were sterile45,46. The F2 germplasm was cultivated on 200 m² at the research farms of Patron Seeds (71°21’30” E, 29°59’30” N). Cotton beds measuring 60 cm in length with 30 cm furrows were prepared via a bed planter. Seeds were manually sown (Chopa method) in a zigzag pattern along both edges of the beds at a spacing of 30 cm per plant47. Since non-delinted cottonseeds were used, 3–4 seeds were sown per chopa to increase germination rates48. Following sowing, standard agronomic practices were implemented to ensure optimal crop growth, except for pesticide application for whitefly control49. Young leaves from 30-day-old plants were collected, frozen, and stored for DNA extraction via the CTAB method to identify genetic variations50. Phenotypic data were recorded as follows: Days to First Flower (DTF) data were collected daily from 35 days after sowing until 47 days post-sowing51.
Data regarding Flowers/plant (FP) were collected daily over a period of three weeks, commencing 35 to 47 days following sowing, for all genotypes52. We collected morphological data for Nodes to 1st monopodia (NTM), Monopodia/Plant (MP), Sympodia/plant (SP), Leaf Length (LL), Leaf Width (LW), and Petiole Length (PL) 90 days after sowing. Additionally, we calculated Leaf Area (LA) using the Grid Method of Leaf Area Measurement53. The tolerance data for whitefly adults and nymphs, commonly known as the Whitefly count (WC), were first taken from the field 60 days after sowing. The data were obtained three times, each with a 30-day interval. Whitefly data were acquired by randomly selecting three plants and counting whitefly adults and nymphs on the upper leaf of the first plant, the middle leaf of the second plant, and the lower leaf of the third plant54. The data for plant height (PH) and yield parameters were recorded at maturity, including Bolls/plant (BP), Boll weight (BW), and Yield/plant (YP)55. Fibre attributes were calculated following crop picking, including seed index (SI), lint index (LI), ginning outturn (GOT), and staple length (SL)56. DNA purity was assessed via spectrophotometry and gel electrophoresis, and the DNA samples were subsequently sent for GBS analysis following the protocol of Elshire57 (USA)58,59. Library preparation for GBS was carried out using high-quality DNA extracted from plant tissues, and the ApeKI enzyme, with the GCWGC recognition site (W = A or T), was used to cut the DNA14,50,57,60. ApeKI was selected for its frequent cutting, suitable fragment size and low cost. Adapters carrying primer sites and barcodes were ligated to the DNA fragments for sample identification and PCR amplification15. Sequencing was performed on Illumina platforms using 50–150 bp reads, with sequencing-by-synthesis chemistry detecting fluorescently labeled nucleotides to generate raw FASTQ data61,62.
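The restriction step can be illustrated with a minimal in silico digest. This is only a sketch, not the laboratory protocol: it assumes the ApeKI site GCWGC (W = A or T) with the cut placed after the first G (G^CWGC), and the function name and toy sequence are hypothetical.

```python
import re

# ApeKI recognition site GCWGC, where the IUPAC code W means A or T.
# A lookahead is used so that overlapping sites are all reported.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")


def apeki_fragments(sequence):
    """Return the fragments left after cutting `sequence` at every ApeKI site."""
    seq = sequence.upper()
    # Assumed cut position: one base into each site (G^CWGC).
    cuts = [match.start() + 1 for match in APEKI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[start:end] for start, end in zip(bounds, bounds[1:])]


# Toy sequence (hypothetical) containing one GCAGC and one GCTGC site.
toy = "AAAGCAGCTTTTGCTGCGGG"
print(apeki_fragments(toy))  # ['AAAG', 'CAGCTTTTG', 'CTGCGGG']
```

Running such a digest over a reference genome gives a rough expectation of fragment number and size distribution before a library is prepared.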
After sequencing, barcodes were used to identify sample origins, and reads were aligned to a reference genome to identify SNPs and other genetic variations6,63,64,65. Bioinformatics tools were used to identify and confirm SNPs across the samples, enabling the analysis of genetic diversity, population structure, and trait associations within the cotton germplasm66. Raw sequence data from GBS, consisting of millions of short DNA reads (50–150 bp) in FASTQ format, were processed via bioinformatics tools to identify barcodes and differentiate reads from different pooled cotton samples67,68. Fragment size distribution was measured, and GBS data were aligned to a reference cotton genome to identify SNPs and other genetic variations69,70,71. Comparing identified SNPs with phenotypic data helps identify those associated with yield-related traits and whitefly resistance. Aligning SNPs to the genome determines their exact location and predicts their functions, whereas variation in alignment across samples reveals individual differences. Specialized software is used for accurate GBS data alignment because of the complexity of the cotton genome72,73,74. The Burrows-Wheeler Aligner (BWA) was used to align GBS data, followed by quality control using variant calling tools to identify and filter low-quality SNPs, which were then annotated to understand their potential functions75,76. Annotation tools associate SNPs with specific genomic regions, including coding, regulatory, and noncoding regions, to predict their potential impact on gene expression, protein function, and cellular processes, aiding in the identification of disease-related SNPs and understanding their functional significance77,78,79.
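The barcode-based separation of pooled reads can be sketched as follows. This is an illustrative simplification, not the TASSEL-GBS implementation: it assumes exact-match inline barcodes at the start of each read, and the barcodes, sample names and reads are hypothetical.

```python
def demultiplex(reads, barcode_to_sample):
    """Group reads by their leading barcode and trim the barcode off."""
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    # Try longer barcodes first so that a short barcode which happens to be
    # a prefix of a longer one cannot claim the read.
    barcodes = sorted(barcode_to_sample, key=len, reverse=True)
    for read in reads:
        for barcode in barcodes:
            if read.startswith(barcode):
                by_sample[barcode_to_sample[barcode]].append(read[len(barcode):])
                break
        else:
            # No barcode matched: keep the read aside rather than guess.
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical barcodes and reads for two F2 plants.
barcodes = {"ACGT": "plant_1", "TTGCA": "plant_2"}
reads = ["ACGTCAGCAAA", "TTGCACTGCTT", "GGGGCAGCAAA"]
assigned, leftover = demultiplex(reads, barcodes)
print(assigned)
print(leftover)
```

Production pipelines usually also tolerate sequencing errors in the barcode and check for the enzyme overhang after it; both are omitted here for brevity.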
Annotation in GBS adds biological context to SNPs, pinpointing their exact location and linking them to specific genes to understand their potential impact on phenotypes and diseases, aiding in prioritizing SNPs for further analysis and improving reference genomes and annotation databases80,81,82,83. Bioinformatics tools such as TASSEL-GBS, TASSEL, JOINMAP, Win QTL Cartographer, and MAPCHART were used for preprocessing, quality control, demultiplexing, alignment, variant calling (SNP detection), and downstream analysis, including QTL detection84,85,86. TASSEL is a bioinformatics tool used for analyzing GBS data, including SNP calling, database building, barcode processing, read alignment, SNP discovery, quality filtering, population structure analysis, GWAS, and MAS, and requires computational resources for large datasets87,88,89. TASSEL-GBS, a specialized tool for plant breeding, uses SNP calling algorithms to identify SNPs from aligned reads in its database, with alignment to a reference genome being a crucial step in the SNP discovery process90,91. TASSEL-GBS uses its own internal alignment engine for SNP calling, unlike other pipelines, which offer choices such as BWA, which is a tool for aligning reads to a reference genome92. SNP filtering was performed via TASSEL on the basis of minor allele frequency, missing data rate, coverage depth, segregation distortion, and heterozygosity to improve data quality and reduce the computational burden. The remaining 8,889 SNPs were imputed via LD-kNNi in TASSEL, with imputation accuracy assessed by masking known genotypes and comparing them to predicted alleles93. Highly filtered SNP data in VCF format were used to construct a linkage map by calculating recombination frequencies between linked markers via algorithms such as the Kruskal‒Wallis test and regression, which were used to assess their separation during meiosis94,95,96. 
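The per-SNP filtering criteria named above (minor allele frequency, missing data rate and heterozygosity) can be sketched in a few lines. The thresholds below are hypothetical placeholders, not the values used in this study, and genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with None for a missing call.

```python
def snp_passes(genotypes, min_maf=0.05, max_missing=0.2, max_het=0.8):
    """Return True if one SNP passes MAF, missing-rate and heterozygosity filters.

    All three thresholds are illustrative assumptions.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    # Each called diploid genotype contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    het_rate = sum(1 for g in called if g == 1) / len(called)
    return maf >= min_maf and missing_rate <= max_missing and het_rate <= max_het


# Toy genotype vectors for three hypothetical SNPs across ten plants.
print(snp_passes([0, 1, 2, 1, 0, None, 1, 2, 0, 1]))  # True
print(snp_passes([0] * 10))                           # False (monomorphic)
print(snp_passes([None] * 8 + [0, 2]))                # False (too much missing data)
```

Coverage depth and segregation distortion would be handled analogously, with one extra condition per criterion.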
JOINMAP was used to construct a linkage map via high-quality SNP data and pedigree information; this map represents the linear order of markers within linkage groups on the basis of recombination frequencies; facilitates assessments of map length, marker distribution, and potential genotyping errors; and serves as a foundation for QTL studies and marker-assisted selection97,98,99. A high-density linkage map was constructed using recombination frequencies (Kruskal‒Wallis test) and visualized (MapChart), enabling marker-assisted selection.

WinQTL Cartographer 2.5_011 (https://brcwebportal.cos.ncsu.edu/qtlcart/WQTLCart.htm) was used to perform composite interval mapping (CIM) analysis on a genetic linkage map (constructed from SNP information) to identify QTLs associated with the traits of interest, utilizing phenotypic data100,101. The LOD threshold for each trait was determined via a 1000-iteration permutation procedure38,102. QTLs were designated via a naming convention (e.g., qPH for plant height, qFP for flowers per plant, and qWC for whitefly count) and prioritized for further analysis on the basis of their highest logarithm of odds (LOD) scores and phenotypic variances (R2). Genes within the identified quantitative trait loci (QTLs) were positioned on the basis of genetic distances on the map, followed by gene mining with Cotton FGD (https://cottonfgd.org/), which identified 377 candidate genes (detailed in Supplementary Tables 1 and 2). These candidates were prioritized for whitefly resistance via TAIR (https://www.arabidopsis.org/), and relevant literature was used to assess their predicted functional roles103. The Mi-1.2 gene in Solanum lycopersicum, which was identified as a key factor in tomato whitefly resistance, suggested that its cotton orthologs may also play a role in whitefly resistance104,105. The mRNA sequence of the Mi-1.2 gene (NM_001247134.1) was used in a BLASTX search against the TAIR 10 protein database to identify homologous genes in A. thaliana, followed by a similar BLASTX search of 211 G. hirsutum genes to identify homologs of known whitefly resistance genes.
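The permutation procedure for setting a LOD threshold can be sketched as below. This is a deliberately simplified illustration that uses single-marker additive regression in place of composite interval mapping, so it does not reproduce the WinQTL Cartographer computation; the function names, genotype coding (0, 1, 2) and toy data are all assumptions.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD for a single-marker additive regression: (n/2) * log10(RSS0 / RSS1)."""
    n = len(phenotypes)
    g_mean = sum(genotypes) / n
    p_mean = sum(phenotypes) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    syy = sum((p - p_mean) ** 2 for p in phenotypes)
    sxy = sum((g - g_mean) * (p - p_mean) for g, p in zip(genotypes, phenotypes))
    if sxx == 0 or syy == 0:
        return 0.0
    # Residual sum of squares under the marker model; clamp to avoid log(0).
    rss1 = max(syy - sxy ** 2 / sxx, 1e-12 * syy)
    return (n / 2) * math.log10(syy / rss1)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD under the null."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        # Shuffling phenotypes breaks any genotype-phenotype association
        # while preserving the correlation structure among markers.
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Toy data: 5 hypothetical markers scored on 40 plants, random phenotype.
rng = random.Random(42)
markers = [[rng.choice([0, 1, 2]) for _ in range(40)] for _ in range(5)]
phenotypes = [rng.gauss(0, 1) for _ in range(40)]
print(permutation_threshold(markers, phenotypes, n_perm=200))
```

A QTL peak is then declared significant only if its LOD exceeds the threshold obtained for that trait.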

Results

Genetic linkage map

A total of 3375 SNPs were identified in the F2 population via GBS, with 1793 SNPs selected on the basis of missing data and heterozygosity. After 996 SNPs were converted to the ABH genotype, a linkage map was constructed via JOINMAP 4.0 software106. The groups were established with a minimum logarithm of odds (LOD) threshold of 10.0, and marker orders were estimated via the regression mapping algorithm. The recombination fractions were converted to map distances via the Kosambi mapping function107. The strongest cross-link (SCL) information in combination with the known mapped locations of the ungrouped SNPs was used to merge these markers with the clearly identified linkage groups of G. hirsutum108.
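For reference, the Kosambi mapping function converts a recombination fraction r into a map distance d = 25 ln[(1 + 2r)/(1 − 2r)] cM. A minimal sketch follows; the function name is hypothetical and the snippet is independent of JoinMap.

```python
import math


def kosambi_cm(r):
    """Convert a recombination fraction r (0 <= r < 0.5) to map distance in cM."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


for r in (0.01, 0.10, 0.30):
    print(r, round(kosambi_cm(r), 2))
```

For small r the distance is close to 100r cM, and it grows without bound as r approaches 0.5, which reflects the function's allowance for crossover interference and multiple crossovers.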

These 996 SNPs were analyzed via JOINMAP 4.0 software. The linkage group output and map position were subsequently determined through MapChart 2.2 for Windows, and the process was used to compute and display 3-D map-based marker distances109,110. One hundred thirty (130) SNPs were distributed in 19 linkage groups. The remaining 866 SNPs were not linked and were excluded from the maps or linkage groups. The LGs were numbered (1–19) on the basis of the assigned chromosome numbers111.

The LODs of the markers/loci ranged from 2 to 10. The basic information of the linkage groups (LGs) is presented in Table 1. The current map spanned only 3213.3 cM, with an average marker density of 2.28 cM112. The genetic length of the LGs ranged from 10.1 cM (Chr/LG A3, Chr/LG A13) to 1265 cM (Chr/LG D06). On average, one linkage group presented approximately 38 SNP markers that covered an average of 34.2 cM. The linkage group with the most markers was Chr/LG D06, which had 38 markers and a marker density of 0.03 markers/cM. In contrast, linkage groups A09, A12, A13, D02, D07, D09, D10 and D12 each had the lowest number of SNP markers (only 2; Table 1; Fig. 1).
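The per-group summary statistics reported in Table 1 can be derived from marker positions as sketched below. This is an illustrative calculation only: the marker positions are made up, and the dictionary keys simply mirror the column names of Table 1.

```python
def linkage_group_summary(positions_cm):
    """Summarize one linkage group from its marker positions (in cM)."""
    pos = sorted(positions_cm)
    length = pos[-1] - pos[0]
    # Distances between adjacent markers along the group.
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    return {
        "Marker_num": len(pos),
        "Map_Len_cM": length,
        "Marker_Density": len(pos) / length if length > 0 else None,
        "Ave_Interval_cM": sum(gaps) / len(gaps) if gaps else None,
        "Max_Gap_cM": max(gaps) if gaps else None,
    }


# Hypothetical positions for a three-marker linkage group.
print(linkage_group_summary([0.0, 5.0, 20.0]))
```

Reporting density as markers per cM and interval as cM per marker side by side makes it easier to spot groups, such as two-marker groups, where a single large gap dominates the map length.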

Table 1 Summary statistics of the genetic linkage map of G. hirsutum. This table provides an overview of the genetic linkage map constructed for G. hirsutum. Each row represents a specific linkage group (chromosome), and the columns provide the following information: group: the chromosome identifier. Marker_num: the number of markers mapped to the chromosome. Map_Len(cM): the total genetic length of the chromosome in centimorgans (cM). Marker_Density (markers/cM): the number of markers per centimorgan. Ave_Interval (cM): the average distance between adjacent markers on the chromosome. Max_Gap(cM): the maximum genetic distance between any two adjacent markers on the chromosome.
Fig. 1

Genetic linkage map of tetraploid cotton. (A) Chromosomes are ordered vertically with the D subgenome (orange, left) and A subgenome (green, right). (B) Scale bar indicates genetic distance in cM. (C) Filled circles represent molecular markers with positions determined by recombination frequency. (D) Absence of A02, A04, and A08 chromosomes reflects insufficient marker coverage for these linkage groups.

QTL analysis

Five QTLs for Nodes to 1st Monopodia were identified on chromosomes 1, 7, 13, 19 and 21 and designated qNTM1_A01, qNTM2_A07, qNTM3_A13, qNTM4_D06, and qNTM5_D08, respectively113,114,115,116 (Table 2). QTLs for whitefly count (qWC1_A04), flowers per plant (qFP1_D09), nodes to 1st monopodia (qNTM5_D08), and plant height (qPH1_A09) were identified on chromosomes A04, D09, D08 and A09, respectively, with LOD scores of 4.980, 5.944, 11.560 and 6.202 and phenotypic variances (R2) of 0.250, 0.298, 0.263 and 0.211, respectively117,118 (Fig. 2). The QTLs qWC1_A04 and qFP1_D09 were further investigated to identify genes associated with whitefly resistance and flowers per plant, respectively (Fig. 3).

Table 2 Quantitative trait loci (QTLs) associated with yield and fiber quality traits in Gossypium hirsutum. This table presents a comprehensive list of QTLs identified for yield and fiber quality traits in G. hirsutum. Each QTL is characterized by the following information: chr: chromosome number; position (cM): the genetic position of the QTL in centimorgans; start bp: the starting base pair position of the QTL; end bp: the ending base pair position of the QTL; No. of genes: the number of genes within the QTL interval; LOD: the logarithm of the odds ratio, indicating the strength of the QTL effect; R2: the proportion of phenotypic variance explained by the QTL; Additive_effect: the additive effect of the QTL allele; Dominant_effect: the dominant effect of the QTL allele. Abbreviations: qNTM1_A01, nodes to 1st monopodia 1; qWC1_A04, whitefly count; qNTM2_A07, nodes to 1st monopodia 2; qDFF1_A07, days to 1st flowering; qSI1_A07, seed index; qPH1_A09, plant height 1; qMP1_A09, monopodia per plant 1; qWC2_A09, whitefly count 2; qNTM3_A13, nodes to 1st monopodia 3; qLA1_A13, leaf area 1; qGOT1_A13, ginning outturn 1; qSL1_D04, staple length 1; qPH2_D06, plant height 2; qNTM4_D06, nodes to 1st monopodia 4; qSL2_D06, staple length 2; qSI2_D06, seed index 2; qNTM5_D08, nodes to 1st monopodia 5; qLA2_D08, leaf area 2; qLI1_D08, lint index 1; qFP1_D09, flowers per plant 1.
Fig. 2

This figure presents a genetic map of cotton (Gossypium hirsutum) for chromosomes A01, A04, A07, A09, A13, D04, D06, D08 and D09. The map was constructed via MapChart. The following markers associated with specific traits are highlighted. Chromosome A01: black marker, nodes to 1st monopodia 1 (NTM1). Chromosome A04: purple marker, whitefly count (WC). Chromosome A07: yellow marker, nodes to 1st monopodia 2 (NTM2); pink marker, days to 1st flowering (DFF); green marker, seed index (SI). Chromosome A09: blue marker, monopodia per plant 1 (MP1); pink marker, plant height 1 (PH1). Chromosome A13: white marker, leaf area 1 (LA1); blue marker, nodes to 1st monopodia 3 (NTM3); yellow marker, ginning outturn 1 (GOT1). Chromosome D04: light green marker, staple length 1 (SL1). Chromosome D06: burgundy marker, seed index 2 (SI2); cyan marker, staple length 2 (SL2); olive marker, nodes to 1st monopodia 4 (NTM4). Chromosome D08: yellow marker, nodes to 1st monopodia 5 (NTM5); pink marker, leaf area 2 (LA2); olive marker.

Fig. 3

Genome-wide association study (GWAS) Manhattan plot of Cotton genome variants. The plot displays -log10 (p-values) of genetic associations across chromosomes (x-axis) with phenotypic traits of interest. Chromosomes are alternately colored for clarity (odd-numbered in blue, even-numbered in orange). The horizontal red line indicates the genome-wide significance threshold (p < 5 × 10^-8). Several peaks above the threshold suggest potential candidate loci for further investigation. Note: Some chromosomal positions show missing data points due to technical limitations in variant calling.

Candidate genes

A total of 211 and 166 genes were identified for whitefly resistance and flowers per plant, respectively, but only a subset was directly associated with these traits. BLASTX searches against TAIR were therefore used to identify, within the identified gene set, genes homologous to known whitefly resistance genes in Arabidopsis thaliana119. The mRNA sequence of the Mi-1.2 gene (NM_001247134.1) was used in a BLASTX search against the TAIR 10 protein database to identify homologous genes in A. thaliana, followed by a similar BLASTX search of the 211 G. hirsutum genes to identify homologs of known whitefly resistance genes, revealing 10 potential candidate genes for whitefly resistance (Supplementary Table 1). Additionally, 10 of the 166 genes were directly associated with flowers per plant120 (Supplementary Table 2).

Discussion

Our comprehensive QTL mapping study offers significant insights into the genetic architecture of whitefly resistance and agronomic traits in Gossypium hirsutum121. By conducting an extensive composite interval mapping analysis on an F₂ segregating population, we identified multiple stable QTLs that exhibit significant phenotypic effects122. These include five loci for NTM and two primary QTLs (qWC1_A04 and qFP1_D09) linked to WC and FP, respectively. The statistical robustness of these findings is substantiated by rigorous permutation testing (1000 iterations), with qNTM5_D08 standing out as particularly significant due to its high LOD score (8.67) and substantial contribution to phenotypic variance (27.93%). These results significantly advance our understanding of cotton genetics, with the chromosomal distribution patterns of identified QTLs both confirming known gene clusters for defense responses and revealing novel genetic associations123. Of particular interest is the qWC1_A04 locus on chromosome A04, which contains homologs of the well-characterized Mi-1.2 resistance gene from tomato, suggesting evolutionary conservation of defense mechanisms against sap-sucking insects across divergent plant species124.

The biological interpretation of our candidate gene analysis reveals a sophisticated, multi-tiered defense system in cotton. The presence of LRR domain-containing proteins (At4g27190 and RPPL1) within QTL intervals strongly suggests the operation of pathogen-associated molecular pattern (PAMP)-triggered immunity125, while XA21-like genes likely participate in intracellular kinase signaling cascades that amplify defense responses126. Furthermore, the co-localization of secondary metabolism genes with resistance QTLs indicates probable biosynthesis of antixenosis compounds such as terpenoids and flavonoids, which may deter whitefly feeding and oviposition127. For floral development traits, the identification of GUS1 within qFP1_D09 provides compelling evidence for hormonal regulation of flowering, as this gene’s known functions in auxin and cytokinin metabolism directly influence floral meristem initiation and differentiation128. Similarly, MBD4L’s critical role in managing oxidative stress likely protects developing reproductive tissues from pest-induced physiological damage, thereby maintaining yield potential under infestation pressure129.

These findings on whitefly resistance genes are in line with previous results, including the role of WRKY 40 and copper protein genes as hub genes against whitefly in cotton130, secondary metabolites131, defense proteins such as CYS6 in tobacco132, β-glucosidase in squash plants133, upregulation of β-1,3-glucanase, chitinase and peroxidase in tomato134, upregulation of polyphenol oxidase in cucumber135, and peroxidase and polyphenol oxidase in tomato and soybean plants136. While these findings represent significant advances, several methodological considerations warrant discussion. The F₂ population design, while valuable for initial QTL detection, presents certain limitations regarding effect size estimation accuracy due to residual heterozygosity137. Additionally, our single-environment study design precludes assessment of QTL stability across different growing conditions, an important consideration for breeding applications138. The relatively large physical intervals of some QTLs, particularly the 19.1 Mb span of qWC1_A04, necessitate future fine-mapping efforts to distinguish true causal variants from linked neutral polymorphisms139. These limitations notwithstanding, our results provide a strong foundation for subsequent research and practical applications.

Looking forward, several promising research directions emerge from this work. Development of recombinant inbred line populations will enable more precise QTL validation and effect size estimation, while CRISPR-Cas9 genome editing of prioritized candidate genes (particularly At4g27190 and RPPL1) can establish definitive proof of function35. Comprehensive field trials across multiple environments and growing seasons will be essential to evaluate QTL stability and genotype-by-environment interactions140. Complementary transcriptomic analysis comparing resistant and susceptible lines under infestation conditions will further elucidate the gene regulatory networks underlying these traits141. From a practical breeding perspective, these findings immediately enable marker-assisted selection using the robustly identified qWC1_A04 and qFP1_D09 loci, while also informing strategic pyramiding of complementary resistance mechanisms142. The well-supported candidate genes identified here also provide excellent targets for precision breeding approaches, including both transgenic and cisgenic strategies143. The integration of these genomic tools with conventional breeding methodologies promises to accelerate the development of whitefly-resistant cotton cultivars with improved yield stability and reduced pesticide dependence, addressing a critical need in sustainable cotton production systems worldwide144.

Conclusion

Our study reveals the complex genetic architecture underlying whitefly resistance and yield in cotton. By utilizing GBS, we identified significant QTLs and candidate genes associated with these traits. Genes such as At4g27190 and RPPL1, which are involved in biological processes such as ADP binding, protein binding, LRR-mediated signal transduction, cell adhesion, DNA repair, transcription, and immune responses in Arabidopsis, suggest their potential importance in cotton defense against sucking pests such as whiteflies. The genes associated with flower development, GUS1, and stress tolerance, MBD4L, contribute to increased flower production and overall plant vigor. The identification of these genes provides valuable insights for developing whitefly resistant cotton cultivars with improved yield potential. Marker-assisted selection and gene editing techniques can be employed to introduce these beneficial traits into elite cotton cultivars. To confirm the precise role of candidate genes in whitefly resistance and yield-related traits, further functional validation, such as gene expression analysis and overexpression and knockout experiments, is necessary. A deeper understanding of their molecular mechanisms, including gene and protein interactions, is essential. Combining whitefly resistance with other desirable traits, such as drought tolerance, heat stress tolerance and fiber quality, can lead to the development of multitrait-resistant cotton varieties. The research presented here provides a strong foundation for developing sustainable and high-yielding cotton cultivars that can effectively combat whitefly infestations and adverse environmental conditions.