Abstract
The HD-ZIP gene family plays a crucial role in plant growth, development, and responses to environmental stressors. Nevertheless, there exists a paucity of information regarding this gene family in Helianthus annuus (Sunflower). In the present investigation, a total of 55 putative HaHD-ZIP genes were identified and subsequently classified into four subfamilies based on phylogenetic analysis, further substantiated through the analysis of gene structures and conserved motifs. An analysis of the promoter regions of HaHD-ZIP genes revealed the existence of numerous diverse cis-regulatory elements. Furthermore, we identified 15,577 binding sites for the HD-ZIP transcription factor within the sunflower genome, distributed across 9,479 unique genes. The analysis of protein-protein interactions elucidated the existence of three distinct clusters of HaHD-ZIPs, within which A0A251U614, HaHD-ZIP48, and LBD1 proteins were identified as the most interactive proteins. Additionally, gene duplication analysis revealed that two genes were tandem duplicated, while eight genes were subjected to segmental duplication, underscoring the importance of these genes in the expansion of the HD-ZIP gene family. Expression analysis indicated notable upregulation of the HaHD-ZIP4 gene compared to other analyzed genes under water deficit stress conditions. The results will provide significant insights into the further functional characterization of drought-responsive HaHD-ZIP genes in Sunflower.
Introduction
The world population is expected to reach 9 billion by 20501. However, crops are negatively affected by various abiotic and biotic stresses2. To cope with these challenges, plants have evolved several mechanisms3. Transcription factors play a crucial role in regulating abiotic and biotic stress responses. For example, a transcription factor can bind to the cis-element region of a target gene to activate or repress its expression, and overexpression of a transcription factor can induce or suppress downstream genes2. Among the transcription factors, HD-ZIP proteins are specific to plants and have been reported to be involved in abiotic stress responses4. The HD-ZIP protein family comprises a homeodomain (HD) linked to a leucine zipper motif (LZ), which are responsible for DNA binding and protein dimerization, respectively5. The HD-ZIP proteins are further classified into four subfamilies (I-IV) based on the conserved HD-ZIP domains, gene structures, and functions6. The HD-ZIP I subfamily has a highly conserved HD and a less conserved LZ, while the HD-ZIP II subfamily contains a highly conserved HD and LZ, along with two additional motifs: a CPSCE (Cys, Pro, Ser, Cys, and Glu) motif downstream of the LZ motif and an N-terminal consensus sequence. HD-ZIP I and II proteins bind to similar pseudo-palindromic sequences (CAATNATTG) but with different central nucleotides: A/T in subfamily I and C/G in subfamily II7. Both the HD-ZIP III and HD-ZIP IV subfamilies contain the START (STeroidogenic Acute Regulatory protein-related lipid Transfer) domain and the START-adjacent domain (SAD). However, only the HD-ZIP III subfamily has an additional MEKHLA domain, which is absent from the HD-ZIP IV subfamily8. Members of the HD-ZIP III family recognize the sequence GTAAT(G/C)ATTAC, while HD-ZIP IV proteins bind to the sequence TAAATG(C/T)A9. Several studies have examined the functions of HD-ZIP genes.
HD-ZIP I proteins are involved in responses to abiotic stress, abscisic acid (ABA) signaling, de-etiolation, blue light signaling, plant embryogenesis, and the regulation of plant growth and development processes10. The HD-ZIP II subfamily genes respond to light, shading, and auxin signaling11. The HD-ZIP III subfamily is reported to be involved in embryogenesis, meristem regulation, lateral organ initiation, leaf polarity12, vascular system development13, and auxin transport14. HD-ZIP IV proteins play a crucial role in the differentiation of epidermal cells, trichome formation, root development, and anthocyanin accumulation15. Genome-wide analyses have identified members of the HD-ZIP gene family in numerous plant species, including soybean (Glycine max)16, wheat (Triticum aestivum L.)17, maize (Zea mays)18, peach (Prunus persica)19, potato (Solanum tuberosum)20, sesame21, and rice (Oryza sativa)22. Sunflower (Helianthus annuus L.) is an important oilseed crop worldwide and is particularly noteworthy for its ability to thrive in water-deficit conditions; however, its grain quality, oil content, and fatty acid composition can be affected by water deficit during growth23,24. It is therefore important to study sunflower for better yield and quality, and identifying elite genes can be advantageous. Several transcription factor families, including AP2/ERF25, WRKY26, WOX27, and NAC28, have been analyzed in sunflower, but there is still no genome-wide characterization of the HD-ZIP family in this species.
In this study, we identified the HD-ZIP gene family in sunflower and characterized it through bioinformatics analyses. The detailed analysis included evolutionary relationships, chromosomal locations, gene structures, conserved motifs, subcellular localization, cis-acting elements, selection pressure for duplicated gene pairs, and protein-protein interactions.
Moreover, the expression patterns of selected HD-ZIP genes in Helianthus annuus leaves were investigated by quantitative real-time PCR (qPCR) in response to water deficit stress. These findings offer a valuable perspective for future research on the functions of stress-responsive HD-ZIPs in sunflower.
Results
Identification and sequence analysis of HD-ZIP genes in sunflower (Helianthus annuus L.)
Fifty-five non-redundant HD-ZIP genes were identified in sunflowers, labeled as HaHD-ZIP1–55, based on their physical location on chromosomes 1–17. These genes were then used for further analyses. The lengths of the protein sequences for the identified HD-ZIP genes in Helianthus annuus vary and range from 181 amino acids for HaHD-ZIP7 to 848 amino acids for both HaHD-ZIP11 and HaHD-ZIP50.
Correspondingly, the molecular weights of these proteins ranged from 21.23 kDa to 93.11 kDa, and their isoelectric points were between 4.53 and 9.33. Additional parameters of the nucleic acid and protein sequences of the HaHD-ZIP proteins and their physicochemical characteristics are shown in Supplementary Table S1 and Table S2, respectively. Subcellular localization analysis showed that most of the identified HaHD-ZIP proteins were predicted to localize to the nucleus. However, HaHD-ZIP4 was predicted to be located in the cytoplasm, and some proteins were predicted in both the nucleus and cytoplasm (Supplementary Table S3).
Multiple sequence alignment and phylogenetic analysis
To investigate the classification of HD-ZIP proteins in H. annuus, a phylogenetic tree was constructed using the HD-ZIP protein sequences of Arabidopsis and H. annuus (Fig. 1). The results indicated that the HD-ZIP protein family was phylogenetically divided into four subfamilies (HD-ZIP I–IV) in both species. The HD-ZIP I subfamily has the largest number of members, comprising 19 HaHD-ZIP and 17 AtHD-ZIP proteins, while the HD-ZIP III subfamily is the smallest, with only 9 HaHD-ZIP and 5 AtHD-ZIP members. The HD-ZIP II subfamily comprises 16 HaHD-ZIP and 10 AtHD-ZIP members, whereas the HD-ZIP IV subfamily contains 11 HaHD-ZIP and 16 AtHD-ZIP members. It is likely that HD-ZIP genes with conserved functions tend to cluster within the same subgroup and may share a common evolutionary origin.
Phylogeny of the HaHD-ZIP gene family. The phylogenetic trees of the HD-ZIP gene family were constructed using all 55 HaHD-ZIP genes and the HD-ZIP genes of Arabidopsis thaliana (At) as outgroup. (a): Phylogenetic analysis of HD-ZIP genes in common sunflower. (b): A phylogenetic tree of Sunflower and Arabidopsis HD-ZIP proteins. The different-colored branches showed different subfamilies. The red and blue circles represent HD-ZIP genes from Sunflower and Arabidopsis, respectively. The phylogenetic tree was constructed using the neighbor-joining (NJ) method with 1,000 bootstrap replications.
Analysis of gene structures and conserved domains
To further investigate the evolutionary relationship of sunflower HD-ZIP proteins, the conserved motifs of the HD-ZIP gene family in sunflower were analyzed using the MEME tool (Fig. 2b). Ten conserved motifs were identified. Motif1 and Motif2, which belong to the homeodomain (HD), and Motif3, which corresponds to the leucine zipper (LZ), were present in all HaHD-ZIP genes. In addition, genes within the same subfamily had similar motif structures. For example, Motifs 4, 6, and 10, associated with the START domain, were present in both the HD-ZIP III and HD-ZIP IV subfamilies. However, Motifs 5 and 9, which correspond to the MEKHLA domain, were found only in subfamily III. These group-specific motifs classified the HD-ZIP gene family into four distinct subfamilies, which corresponded to the division in the phylogenetic analysis (Fig. 2a), and may imply various functions of the HD-ZIP family in sunflower.
Gene structure analysis showed that members of the same subfamily have a similar pattern in exon-intron profiles and exon-intron numbers (Fig. 2c). For example, the HaHD-ZIPI and HaHD-ZIPII genes had 2–4 exons, whereas the HaHD-ZIPIV genes contained 6–12 exons. The genes of HaHD-ZIPIII subfamily, with 16–18 exons, had the highest number of exons and were longer than other subfamilies (Fig. 2c).
Gene structures and conserved motifs of HaHD-ZIP genes. (a) Phylogenetic tree of HaHD-ZIP genes. (b) The distribution of conserved motifs across HaHD-ZIP genes. (c) The exon-intron structures of HaHD-ZIP genes. Orange and green boxes represent untranslated regions (UTRs) and coding sequences (CDSs), respectively. Black lines indicate introns. The different-colored branches in the phylogenetic tree represent different subfamilies: subfamily I (green), subfamily II (blue), subfamily III (red), and subfamily IV (purple).
Chromosomal location
Fifty-five HaHD-ZIP genes were unevenly distributed across all 17 sunflower chromosomes (Fig. 3). Chromosome 17, with six genes, had the highest gene count, while chromosomes 6, 7, 8, and 10 had the lowest gene count with just one gene. Chromosomes 3, 11, and 14 contained two genes, chromosomes 1 and 4 had three genes, chromosomes 2 and 16 had four genes, and chromosomes 5, 9, 12, 13, and 15 had five genes.
Chromosomal location and distribution of HaHD-ZIP genes in sunflower. Tandem duplicated genes are marked with a red box. Different colors represent different subfamilies: subfamily I (green), subfamily II (blue), subfamily III (red), and subfamily IV (purple). The left axis shows the length of each chromosome in megabase pairs (Mb).
Cis-regulatory elements analysis
Several cis-elements were identified in the HaHD-ZIP promoters (Fig. 4). Apart from conventional promoter elements such as the CAAT-box and TATA-box, the elements were classified into four functional categories: nine stress-related, 12 hormone-responsive, 28 light-responsive, and 10 development-related elements. Of these, hormone-responsive elements were the most common. Among the stress-responsive elements, ARE, which is involved in anaerobic induction, was abundant. Among hormone-related elements, ABRE, which is involved in the response to abscisic acid, was the most common. CAT-box, a cis-acting regulatory element related to meristem expression, and G-box, a cis-acting regulatory element involved in light responsiveness, were abundant among development-related and light-responsive elements, respectively.
Syntenic, collinearity, and Ka/Ks analysis
Investigation of gene duplication events is essential for understanding the expansion and evolution of gene families. Tandem duplication is characterized by two or more genes located within 200 kb of each other on the same chromosome. Segmental duplication, by contrast, involves genes located on different chromosomes, or on the same chromosome but at greater distances29. The analysis of gene duplication within sunflower HD-ZIP genes indicated that four gene pairs underwent segmental duplication (HaHD-ZIP7/HaHD-ZIP40, HaHD-ZIP8/HaHD-ZIP37, HaHD-ZIP18/HaHD-ZIP51, and HaHD-ZIP25/HaHD-ZIP46), while only a single gene pair (HaHD-ZIP16/HaHD-ZIP17) was tandemly duplicated (Fig. 5a). The findings suggested that four members of HaHD-ZIPIV are implicated in segmental duplication, indicating their expansion during evolution, and they may fulfill a critical function in sunflower.
The ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates was computed to estimate the selection pressure on the duplicated HaHD-ZIP gene pairs. A Ka/Ks ratio greater than 1, equal to 1, or less than 1 corresponds to positive, neutral, or purifying (negative) selection, respectively30. In the present study, the Ka/Ks values ranged from 0.05 to 0.27, all below 1, implying that purifying selection is the predominant factor influencing the divergence of HaHD-ZIP genes. Furthermore, segmental duplication events are estimated to have occurred between 12.4 and 40.11 million years ago, while the tandem duplication occurred around 27.5 million years ago (Supplementary Table S4).
In addition, the collinearity of the HD-ZIP gene family between sunflower and Arabidopsis was investigated to elucidate the evolutionary relationships between these two species (Fig. 5b). The results demonstrated that 12 HaHD-ZIP genes displayed collinearity with 9 AtHD-ZIP genes, establishing 13 orthologous gene pairs and suggesting a relatively high degree of homology between the HD-ZIP genes of sunflower and Arabidopsis.
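The Ka/Ks interpretation and the dating of duplication events can be sketched as follows. This is a minimal illustration, not the authors' calculation: the text does not state how divergence times were derived, so the commonly used relation T = Ks / (2r) is assumed here, and the substitution rate `r` and the example Ka and Ks values are hypothetical placeholders.

```python
def classify_selection(ka, ks):
    """Classify selection from Ka and Ks using the thresholds given in the text."""
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio == 1:
        return "neutral"
    return "purifying"


def divergence_time_mya(ks, rate):
    """Estimate divergence time in million years, assuming T = Ks / (2 * rate).

    `rate` is the synonymous substitution rate per site per year; it must be
    supplied by the user because the source text does not report one.
    """
    return ks / (2.0 * rate) / 1e6


# Hypothetical values for illustration only (not taken from Supplementary Table S4).
example_ka, example_ks = 0.05, 0.5
example_rate = 6e-9  # assumed rate, substitutions per site per year

selection = classify_selection(example_ka, example_ks)
time_mya = divergence_time_mya(example_ks, example_rate)
print(selection, round(time_mya, 2))
```

Keeping the rate as an explicit argument makes the dependence of the estimated age on that assumption visible: doubling the rate halves the estimated time.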
Collinearity analysis (a) Collinearity analysis of sunflower HD-ZIP genes. Gray lines suggest all synteny blocks in the sunflower genome, and the red lines indicate segmental duplicated HD-ZIP gene pairs. (b) Collinearity analysis of HD-ZIP genes between sunflower and Arabidopsis. Gray lines in the background indicate the collinear blocks within the sunflower and Arabidopsis genomes, while the red lines highlight collinear pairs of HD-ZIPs.
Transcription factor binding site prediction
A total of 53 unique profiles (HD-ZIP motifs) were retrieved by searching the Arabidopsis HD-ZIP binding site profiles in the JASPAR CORE database. When these motifs were scanned against the upstream regions of all sunflower genes, 24 of the 53 profiles were detected, with varying frequencies across the sunflower genome. The highest frequency was found for MA1327.1, a 22-nucleotide profile, which accounted for 8,156 binding sites. In total, the search revealed 15,577 binding sites for HD-ZIP transcription factors, spread across 9,479 distinct genes in the sunflower genome. The number of binding sites per gene ranged from one to five, as detailed in Supplementary Table S5.
Protein-protein interaction analysis
The STRING tool was employed to assess the protein-protein interaction network of the HaHD-ZIP proteins (Fig. 6). The HaHD-ZIP proteins were grouped into three distinct clusters. The first cluster, shown in red, was the largest, comprising 17 proteins, followed by the second cluster, shown in green, with 12 proteins. The third cluster, shown in blue, consisted of eight proteins. Within the first cluster, A0A251U614 emerged as the central hub protein, while HaHD-ZIP48 and LBD1 were identified as the main interactive nodes in the second cluster, and HaHD-ZIP34 served as the hub protein in the third cluster.
RNA extraction and qRT-PCR analysis
To investigate the expression of the identified HaHD-ZIP genes, eight genes were selected and their expression levels in leaves under drought stress were measured by quantitative polymerase chain reaction (qPCR). Based on the phylogenetic tree of HaHD-ZIP and AtHD-ZIP proteins, together with the evaluation of promoter elements, two HaHD-ZIP genes from each phylogenetic cluster were selected for the qPCR experiment. HaHD-ZIP4, HaHD-ZIP8, and HaHD-ZIP14 were up-regulated following two weeks of drought stress, whereas the remaining five genes (HaHD-ZIP6, HaHD-ZIP15, HaHD-ZIP33, HaHD-ZIP39, and HaHD-ZIP41) were down-regulated under the same condition. The HaHD-ZIP14 gene exhibited the highest expression, with a 6.6-fold increase, followed by the HaHD-ZIP4 gene, with a 4.1-fold increase. Conversely, the lowest expression was observed for the HaHD-ZIP41 gene. The expression level of HaHD-ZIP8 showed only a slight increase under 20% PEG treatment compared to the control (Fig. 7).
Discussion
The sunflower (Helianthus annuus), an oilseed crop cultivated globally, is recognized for its moderate resistance to drought stress31. The HD-ZIP gene family, a group of transcription factors, is exclusively found in the plant kingdom and plays a crucial role in plant growth, development, and responses to abiotic stressors. Despite extensive research on the HD-ZIP family across various plant species, there exists a paucity of information concerning this gene family in sunflowers. In the present investigation, a total of 55 HaHD-ZIP genes were identified within the sunflower genome, in contrast to 48 in Arabidopsis thaliana, 46 in Triticum aestivum, 43 in Solanum tuberosum, 63 in Populus trichocarpa, and 88 in Glycine max7,10,16,17,20. Consequently, it can be inferred that the number of HD-ZIP genes does not exhibit a direct linear correlation with the genome size of the aforementioned species. A phylogenetic tree was constructed using 55 HaHD-ZIP and 48 AtHD-ZIP genes, classified into four distinct subfamilies based on their structural similarities. Each subgroup contained distinct conserved domains. For example, the MEKHLA domain is exclusively present in subfamily III, and this conserved motif architecture corroborates the classification outcomes concerning HaHD-ZIP genes. Furthermore, the structure of genes within the same subgroup demonstrated more similarity than that of other groups. As a result, the observed variations among the groups may influence the functional divergence of HD-ZIP subfamilies in the sunflower and provide insight into the evolutionary relationships of the identified 55 HaHD-ZIP genes. Gene duplication represents an evolutionary process that leads to the emergence of novel genes, thereby enabling organisms to adapt to a multitude of environmental conditions32,33. Genes that share the same phylogenetic branch exhibit similar functions in a multi-species phylogenetic tree34. 
The position of the HaHD-ZIP sequences in the phylogenetic tree alongside those from Arabidopsis can be used to infer the possible functions of these proteins. Gene expression analyses and transgenic plant studies have demonstrated that ATHB5 (AT5G65310) and ATHB7 (AT2G46680), two representatives of the Arabidopsis HD-ZIPI subfamily, were downregulated and upregulated, respectively, under water-deficit conditions7,35. In our analysis, these two AtHD-ZIP genes are positioned within the same subfamily as HaHD-ZIP41 and HaHD-ZIP33, suggesting that these genes may exhibit similar functional attributes. The expression level of HaHD-ZIP41 was observed to decline under drought stress. Previous studies have shown that the HD-ZIP II protein HAT1 (AT4G17460) is instrumental in regulating drought tolerance in Arabidopsis. Furthermore, HAT22 (AT4G37790), another member of the Arabidopsis HD-ZIPII subfamily, has been reported to be up-regulated under drought stress conditions4,36. HaHD-ZIP8 exhibited a slight increase in expression under drought stress. The overexpression of the HD-ZIPIV subfamily member HDG11 (AT1G73360) in wild-type Arabidopsis leads to enhanced drought tolerance37. HaHD-ZIP14 was substantially up-regulated under drought stress. To further investigate the expression profiles of HaHD-ZIP genes under drought stress, the selected HaHD-ZIP genes, which contain MBS and TC-rich repeat cis-acting elements in their promoters, were examined. Cis-elements are crucial to the regulation of gene expression38,39. TC-rich repeats have been recognized as cis-elements that are inducible by dehydration and drought stress conditions40. Both the MBS elements, associated with drought stress response, and the TC-rich repeat elements were identified within the HaHD-ZIP4 gene, which belongs to subfamily III. This gene exhibited the highest expression level relative to other studied genes.
Previous investigations reported that HAHB4, a member of the HD-ZIPI subfamily in sunflower that is similar to the ATHB7 and ATHB12 genes in Arabidopsis, is positively regulated by drought, ethylene, and abscisic acid (ABA)41. Furthermore, overexpression of the sunflower HaHB1 gene has been shown to enhance drought tolerance42. Based on the collinearity analysis performed in our study, HaHD-ZIP50 and HaHD-ZIP53, referred to as HAHB4 and HAHB1 in previous studies, were identified as orthologous to the ATHB12 and ATHB13 genes in Arabidopsis, respectively. These findings imply that HaHD-ZIP4 and HaHD-ZIP14 may be pivotal in regulatory responses to drought stress in sunflower.
Conclusion
This investigation elucidated the presence of 55 HaHD-ZIP genes within the Helianthus annuus genome. Phylogenetic analysis categorized these genes into four discrete subfamilies, a finding further corroborated by analysis of their conserved motifs and structural composition. Numerous cis-elements were identified within the promoter regions of these genes, several of which play crucial roles in abiotic stress responses. The HaHD-ZIP genes were additionally subjected to an analysis of their chromosomal positioning and syntenic relationships. Moreover, analysis of expression profiles of selected HaHD-ZIP genes under water deficit stress conditions suggested that HaHD-ZIP4 and HaHD-ZIP14 may have pivotal roles in the plant’s responses to drought stress. In summary, HaHD-ZIP4 and HaHD-ZIP14 genes may have a positive regulatory role in drought stress, offering key candidate genes for enhancing resistance in sunflower breeding programs. This research contributes novel insights into the functional characterization of HD-ZIP genes in the sunflower.
Methods
Identification and sequence analysis of HD-ZIP genes in sunflower (Helianthus annuus L.)
The sequences of Arabidopsis HD-ZIP proteins were acquired from The Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org/). The Helianthus annuus genome database (https://phytozome-next.jgi.doe.gov/info/Hannuus_r1_2) was searched to identify HD-ZIP proteins using BLASTP with the 48 HD-ZIP protein sequences of Arabidopsis as query sequences. The hidden Markov model (HMM) profiles of the conserved homeobox domain (HD, PF00046) and the leucine zipper domain (LZ, PF02183) were retrieved from the PFAM database43. These profiles were used to search all Helianthus annuus protein sequences with the HMMER search tool44. Subsequently, all obtained protein sequences were examined using NCBI-CDD45 and SMART46 to validate the HD and LZ domains. The final dataset was obtained after removing redundant sequences. The physicochemical properties of the proteins, including molecular weight (Mw) and isoelectric point (pI), were obtained with the ProtParam online tool47. The subcellular localization of the HD-ZIP proteins was predicted using WoLF PSORT48 and DeepLoc49.
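The domain-based filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `domain_hits` and the protein identifiers are hypothetical stand-ins for parsed HMMER output, and only the two Pfam accessions come from the text.

```python
# Pfam accessions named in the text: homeodomain and leucine zipper.
REQUIRED_DOMAINS = {"PF00046", "PF02183"}


def filter_hd_zip_candidates(domain_hits):
    """Return sorted protein IDs whose domain hits include both required domains.

    `domain_hits` maps a protein ID to the set of Pfam accessions found in it
    (a hypothetical, already-parsed representation of HMMER results).
    """
    return sorted(
        protein_id
        for protein_id, domains in domain_hits.items()
        if REQUIRED_DOMAINS <= set(domains)
    )


# Toy input with made-up identifiers, for illustration only.
example_hits = {
    "protA": {"PF00046", "PF02183"},
    "protB": {"PF00046"},  # homeodomain only, so it is rejected
    "protC": {"PF00046", "PF02183", "PF_EXTRA"},  # extra domains are allowed
}
candidates = filter_hd_zip_candidates(example_hits)
print(candidates)
```

Requiring both domains mirrors the validation step in the text, where candidates are retained only if the HD and LZ domains are both confirmed.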
Multiple sequence alignment and phylogenetic analysis
To understand the phylogeny and evolution of the HaHD-ZIP gene family, we constructed two phylogenetic trees using MEGA 11 software (https://www.megasoftware.net)50. One tree covered only the HaHD-ZIP gene family, while the other included the HD-ZIP genes of Helianthus annuus together with those of Arabidopsis. All protein sequences were aligned using MUSCLE, and the trees were constructed using the neighbor-joining (NJ) method with the Poisson model, pairwise deletion, and 1,000 bootstrap replicates. The online tool iTOL51 was used to visualize the phylogenetic trees.
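The distance measure underlying this tree-building setup can be sketched as follows. This is an illustrative reimplementation of the Poisson-corrected distance, d = −ln(1 − p), with pairwise deletion of gapped sites; it is not MEGA's code, and the two aligned sequences are made-up examples.

```python
import math


def poisson_distance(seq_a, seq_b, gap_chars="-"):
    """Poisson-corrected distance between two aligned protein sequences.

    Sites where either sequence has a gap are skipped for this pair only
    (pairwise deletion); p is the proportion of differing sites among the rest.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion of gapped sites
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differences / compared
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


# Hypothetical aligned fragments: one gapped column and one substitution.
distance = poisson_distance("MKR-LLEA", "MKRQLIEA")
print(distance)
```

A matrix of such pairwise distances is the input that the neighbor-joining algorithm clusters into a tree.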
Analysis of gene structures and conserved domains
The intron-exon structures of sunflower HD-ZIP genes were analyzed using the online Gene Structure Display Server52 based on the coding sequences (CDS), the corresponding genomic sequences in FASTA format, and the Newick file generated by MEGA 11, so that the HaHD-ZIP genes were displayed in phylogenetic order. The conserved motifs in sunflower HaHD-ZIP TFs were analyzed using the MEME program53 with the following parameters: an optimum motif width of 6 to 50 amino acids, any number of motif repetitions, and a maximum of 10 motifs.
Chromosomal location
Chromosome information was obtained from Phytozome, and the locations of the HD-ZIP genes on sunflower chromosomes were determined using MapGene2Chrom Webserver (MG2C v2.1) software (http://mg2c.iask.in/mg2c_v2.0/)54 and TBtools v2 (https://www.xiaohongshu.com/)55. TBtools provided the chromosome lengths and positions of HD-ZIP genes, which were then input into the MG2C software for visualization.
Cis-regulatory elements analysis
To analyze the promoter sequences of the HD-ZIP genes, the 2,000 base pairs upstream of the 5′ UTR were obtained from Phytozome. These sequences were analyzed using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/)56 to identify the cis-acting regulatory elements in HaHD-ZIPs. The distribution map of the promoter elements was generated using TBtools software.
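Defining an upstream promoter window depends on the strand of the gene. The sketch below shows one way to compute the coordinates of a 2,000 bp upstream interval; the function name, the 1-based inclusive coordinate convention, and the example coordinates are assumptions for illustration and do not describe how Phytozome extracts sequences.

```python
def upstream_interval(start, end, strand, length=2000, chrom_length=None):
    """Return the (start, end) of the upstream window, 1-based and inclusive.

    `start` and `end` delimit the transcript on the forward coordinate axis.
    For '+' genes the window lies before `start`; for '-' genes it lies
    after `end`. The window is clipped to the chromosome boundaries.
    """
    if strand == "+":
        up_start = max(1, start - length)
        up_end = start - 1
    elif strand == "-":
        up_start = end + 1
        up_end = end + length
        if chrom_length is not None:
            up_end = min(up_end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    return up_start, up_end


# Hypothetical coordinates for illustration.
plus_window = upstream_interval(5000, 7000, "+")
minus_window = upstream_interval(5000, 7000, "-")
clipped_window = upstream_interval(500, 900, "+")
print(plus_window, minus_window, clipped_window)
```

For a minus-strand gene, the extracted sequence would also need to be reverse-complemented before motif analysis so that it reads 5′ to 3′ relative to the gene.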
Syntenic, collinearity, and Ka/Ks analysis
The intra-species collinear relationship and gene duplication events in HD-ZIP members were analyzed using the Multiple Collinearity Scan toolkit (MCScanX) and Circos in TBtools software. Additionally, to investigate the collinear relationship between orthologous HD-ZIP genes of Helianthus annuus and Arabidopsis thaliana, TBtools’ Dual Synteny Plotter was used to generate synteny analysis maps. Tandem duplication events were defined as chromosomal regions containing two or more genes within 200 kb. In contrast, segmental duplication refers to the duplication of genes on different chromosomes or within the same chromosome, but not one following the other29. To calculate the synonymous (Ks), nonsynonymous (Ka) mutation rates, and Ka/Ks values of the duplicated HD-ZIP gene pairs, the Simple Ka/Ks Calculator within TBtools was used.
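The distance rule used above to separate tandem from segmental duplicates can be sketched as follows. This is a simplified illustration that applies only the 200 kb criterion from the text; it does not reproduce the collinearity detection performed by MCScanX, and the gene coordinates are hypothetical.

```python
TANDEM_MAX_DISTANCE = 200_000  # 200 kb threshold stated in the text


def classify_duplicate_pair(gene_a, gene_b):
    """Classify a duplicated gene pair as 'tandem' or 'segmental'.

    Each gene is a (chromosome, start, end) tuple. Pairs on the same
    chromosome separated by at most 200 kb are called tandem; all other
    pairs are called segmental.
    """
    chrom_a, start_a, end_a = gene_a
    chrom_b, start_b, end_b = gene_b
    if chrom_a != chrom_b:
        return "segmental"
    # Gap between the two genes; zero if they overlap.
    gap = max(0, max(start_a, start_b) - min(end_a, end_b))
    return "tandem" if gap <= TANDEM_MAX_DISTANCE else "segmental"


# Hypothetical coordinates for illustration only.
near_pair = classify_duplicate_pair(("chr5", 1_000_000, 1_003_000),
                                    ("chr5", 1_050_000, 1_052_000))
far_pair = classify_duplicate_pair(("chr5", 1_000_000, 1_003_000),
                                   ("chr5", 9_000_000, 9_004_000))
cross_pair = classify_duplicate_pair(("chr5", 1_000_000, 1_003_000),
                                     ("chr12", 1_050_000, 1_052_000))
print(near_pair, far_pair, cross_pair)
```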
Transcription factor binding site prediction
To identify the binding sites of HD-ZIP transcription factors across the sunflower genome, the 1,000 base pair upstream regions of sunflower genes were retrieved using the RSAT Plants software57. The Arabidopsis HD-ZIP position weight matrices (PWMs) were obtained in MEME format from the JASPAR CORE database58. The upstream region of each gene was scanned against each PWM to identify HD-ZIP transcription factor binding sites, using the Find Individual Motif Occurrences tool (FIMO version 5.5.7)59 with a p-value threshold of less than 1e-6.
Protein-protein interaction analysis
To evaluate the PPI network of the HaHD-ZIP proteins, their sequences were submitted to the STRING online database60 to construct interaction networks. The minimum required interaction score was set to 0.4, and after the network was created, the k-means clustering algorithm was applied.
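Summarizing scan results into a total number of sites and a number of distinct genes can be sketched as follows. The sketch assumes FIMO's tab-separated output with its standard column names (`motif_id`, `sequence_name`, `p-value`, and so on); the rows, gene names, scores, and matched sequences are made up for illustration, and only the motif ID MA1327.1 and the 1e-6 threshold come from the text.

```python
import csv
import io
from collections import Counter


def count_sites_per_gene(tsv_text, p_threshold=1e-6):
    """Count motif hits per sequence name, keeping hits with p-value below the threshold."""
    lines = [
        line for line in tsv_text.splitlines()
        if line.strip() and not line.startswith("#")  # skip blank and comment lines
    ]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    counts = Counter()
    for row in reader:
        if float(row["p-value"]) < p_threshold:
            counts[row["sequence_name"]] += 1
    return counts


header = ["motif_id", "motif_alt_id", "sequence_name", "start", "stop",
          "strand", "score", "p-value", "q-value", "matched_sequence"]
# Hypothetical hits; the placeholder matched sequences are 22 characters long.
rows = [
    ["MA1327.1", "", "geneA", "120", "141", "+", "15.2", "2.1e-7", "0.01", "N" * 22],
    ["MA1327.1", "", "geneA", "640", "661", "-", "14.1", "8.0e-7", "0.03", "N" * 22],
    ["MA1327.1", "", "geneB", "300", "321", "+", "11.0", "5.0e-6", "0.20", "N" * 22],
    ["MA1327.1", "", "geneC", "910", "931", "+", "17.8", "3.3e-8", "0.001", "N" * 22],
]
example_tsv = "\n".join("\t".join(fields) for fields in [header] + rows)

site_counts = count_sites_per_gene(example_tsv)
total_sites = sum(site_counts.values())
unique_genes = len(site_counts)
print(total_sites, unique_genes)
```

The two summary numbers correspond to the kinds of totals reported in the Results: the overall count of binding sites and the count of distinct genes carrying at least one site.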
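One simple way to find the most connected proteins in such a network is to count each node's interaction partners after applying the score cutoff. The sketch below is an assumption about how hubs could be ranked, since the text does not state the criterion used; the edge list and protein labels are hypothetical, and only the 0.4 cutoff comes from the text.

```python
from collections import Counter

MIN_SCORE = 0.4  # minimum required interaction score stated in the text


def node_degrees(edges, min_score=MIN_SCORE):
    """Count interaction partners per protein, keeping edges at or above the cutoff."""
    degrees = Counter()
    for protein_a, protein_b, score in edges:
        if score >= min_score:
            degrees[protein_a] += 1
            degrees[protein_b] += 1
    return degrees


# Hypothetical edge list (protein, protein, combined score) for illustration.
example_edges = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.70),
    ("P1", "P4", 0.45),
    ("P2", "P3", 0.30),  # below the cutoff, so it is ignored
    ("P3", "P4", 0.50),
]
degrees = node_degrees(example_edges)
hub, hub_degree = degrees.most_common(1)[0]
print(hub, hub_degree)
```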
Plant materials, growth conditions, and abiotic stress treatments
Seeds of an inbred line of Helianthus annuus L. were planted in pots filled with a mixture of 50% perlite and 50% soil in a glasshouse at the University of Tabriz, Iran, at 26 ± 1 °C day/21 ± 1 °C night temperature, with a 16-hour photoperiod and an 8-hour dark period. After four true leaves had developed, the seedlings were exposed to two concentrations of polyethylene glycol-6000 (0% and 20% w/v). Two weeks later, the leaves of the sunflower seedlings were collected, immediately frozen in liquid nitrogen, and then stored at −80 °C for further use.
RNA extraction and qRT-PCR analysis
Total RNA was isolated from H. annuus leaves using an RNX-Plus kit (SINACLON, Iran). The quantity and purity of RNA were assessed using a NanoDrop instrument (NanoDrop OneC, Thermo Scientific, USA). One µg of RNA was reverse-transcribed into cDNA using a reverse transcription kit. Eight HaHD-ZIP genes, with Actin as the reference gene, were analyzed by quantitative real-time polymerase chain reaction (qRT-PCR). Primers were designed using Primer3web v4.1 software (https://primer3.ut.ee/), and their specificity was confirmed using the NCBI Primer-BLAST tool. qRT-PCR was performed in a volume of 20 µL per well using the YTA SYBR Green qPCR Master Mix 2X Kit on an ABI StepOnePlus real-time instrument. The three-step amplification conditions were: 95 °C for 3 min, followed by 40 cycles of 95 °C for 10 s, annealing for 10 s (the exact annealing temperature depended on the primers used), and 72 °C for 20 s. Each reaction was run in three replicates. Gene expression levels were calculated using the 2^(−ΔΔCt) method. All primer sequences are listed in Supplementary Table S6.
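The relative quantification by the 2^(−ΔΔCt) method can be sketched as follows. The function and argument names are hypothetical, and the Ct values are made-up numbers chosen for illustration, not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    dCt = Ct(target) - Ct(reference) within each condition;
    ddCt = dCt(treated) - dCt(control); fold change = 2 ** (-ddCt).
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** -delta_delta_ct


# Hypothetical mean Ct values: the target crosses threshold two cycles earlier
# under stress, while the reference gene is unchanged.
fold_change = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold_change)
```

A value above 1 indicates up-regulation relative to the control and a value below 1 indicates down-regulation. The method assumes roughly equal, near-100% amplification efficiency for the target and reference primers.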
Data availability
The data generated or analyzed in this study are included in this manuscript and its supplementary information files. Other materials that support the findings of this study are available from the corresponding author upon reasonable request.
References
United Nations. Population division of the department of economic and social affairs of the United Nations Secretariat (2024). https://population.un.org/wpp/
Baillo, E. H., Kimotho, R. N., Zhang, Z. & Xu, P. Transcription factors associated with abiotic and biotic stress tolerance and their potential for crops improvement. Genes 10, 771 (2019).
Mantri, N., Patade, V., Penna, S., Ford, R. & Pang, E. Abiotic stress responses in plants: present and future. Abiotic stress responses in plants: metabolism, productivity and sustainability, 1–19 (2012).
Li, Y. et al. The roles of HD-ZIP proteins in plant abiotic stress tolerance. Front. Plant Sci. 13, 1027071 (2022).
Capella, M., Ribone, P. A., Arce, A. L. & Chan, R. L. In Plant Transcription Factors 113–126 (Elsevier, 2016).
Roodbarkelari, F. & Groot, E. P. Regulatory function of homeodomain-leucine zipper (HD‐ZIP) family proteins during embryogenesis. New Phytol. 213, 95–104 (2017).
Ariel, F. D., Manavella, P. A., Dezar, C. A. & Chan, R. L. The true story of the HD-Zip family. Trends Plant Sci. 12, 419–426 (2007).
Mukherjee, K. & Burglin, T. R. MEKHLA, a novel domain with similarity to PAS domains, is fused to plant homeodomain-leucine zipper III proteins. Plant Physiol. 140, 1142–1150 (2006).
Henriksson, E. et al. Homeodomain leucine zipper class I genes in Arabidopsis. Expression patterns and phylogenetic relationships. Plant Physiol. 139, 509–518 (2005).
Hu, R. et al. Genome-wide identification, evolutionary expansion, and expression profile of homeodomain-leucine zipper gene family in Poplar (Populus trichocarpa). PLoS One 7, e31149 (2012).
Harris, J. C., Hrmova, M., Lopato, S. & Langridge, P. Modulation of plant growth by HD-Zip class I and II transcription factors in response to environmental stimuli. New Phytol. 190, 823–837 (2011).
Elhiti, M. & Stasolla, C. Structure and function of homodomain-leucine zipper (HD-Zip) proteins. Plant Signal. Behav. 4, 86–88 (2009).
Zhu, Y., Song, D., Sun, J., Wang, X. & Li, L. PtrHB7, a class III HD-Zip gene, plays a critical role in regulation of vascular cambium differentiation in Populus. Mol. Plant. 6, 1331–1343 (2013).
Turchi, L., Baima, S., Morelli, G. & Ruberti, I. Interplay of HD-Zip II and III transcription factors in auxin-regulated plant development. J. Exp. Bot. 66, 5043–5053 (2015).
Nakamura, M. et al. Characterization of the class IV homeodomain-leucine zipper gene family in Arabidopsis. Plant Physiol. 141, 1363–1375 (2006).
Chen, X. et al. Genome-wide analysis of soybean HD-Zip gene family and expression profiling under salinity and drought treatments. PLoS One 9, e87156 (2014).
Yue, H. et al. Genome-wide identification and expression analysis of the HD-zip gene family in wheat (Triticum aestivum L.). Genes 9, 70 (2018).
Qiu, X. et al. Genome-wide identification of HD-ZIP transcription factors in maize and their regulatory roles in promoting drought tolerance. Physiol. Mol. Biol. Plants 28, 425–437 (2022).
Zhang, C. et al. Genome-wide analysis of the homeodomain-leucine zipper (HD-ZIP) gene family in Peach (Prunus persica). Genet. Mol. Res. 13, 2654–2668 (2014).
Li, W. et al. Genome-wide identification and characterization of HD-ZIP genes in potato. Gene 697, 103–117 (2019).
Wei, M. et al. Genome-wide characterization and expression analysis of the HD-Zip gene family in response to drought and salinity stresses in Sesame. BMC Genom. 20, 1–13 (2019).
Agalou, A. et al. A genome-wide survey of HD-Zip genes in rice and analysis of drought-responsive family members. Plant Mol. Biol. 66, 87–103 (2008).
Debaeke, P., Casadebaig, P. & Langlade, N. B. New challenges for sunflower ideotyping in changing environments and more ecological cropping systems. OCL 28, 29 (2021).
Nezami, A., Khazaei, H. R., Boroumand, R. Z. & Hosseini, A. Effects of drought stress and defoliation on sunflower (Helianthus annuus) in controlled conditions. (2008).
Najafi, S., Sorkheh, K. & Nasernakhaei, F. Characterization of the APETALA2/Ethylene-responsive factor (AP2/ERF) transcription factor family in sunflower. Sci. Rep. 8, 11576 (2018).
Li, J. et al. Genome-wide characterization of WRKY gene family in Helianthus annuus L. and their expression profiles under biotic and abiotic stresses. PLoS One 15, e0241965 (2020).
Riccucci, E. et al. Genome-wide analysis of WOX multigene family in sunflower (Helianthus annuus L.). Int. J. Mol. Sci. 24, 3352 (2023).
Li, W., Zeng, Y., Yin, F., Wei, R. & Mao, X. Genome-wide identification and comprehensive analysis of the NAC transcription factor family in sunflower during salt and drought stress. Sci. Rep. 11, 19865 (2021).
Holub, E. B. The arms race is ancient history in Arabidopsis, the wildflower. Nat. Rev. Genet. 2, 516–527 (2001).
Hurst, L. D. The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet. 18, 486–487 (2002).
Liu, X. & Baird, W. V. Differential expression of genes regulated in response to drought or salinity stress in sunflower. Crop Sci. 43, 678–687 (2003).
Bowers, J. E., Chapman, B. A., Rong, J. & Paterson, A. H. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422, 433–438 (2003).
Cannon, S. B., Mitra, A., Baumgarten, A., Young, N. D. & May, G. The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol. 4, 1–21 (2004).
Liu, M. et al. Genome-wide analysis of the NAC transcription factor family in Tartary buckwheat (Fagopyrum tataricum). BMC Genom. 20, 1–16 (2023).
Söderman, E., Mattsson, J. & Engström, P. The Arabidopsis homeobox gene ATHB-7 is induced by water deficit and by abscisic acid. Plant J. 10, 375–381 (1996).
Tan, W. et al. Transcription factor HAT1 is a substrate of SnRK2.3 kinase and negatively regulates ABA synthesis and signaling in Arabidopsis responding to drought. PLoS Genet. 14, e1007336 (2018).
Yu, H. et al. Activated expression of an Arabidopsis HD-START protein confers drought tolerance with improved root system and reduced stomatal density. Plant Cell 20, 1134–1151 (2008).
Hapgood, J. P., Riedemann, J. & Scherer, S. D. Regulation of gene expression by GC-rich DNA cis-elements. Cell Biol. Int. 25, 17–31 (2001).
Sheshadri, S., Nishanth, M. & Simon, B. Stress-mediated cis-element transcription factor interactions interconnecting primary and specialized metabolism in planta. Front. Plant Sci. 7, 1725 (2016).
Niu, L. et al. The GATA gene family in chickpea: structure analysis and transcriptional responses to abscisic acid and dehydration treatments revealed potential genes involved in drought adaptation. J. Plant Growth Regul. 39, 1647–1660 (2020).
Manavella, P. A., Dezar, C. A., Ariel, F. D., Drincovich, M. F. & Chan, R. L. The sunflower HD-Zip transcription factor HAHB4 is up-regulated in darkness, reducing the transcription of photosynthesis-related genes. J. Exp. Bot. 59, 3143–3155 (2008).
Ebrahimian-Motlagh, S. et al. JUNGBRUNNEN1 confers drought tolerance downstream of the HD-Zip I transcription factor AtHB13. Front. Plant Sci. 8, 2118 (2017).
Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).
Potter, S. C. et al. HMMER web server: 2018 update. Nucleic Acids Res. 46, W200–W204 (2018).
Wang, J. et al. The conserved domain database in 2023. Nucleic Acids Res. 51, D384–D388 (2023).
Letunic, I., Khedkar, S. & Bork, P. SMART: recent updates, new developments and status in 2020. Nucleic Acids Res. 49, D458–D460. https://doi.org/10.1093/nar/gkaa937 (2020).
Gasteiger, E. et al. Protein Identification and Analysis Tools on the ExPASy Server (Springer, 2005).
Horton, P. et al. WoLF PSORT: protein localization predictor. Nucleic Acids Res. 35, W585–W587 (2007).
Thumuluri, V., Almagro Armenteros, J. J., Johansen, A. R., Nielsen, H. & Winther, O. DeepLoc 2.0: multi-label subcellular localization prediction using protein language models. Nucleic Acids Res. 50, W228–W234 (2022).
Tamura, K., Stecher, G. & Kumar, S. MEGA11: molecular evolutionary genetics analysis version 11. Mol. Biol. Evol. 38, 3022–3027 (2021).
Letunic, I. & Bork, P. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).
Hu, B. et al. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics 31, 1296–1297 (2015).
Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).
Chao, J. et al. MG2C: A user-friendly online tool for drawing genetic maps. Mol. Hortic. 1, 1–4 (2021).
Chen, C. et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol. Plant. 13, 1194–1202 (2020).
Lescot, M. et al. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res. 30, 325–327 (2002).
Santana-Garcia, W. et al. RSAT 2022: regulatory sequence analysis tools. Nucleic Acids Res. 50, W670–W676 (2022).
Rauluseviciute, I. et al. JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 52, D174–D182 (2024).
Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
Szklarczyk, D. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646 (2023).
Acknowledgements
The authors gratefully acknowledge Dr. Ahmad Yari Khosroushahi of Tabriz University of Medical Sciences for his invaluable assistance and technical support with the qRT-PCR experiments. Without his unwavering help and support, this research would not have been possible. We also thank the University of Tabriz, Iran, for its support during this research.
Author information
Contributions
M.N. conceived and designed the study and revised the manuscript; M.N. and Y.H.A.F. performed the experiments; Y.H.A.F. analyzed the data and wrote the manuscript; S.A.M. contributed to the interpretation of the results and provided the plant material. All authors read and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
The sunflower material used in this study complies with relevant institutional, national, and international guidelines and legislation. The plant material used is not a wild species, and permission for its use is not required.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ahangarani Farahani, Y.H., Norouzi, M. & Mohammadi, S.A. Genome-wide identification and expression analysis of HD-ZIP gene family in sunflower (Helianthus annuus L.) under water deficit stress. Sci Rep 15, 34544 (2025). https://doi.org/10.1038/s41598-025-17883-5
DOI: https://doi.org/10.1038/s41598-025-17883-5