Introduction

Rice (Oryza sativa L.) is one of the most vital food crops in the world, and an enhancement in its yield might ease the pressure on world food production. Stresses are the major environmental constraints that adversely affect rice productivity. Thus, improvement of stress tolerance might significantly increase rice yield. Deciphering the genetic regulatory network by which plants detect and communicate cellular signals to activate adaptive responses is a prime target in stress-responsive breeding strategies in crops. Transcription factors (TFs) are the key components in these gene regulatory networks; they interact with their target genes to control spatiotemporal gene expression and potential function. Rice is a suitable model in which several TFs have been intensively studied in functional and transcriptional genomics studies to dissect the regulatory cellular networks of stress responses in crops1,2,3,4.

TFs are critical regulators of gene expression, playing pivotal roles in plant growth, development, and stress responses. Among the various families of TFs, the myeloblastosis (MYB) family is one of the largest and well-studied TF families in plants5,6,7. In rice, MYB TFs have been extensively studied for their roles in regulating responses to abiotic stresses such as drought, salinity, cold, and heat. These TFs modulate the expression of stress-responsive genes, enabling plants to adapt to adverse environmental conditions8,9. Drought is one of the most significant abiotic stresses affecting rice productivity. MYB TFs play a crucial role in enhancing drought tolerance by regulating genes involved in water retention, osmotic adjustment, and antioxidant defense10. OsMYB1R1 negatively regulates drought tolerance, and its overexpression reduced drought resistance in rice11. OsMYB2 has been shown to improve drought tolerance by upregulating the expression of stress-responsive genes, leading to increased accumulation of compatible solutes like proline and enhanced antioxidant enzyme activities12. Overexpression of OsMYB48-1 enhances drought tolerance by regulating genes involved in the ABA signaling pathway, which is crucial for stress responses13.

Salinity stress affects rice growth by causing ion toxicity and osmotic stress. MYB TFs help mitigate these effects by regulating ion homeostasis and osmotic adjustment14. OsMYB91 enhances salt tolerance by modulating the expression of genes involved in ion homeostasis and osmotic adjustment15. Overexpression of OsMYB55 improves salt tolerance by regulating genes involved in reactive oxygen species (ROS) detoxification and maintaining cellular ion balance16. OsMYB2 plays a regulatory role in salt tolerance of rice12.

Cold stress can severely impact rice growth and yield. MYB TFs also contribute to cold tolerance by regulating the expression of cold-responsive genes17. The OsMYB3R-2 TF enhances cold tolerance by regulating the expression of genes involved in the biosynthesis of protective compounds and membrane stabilization18. Overexpression of OsMYB4 improves cold tolerance by upregulating the expression of genes involved in osmoprotectant and antioxidant biosynthesis19. Overexpression of the novel MYBS3 confers cold tolerance with no yield penalty in rice20. Overexpression of OsMYB30 increased cold sensitivity, whereas the osmyb30 knockout mutant showed increased cold tolerance, indicating negative regulation of cold tolerance by this MYB gene in connection with the interacting protein OsJAZ921. OsMYB2 encodes a MYB transcription factor that plays a regulatory role in tolerance of salt, cold, and dehydration stress in rice12.

Heat stress can lead to protein denaturation and membrane damage. MYB TFs help mitigate these effects by regulating the expression of heat shock proteins (HSPs) and other heat-responsive genes5. In addition to the role in salt stress, OsMYB55 also contributes to heat stress tolerance by regulating the expression of HSPs and other heat-responsive genes16. Overexpression of OsMYB55 resulted in improved growth under high temperature stress and suppressed the negative effect of heat stress on grain yield22.

MYB TFs are integral to the rice plant’s ability to withstand and adapt to various biotic stresses. They achieve this by regulating a wide array of stress-responsive genes and pathways, thereby enhancing the plant’s resilience to adverse conditions23. Understanding the roles of these TFs can aid in the development of stress-tolerant rice varieties through genetic engineering and breeding approaches. OsMYB4 and OsMYB30 enhance resistance to fungal pathogens (e.g., Magnaporthe oryzae) by promoting lignin deposition and pathogenesis-related (PR) gene expression24,25. Jasmonic acid (JA) is a key defense hormone that is essential for plants to resist chewing insects and necrotrophic pathogens. Studies have shown that the expression of OsJAMyb is induced by JA, and its overexpression in rice showed stronger resistance to rice blast after inoculation26. This indicates that OsJAMyb plays an important role in rice resistance to rice blast. Methyl Jasmonate (MeJA) can induce the expression of the PR protein OsPR1A, which enhances the resistance of rice to bacterial blight caused by Xanthomonas oryzae. MYB TFs in rice act as master regulators of stress adaptation, integrating hormonal, metabolic, and transcriptional networks to combat diverse environmental challenges. Their dual role in abiotic and biotic stress responses highlights their potential as targets for breeding or engineering climate-resilient rice varieties.

Researchers have been working to develop plants that are resistant or tolerant to abiotic and biotic stresses. However, finding new key MYB TFs is difficult and time-consuming, and resolving the architecture of molecular networks requires assembling information on TFs, their genomic targets, and the corresponding binding sites. Web resources in bioinformatics help to integrate information on TFs27. To make this task easier for breeders, 183 MYB TFs (123 MYB genes and 60 MYB-related genes) were selected in this study, and 12 key genes (KGs) were sorted out using five CytoHubba algorithms: degree, closeness, bottleneck, MCC (maximal clique centrality), and MNC (maximum neighborhood component). To investigate protein-protein interactions (PPIs) and molecular mechanisms in abiotic and biotic stresses, multiple researchers have proposed various sets of KGs28,29,30,31. The stress-associated KGs are not always consistent across the published literature. Moreover, none of these studies have inspected the mechanisms that act against other key gene-mediated target proteins. More KGs are therefore needed to identify additional genes with potential tolerance to abiotic and biotic stresses. KG analysis can lead to the identification of key regulators and pathways involved in stress reactions, which can inform the development of abiotic and/or biotic stress-tolerant rice varieties through targeted breeding or genetic engineering.

Materials and methods

Collection of MYB transcription factors (TFs)

MYB and MYB-related genes in rice are important due to their regulatory roles in plant responses to both abiotic and biotic stresses. In this study, MYB TFs in rice were retrieved from literature reviews and web tools, and a total of 183 important TFs (123 MYB genes and 60 MYB-related genes) were selected from the Grass Regulatory Information Server (GRASSIUS) web tool (https://grassius.org/species/Rice) (Table S1). The protein-coding sequence of MYB TFs was adopted from the RiceData (https://ricedata.cn/gene/) as well as GRASSIUS.

Protein-protein interactions (PPIs) network analysis and sorting out the KGs

The PPI network of the MYBs was constructed using STRING-v11.5 (https://string-db.org/) with a confidence score of > 0.62 and a maximum number of interactors of 032. Cytoscape-v3.9.1 was used to visualize and analyze the PPI network33,34. CytoHubba, a Cytoscape plugin, identifies key genes by applying topological algorithms to a PPI network to find its most central and influential nodes. Five CytoHubba algorithms, namely degree35, bottleneck36, closeness37, MCC, and MNC38, were applied, and 12 KGs were finally sorted out for further detailed analysis in this study. These algorithms identify central genes in the PPI network, which are likely to play crucial roles in regulating abiotic and biotic stress responses by acting as key regulators or hubs in stress-related signaling and metabolic pathways.
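
The hub-selection idea described above can be illustrated with a minimal sketch. It is not the CytoHubba implementation: it assumes the commonly cited definitions of degree, closeness (sum of reciprocal shortest-path distances), and MNC (size of the largest connected component among a node's neighbors), omits bottleneck and MCC for brevity, and uses a hypothetical toy edge list with placeholder node names rather than STRING output. The function names are illustrative only.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph, node):
    """Number of direct interaction partners."""
    return len(graph[node])


def closeness(graph, node):
    """Sum of reciprocal shortest-path distances to all reachable nodes."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in graph[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return sum(1.0 / d for d in dist.values() if d > 0)


def mnc(graph, node):
    """Size of the largest connected component among the node's neighbors."""
    neighbors = graph[node]
    seen, best = set(), 0
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            cur = stack.pop()
            size += 1
            # Only follow edges that stay inside the neighborhood.
            for nxt in graph[cur] & neighbors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def top_k(graph, score, k):
    """Top-k nodes by score; ties are broken alphabetically."""
    ranked = sorted(graph, key=lambda n: (-score(graph, n), n))
    return set(ranked[:k])


def consensus_hubs(graph, k):
    """Nodes that rank in the top k under every measure."""
    measures = (degree, closeness, mnc)
    return set.intersection(*(top_k(graph, m, k) for m in measures))


# Hypothetical toy network (placeholder names, not real MYB proteins).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
graph = build_graph(edges)
hubs = consensus_hubs(graph, k=1)
```

Taking the intersection of the per-measure top-k sets is only one possible way to combine rankings; the source does not state how the five rankings were merged, so that step is an assumption.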

Basic characteristics of KGs

To determine the basic and molecular features of the 12 selected KGs, RiceData and the ExPASy ProtParam online tool (https://www.expasy.org/protparam/) were used to identify the chromosome location, chromosome strand, genomic and coding sequence length, protein length, molecular weight, and isoelectric point39. Predicted subcellular localizations were also determined using the WoLF PSORT (http://psort.hgc.jp/) online server.

Gene structure analysis of KGs

A phylogenetic tree was constructed from the full-length protein sequences of the KGs using MEGA-v11.0 with the neighbor-joining method and 1000 bootstrap replicates, and the tree was then visualized with iTOL-v6 (https://itol.embl.de/)40. Important functional domain sites were analyzed using the conserved domains database (CDD) and the PROSITE (https://prosite.expasy.org/) server41. The bioinformatics tools TBtools-II and MEME were used to identify the common conserved motifs in the 12 KGs42. The CDS and intron structure, along with the conserved protein domains of the KGs, were also determined with the gene structure display server (GSDS-v2) (http://gsds.gao-lab.org/) from protein, CDS, and genomic sequences to better understand their structural diversity43.

Phytohormonal cis-acting element analysis

To identify the cis-elements in the promoter sequence of the rice KGs, the upstream sequences (2000 bp) of 12 KGs were extracted from the rice genome, and were uploaded to the PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/plant) to identify cis-elements44.
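
Extracting a 2000 bp upstream region depends on the strand of each gene, which matters here because the KGs lie on both strands. The following is a minimal sketch under stated assumptions: coordinates are 1-based and inclusive on the forward strand, the chromosome is available as a plain string, and the function names and toy sequence are hypothetical. The source does not describe how the promoter sequences were actually extracted.

```python
# Translation table for complementing DNA (upper and lower case, N kept as N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, gene_start, gene_end, strand, length=2000):
    """Return up to `length` bp upstream of a gene, read 5'->3' on its strand.

    gene_start and gene_end are assumed to be 1-based inclusive coordinates
    on the forward strand. The region is clipped at the chromosome ends.
    """
    if strand == "+":
        lo = max(0, gene_start - 1 - length)
        return chrom_seq[lo:gene_start - 1]
    if strand == "-":
        # For a minus-strand gene, "upstream" lies beyond gene_end on the
        # forward strand and must be reverse-complemented.
        hi = min(len(chrom_seq), gene_end + length)
        return reverse_complement(chrom_seq[gene_end:hi])
    raise ValueError("strand must be '+' or '-'")


# Hypothetical toy chromosome, not rice genome sequence.
chrom = "AAAACCCCGGGGTTTT"
plus_promoter = upstream_region(chrom, 9, 12, "+", length=4)
minus_promoter = upstream_region(chrom, 5, 8, "-", length=6)
```

Clipping at the chromosome boundary means a gene close to a chromosome end yields a promoter shorter than the requested length rather than an error.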

Syntenic pair analysis

The protein-coding sequences of rice, maize, sorghum, sugarcane, and brachypodium were collected from the GRASSIUS website. The syntenic pair was determined through the Circoletto tool (http://tools.bat.infspire.org/circoletto/) by using FASTA protein sequences45.

Protein structure analysis of KGs

The three-dimensional (3-D) structure of proteins with coverage and identity analysis was also accomplished by using SWISS-MODEL (https://swissmodel.expasy.org/interactive), a homology-based modeling software46. To evaluate the stereochemical quality by measuring the residues of the Ramachandran plot of the protein model, PROCHECK methods were used47.

Expressolog tree analysis of the KGs

Expressolog tree analysis of the KGs was obtained from the Bio-Analytic Resource for Plant Biology (BARPB) (https://bar.utoronto.ca/) database. $\log_2[(1 + \mathrm{FPKM}_{\mathrm{treatment}})/(1 + \mathrm{FPKM}_{\mathrm{control}})]$ was calculated using FPKM (Fragments Per Kilobase of transcript per Million mapped reads) to indicate the fold change in the gene expression level.

Gene ontology enrichment analysis of the KGs

To better understand the mechanism and progression of the substantially differentially expressed KGs, we conducted GO enrichment analyses of the 12 genes. The GO pathways were prepared using STRING-v11.5 (https://string-db.org/) online tool32.

KGs expression analysis

Expression patterns of the 12 KGs in rice responses to abiotic and biotic stresses were analyzed. The gene expression data were collected from the RNA-seq database of rice (http://ipf.sustech.edu.cn/pub/plantrna/)48. $\log_2[(1 + \mathrm{FPKM}_{\mathrm{treatment}})/(1 + \mathrm{FPKM}_{\mathrm{control}})]$ was calculated from the RNA sequencing FPKM values to indicate the fold change in the gene expression level.
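
The fold-change measure above can be written as a short function. This is a minimal sketch that assumes the pseudocount of 1 is added to both the treatment and the control FPKM before taking the ratio; the function name and the example values are hypothetical.

```python
import math


def log2_fold_change(fpkm_treatment, fpkm_control, pseudocount=1.0):
    """log2 of (treatment + pseudocount) / (control + pseudocount).

    The pseudocount keeps the ratio defined when either FPKM is zero.
    """
    if fpkm_treatment < 0 or fpkm_control < 0:
        raise ValueError("FPKM values must be non-negative")
    return math.log2((fpkm_treatment + pseudocount) / (fpkm_control + pseudocount))


# Hypothetical FPKM values for illustration only.
induced = log2_fold_change(3.0, 1.0)
repressed = log2_fold_change(0.0, 3.0)
unchanged = log2_fold_change(0.0, 0.0)
```

A positive value indicates higher expression under treatment, a negative value lower expression, and zero no change.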

Plant treatment for gene expression analysis

The rice cultivar Taichung 65 (T65) (Oryza sativa L.) was used for heat and salt treatment for relative gene expression analysis by qRT-PCR. Germinated rice seedlings were grown in an artificial climate incubator (HP 1500 GS-B) with a 14 h light/10 h dark photoperiod and cultured in a nutrient solution for 7 days. For heat treatment, half of the 7-day-old rice seedlings were exposed to a high temperature (44 °C), while the other half were kept at a normal temperature (26 °C) as the control treatment. For salt treatment, the 7-day-old seedlings were treated with 120 mM NaCl, and no treatment was used as a control. After 0.5, 3.0, and 12 h of treatment, the second leaves and roots from the heat- and salt-treated plants, as well as control plants, were collected, rapidly frozen in liquid nitrogen, and stored at -80 °C for subsequent RNA extraction. Three independent biological replicates were used for each set of samples.

RNA extraction and relative gene expression analysis

Total RNA from the root and leaf samples was extracted using an RNAprep Pure Plant Plus Kit (Tiangen Biotechnology Co., Ltd., Beijing, China) following the manufacturer’s instructions, and the RNA concentration was determined using a spectrophotometer (Meixi, China). First-strand cDNA synthesized by reverse transcription was used for subsequent qRT-PCR analysis. For cDNA synthesis, 1.5 µg of total RNA was used with the PrimeScript™ II 1st Strand cDNA Synthesis Kit (Tiangen Biotechnology Co., Ltd., Beijing, China). The specific primers used for qRT-PCR analysis were designed with Primer Premier 5.0 software, synthesized by Sangon Bioengineering (Shanghai, China) Co., Ltd, and are listed in Table S2. qRT-PCR was performed with a 2× HSYBR qPCR mix (Zoman Biotechnology Co., Ltd., Beijing, China) on a LightCycler 480 Real-Time System (Roche, Basel, Switzerland) following the previously described procedure49. The relative expression level of target genes was determined using the comparative threshold cycle (ΔΔCT) approach with OsActin as the internal control and three biological replicates, following Zou et al.50.
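
The comparative ΔΔCT calculation is commonly expressed as 2^(-ΔΔCT). The sketch below assumes that form and roughly 100% amplification efficiency for both the target gene and the reference gene. The function name and the CT values are hypothetical, and the exact procedure of the cited protocol may differ.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the comparative CT method, 2 ** (-ddCT).

    Assumes about 100% amplification efficiency for both genes.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: the target amplifies two cycles earlier under
# treatment while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
```

With three biological replicates, one option is to compute the fold change per replicate and then report the mean and standard deviation.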

Data analysis

The data significance was assessed using one-way ANOVA, followed by Duncan’s multiple range test for post-hoc analysis. The analysis was performed using IBM SPSS Statistics 26 software, and a p-value of less than 0.05 was considered statistically significant.
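
For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed directly from the group values. This is a minimal sketch of the statistic only: it does not compute the p-value or perform Duncan's multiple range test, both of which were done in SPSS according to the text, and the function name and the example data are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        # Variation of the group mean around the grand mean.
        ss_between += len(g) * (mean - grand_mean) ** 2
        # Variation of the values around their own group mean.
        ss_within += sum((x - mean) ** 2 for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical expression values at three time points, three replicates each.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The F value is then compared with the F distribution for the two degrees of freedom to obtain the p-value that is tested against the 0.05 threshold.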

Results

KGs identification from meta data

The PPI network of the 183 MYB TFs was created using the STRING-v11.5 database, and the Cytoscape-v3.9.1 software was used to analyze the PPI network (Fig. S1). To find the KGs, the topological CytoHubba methods degree, bottleneck, closeness, MCC, and MNC were applied, and 12 KGs were sorted out: OsMYB91, OsMYB103, OsMYB124, OsMYBR5, OsMYBR11, OsMYBR17, OsMYBR21, OsMYBR51, OsMYBR62, OsMYBR63, OsMYBR67, and OsMYBR72 (Fig. 1). Their detailed information is summarized in Table 1.

Fig. 1

Top 12 KGs in rice. The PPIs network was created using the STRING-v11.5 database, and the Cytoscape-v3.9.1 software was used to figure out the KGs. Five CytoHubba computational methods, including degree, bottleneck, closeness, MCC, and MNC, were analyzed to identify the KGs. KGs; key genes, MCC; maximal clique centrality, MNC; maximum neighborhood component, PPIs; protein-protein interactions.

Table 1 Features of the 12 KGs selected by CytoHubba computational methods.

Basic features of the KGs

Chromosome localization analysis of the rice KGs was performed to understand their distribution across the chromosomes. The results demonstrated that the 12 KGs were dispersed as follows: three genes on chromosome (chr.) 1 (OsMYBR11, OsMYBR5, and OsMYBR17), one on chr. 2 (OsMYBR21), three on chr. 6 (OsMYBR51, OsMYBR62, and OsMYBR63), one on chr. 7 (OsMYB91), one on chr. 8 (OsMYBR67), two on chr. 9 (OsMYB103 and OsMYBR72), and one on chr. 12 (OsMYB124) (Fig. 2A). The physicochemical analysis indicated that eight KGs are on the negative (-) strand and four KGs are on the positive (+) strand (Fig. 2B). The length of the genomic sequences ranged from 1393 to 6010 base pairs (Fig. 2C), the coding sequences ranged from 522 to 1944 base pairs (Fig. 2D), and the number of amino acids varied from 174 to 648 (Fig. 2E). The relative molecular weight ranged between 19.32 and 66.94 kDa (Fig. 2F), and the isoelectric point was between 4.98 and 10.38 (Fig. 2G).

Fig. 2

Basic traits analysis of the KGs. A; chromosomal location of the 12 KGs on chromosomes 1–12 in rice. B; chromosome strand, C; genome sequence length, D; CDS length, E; protein length in amino acids (aa), F; molecular weight in kilodaltons (kDa), G; isoelectric point. CDS; coding sequence.

Gene structure analysis of the KGs

The 12 KGs were divided into three subfamilies (Clade I to Clade III) through phylogenetic relationship analysis to show the evolutionary links among them. The largest group was Clade III with five KGs, whereas Clade I and Clade II had three and four KGs, respectively (Fig. 3A). Conserved domain analysis showed that a total of 15 functional domain architecture sites were present; OsMYB103, OsMYB124, OsMYBR17, and OsMYB91 have two functional sites, and the other eight KGs have a single functional site (Fig. 3B). The conserved motifs of the 12 KGs were identified, and 65 conserved motifs were found, with 1–10 conserved motifs per KG (Fig. 3C). The most motifs were found in OsMYBR72 and OsMYB124, and the fewest in OsMYBR17 and OsMYB103 (Fig. 3C). The width of the 10 conserved motifs varied from 13 to 49 amino acid residues. Gene structure analysis with the gene structure display server showed that most genes contained more than one intron, whereas OsMYB103 has no intron and OsMYBR17 has a single intron (Fig. 3D). Gene structure analysis can provide insight into the evolutionary relationships among the 12 KGs.

Fig. 3

Gene structure analysis of the 12 KGs. A; phylogenetic relationship analysis of KGs within rice. The subfamilies of KGs are highlighted with different colored backgrounds. B; domain architecture analysis of KGs in rice. Various color codes represent the main function, and the conserved cellular functions of KGs are listed on the right side of the figure. The scale at the bottom indicates the total amino acids (protein length). C; conserved motif distributions of KGs in rice. Each gene displays a number that matches the motif numbers, and the motif lengths for KGs are displayed proportionally. D; gene structure display server (GSDS) analysis of KGs in rice. The yellow boxes indicate the CDS, the gray lines denote the introns, and the light green boxes indicate the exon number. The scale provided at the bottom can be used to assess the size of the introns.

Phytohormonal cis-acting element analysis

We analyzed the promoters of all KGs and identified 88 cis-elements in total among the 12 KGs, ranging from two to fifteen per gene (Fig. 4). The KGs with abscisic acid-responsive elements are OsMYB103, OsMYB124, OsMYBR11, OsMYBR17, OsMYBR21, OsMYBR51, OsMYBR62, OsMYBR63, OsMYBR67, and OsMYBR72. The KGs with methyl jasmonate-responsive elements are OsMYB124, OsMYBR5, OsMYBR11, OsMYBR17, OsMYBR51, OsMYBR62, OsMYBR63, OsMYBR67, and OsMYBR72. The KGs with auxin-responsive elements are OsMYB91, OsMYB124, OsMYBR5, OsMYBR11, OsMYBR17, OsMYBR62, OsMYBR63, and OsMYBR67. The KGs with gibberellic acid-responsive elements are OsMYB91, OsMYB124, OsMYBR5, OsMYBR11, OsMYBR21, OsMYBR62, OsMYBR63, and OsMYBR67. The KGs with salicylic acid-responsive elements are OsMYB91, OsMYBR5, OsMYBR21, OsMYBR62, and OsMYBR67.

Fig. 4

Phytohormonal cis-acting element analysis of the KGs. Differently colored boxes indicate the different types of phytohormonal cis-acting elements: abscisic acid-responsive elements, methyl jasmonate-responsive elements, auxin-responsive elements, gibberellic acid-responsive elements, and salicylic acid-responsive elements. The scale provided at the bottom can be used to assess the size of the upstream promoter sequences (2000 bp).

Syntenic pair analysis

To explore the evolutionary patterns of the KGs, syntenic relationship analysis was performed between the 12 rice KGs and MYB family genes from four other plant species. A total of 55 collinear gene pairs were identified for maize (Zea mays), 45 for sorghum (Sorghum bicolor), 45 for sugarcane (Saccharum officinarum), and 61 for brachypodium (Brachypodium distachyon) (Fig. 5A-D). For OsMYB124, the largest number of significant alignments was obtained with maize (41 hits), followed by brachypodium (34 hits), sorghum (31 hits), and sugarcane (25 hits). The KGs OsMYBR67 and OsMYBR72 have no syntenic pair with sugarcane and maize. In addition, the genes had more than one syntenic pair, which points to the potential of these genes during evolution.

Fig. 5

Syntenic pair analysis of the KGs. A; syntenic pairs between rice and maize MYB family TFs. B; syntenic pairs between rice and sorghum MYB family TFs. C; syntenic pairs between rice and sugarcane MYB family TFs. D; syntenic pairs between rice and brachypodium MYB family TFs. Bd, Brachypodium distachyon; Sb, Sorghum bicolor; So, Saccharum officinarum; TFs, transcription factors; Zm, Zea mays.

Protein structure analysis of the KGs

Protein 3-D structure provides evidence regarding protein functional properties and molecular orientation. The 3-D structure analysis of the 12 KGs shows that all KGs have a monomer structure (Fig. 6A). Model evaluation with GMQE (global model quality estimate) values indicated good model quality, and the OsMYBR17 protein has the highest GMQE value (0.77) among the 12 KGs. Protein coverage analysis showed that nine KGs had 100% coverage of the target sequences; the exceptions were OsMYB124 (99%), OsMYBR62 (96%), and OsMYB91 (92%) (Table 2). The stereochemical quality and accuracy of the KG models were assessed with PROCHECK using Ramachandran plots (Fig. 6B). Most amino acids are located in allowed regions; the proportion of Ramachandran-favored residues was highest for OsMYBR17 (84.21%) and lowest for OsMYBR5 (61.49%) (Table 2). To evaluate and validate the protein models, the QMEAN Z-scores of the 12 KGs were analyzed. The QMEAN Z-scores were -0.97, -1.11, -3.30, -4.47, -1.74, -2.77, -0.46, -2.17, -1.42, -0.21, -1.74, and -2.77 for OsMYBR63, OsMYBR67, OsMYBR72, OsMYB124, OsMYBR21, OsMYBR11, OsMYB103, OsMYBR5, OsMYBR17, OsMYB91, OsMYBR51, and OsMYBR62, respectively (Fig. 6C). The Z-scores related to the interaction potential between Cβ atoms only provide information about each amino acid in a protein. Z-score analysis revealed that 11 KGs have Z-scores of less than 2 and one KG has a Z-score of less than 1 (Fig. 6D). A higher Z-score indicates that the native fold is significantly more stable than the misfolded structures, suggesting that the protein is likely to fold correctly under stress conditions. In protein structure modeling, the Z-score is used to compare the quality of predicted models against experimental structures. A Z-score greater than zero typically indicates that the model is of high quality and close to the average high-resolution crystal structure.

Fig. 6

Protein 3-D structure and protein model validation of the 12 KGs. A; 3-D structural view of KGs. All proteins have a monomer structure. B; for the superiority and accuracy of coordinates for stereochemical characteristics, a Ramachandran plot was used by the PROCHECK server. The green area amino acid region represents the most favored residue, the pea green represents the additional residue, dark gray represents the generously allowed, and white represents the disallowed. The N-Cα bond is known as phi, and the Cα-C bond is psi. C; structural assessment of KGs using QMEAN Z-scores, where QMEAN symbolizes qualitative model energy analysis. The residues’ standardized Cβ (C-beta) possible energy was determined by using the short-range statistical possibilities. D; Z-score of the 3-D structure of KGs. The degree to which template proteins and proteins to be tested match can be represented using the Z-scores, and for evaluating the quality of a model, QMEAN is a complete scoring function. 3-D, three-dimensional.

Table 2 Protein structural GMQE value, coverage, identity, and Ramachandran plot analysis of molprobity score, favored residues, clash score, and outlier regions of the KGs.

Expressolog tree analysis of the KGs

Expressolog tree analysis is a method used to analyze the similarity of homologous gene expression patterns across species. It identifies homologous genes that are more functionally related by combining sequence similarity and expression pattern similarity. OsMYB124 showed the most expressologs and OsMYBR63 the fewest expressologs with other plant species (Fig. 7). OsMYB91, OsMYB103, OsMYBR17, OsMYBR51, and OsMYBR72 have no expressologs. OsMYB124 and OsMYBR121 revealed stress-related expressologs, and the others are related to development in Arabidopsis. By analyzing the expression patterns of homologous genes in different species, it is possible to study changes in gene function during evolution. Expressolog tree analysis is a powerful tool that combines sequence and expression data to identify more functionally related homologs and is widely used in plant genealogy and evolutionary biology studies.

Fig. 7

Expressolog tree analysis of the KGs. The yellow background highlight indicates the highest expression profile similarity in respective species (expressologs). The blue color represents the development-related relationships, and the pink color represents the stress-related relationships.

GO enrichment analysis of the KGs

The GO functional enrichment analysis revealed that the KGs are significantly enriched in a number of biological processes (BPs), molecular functions (MFs), and cellular components (CCs) (Table S3). The most significantly enriched GO-BP terms are GO:0010556 (regulation of macromolecule biosynthetic process), GO:0031326 (regulation of cellular biosynthetic process), GO:0010468 (regulation of gene expression), GO:0080090 (regulation of primary metabolic process), GO:0051171 (regulation of nitrogen compound metabolic process), and GO:0050794 (regulation of cellular process) (Fig. 8A). Similarly, the enriched GO-MF terms are GO:0003700 (DNA-binding transcription factor activity), GO:0003677 (DNA binding), GO:0043565 (sequence-specific DNA binding), GO:0000976 (transcription cis-regulatory region binding), and GO:0005488 (binding activity) (Fig. 8B). The only enriched GO-CC term is GO:0005634 (nucleus) (Fig. 8C). The most important enriched GO terms (GO-BP, MF, and CC), with their count in network, strength, signal, and false discovery rate, are documented in Table S4.

Fig. 8

GO enrichment analysis of the KGs. Different colors specify different numbers of KGs for the three categories of enriched terms (A; BP, B; MF, and C; CC). The ordinate reflects the GO enrichment score, which is represented on the right side; the bottom axis shows the signal, and the right side also lists the enriched pathways. For significant data, maximum FDR ≤ 0.05, minimum signal ≥ 0.01, and minimum strength ≥ 0.01.

Analysis of KGs expression responses in abiotic stresses

In rice, the MYB gene family plays an important role in response to abiotic stresses, such as salt stress, cold stress, drought stress, and high temperature stress. The combined stress (heat, salt, drought, and cold) analysis showed that all KGs were significantly differentially expressed (Fig. 9A). Among the 12 KGs, the FPKM (Fragments Per Kilobase of transcript per Million mapped reads) values were highest for OsMYBR67, OsMYB124, OsMYBR72, OsMYBR63, OsMYBR17, and OsMYBR11 (Fig. 9B). The KGs most responsive to each abiotic stress are OsMYBR72, OsMYBR67, OsMYBR62, OsMYBR51, OsMYBR17, and OsMYBR11 for cold stress; OsMYBR63, OsMYBR21, and OsMYBR5 for drought stress; OsMYB124 and OsMYB91 for salt stress; and OsMYB103 for heat stress. These results suggest that the 12 KGs are stress-induced and likely involved in rice responses to abiotic stresses.

Fig. 9

Analysis of KGs expression responses in abiotic stresses. A; gene expression under abiotic stresses. B; FPKM distribution in abiotic stresses. The FPKM (Fragments Per Kilobase of transcript per Million mapped reads) values quantify the expression levels of genes in RNA sequencing data, normalizing for both the length of the gene and the total number of reads in the experiment.

Analysis of KGs expression responses in biotic stresses

In rice, the MYB gene family plays an important role in response to biotic stresses. All of the KGs were significantly expressed under different biotic stresses, including infection or infestation by Rhizoctonia solani, Magnaporthe oryzae, Xanthomonas oryzae pv. oryzae, Rice stripe virus, and brown planthopper (Fig. 10). Among the 12 KGs, the FPKM (Fragments Per Kilobase of transcript per Million mapped reads) values were highest for OsMYB124, OsMYBR5, OsMYBR21, and OsMYBR17. These results suggest that the 12 KGs are stress-induced and likely involved in rice responses to biotic stresses.

Fig. 10

Analysis of KGs expression responses in biotic stresses. The FPKM (Fragments Per Kilobase of transcript per Million mapped reads) values quantify the expression levels of genes in RNA sequencing data, normalizing for both the length of the gene and the total number of reads in the experiment.

Relative expression of the KGs

The relative expression levels of the 12 key genes in the T65 rice cultivar were assessed by qRT-PCR following heat treatment, using the comparative threshold cycle (ΔΔCT) method with OsActin as the internal control. Sampling was performed at three time points (0.5 h, 3 h, and 12 h) to capture both early and late responses to stress. The results showed a significant upregulation of the 12 KGs, with higher induction observed at shorter heat treatment durations. Under heat stress, the 12 KGs exhibited distinct temporal expression patterns across the three time points (Fig. 11). Several genes (OsMYBR17, OsMYBR63, OsMYBR62, and OsMYBR11) showed strong induction at the earliest time point (0.5 h), followed by a marked decrease at later times. Other genes (OsMYB91, OsMYB103, OsMYB124, OsMYBR67, and OsMYBR72) displayed gradual increases, peaking at intermediate or later time points. A few genes (OsMYBR21 and OsMYBR51) exhibited minimal or modest changes throughout the time course. These findings indicate that the 12 KGs are heat-induced and may be responsive to heat stress in rice, particularly during the early phase of exposure.

Similarly, the relative expression of the 12 KGs in rice roots after salt treatment was evaluated using qRT-PCR. Sampling at 0.5 h, 3 h, and 12 h revealed significant induction of the target genes, with the strongest responses occurring at the shortest treatment duration. Under salt stress, the 12 KGs exhibited differential temporal expression across the three time points (Fig. 12). Several genes (OsMYBR5, OsMYBR11, OsMYBR17, and OsMYBR21) showed strong early induction, which generally declined at later time points. Other genes (OsMYB103, OsMYBR67, OsMYBR72, and OsMYBR62) displayed moderate and sustained increases throughout the time course. A subset of genes (OsMYB91, OsMYB124, OsMYBR51, and OsMYBR63) exhibited minimal or modest changes in expression. These results indicate that the 12 KGs are also salt-responsive, with significant upregulation occurring in response to salt stress. OsMYBR5, OsMYBR11, OsMYBR17, and OsMYBR21 showed particularly strong expression at 0.5 h, highlighting their potential roles in salt tolerance in rice.

Fig. 11

Relative expression levels of the 12 rice KGs under heat treatment. Expression of the 12 KGs in leaves after 0.5 h, 3 h, and 12 h at 44 °C. Gene expression was normalized to OsActin. Data are presented as the mean ± SD from three biological replicates. Different lowercase letters denote statistically significant differences at p < 0.05.

Fig. 12

Relative expression levels of the 12 rice KGs under salt treatment. Expression of the 12 KGs in roots after 0.5 h, 3 h, and 12 h of salt treatment. Gene expression was normalized to OsActin. Data are presented as the mean ± SD from three biological replicates. Different lowercase letters denote statistically significant differences at p < 0.05.

Discussion

MYB TFs are involved in plant growth and development, hormone response, cell differentiation, secondary metabolism, and adaptation to abiotic and biotic stresses51,52, and they play vital roles in protecting rice against these stresses. Although several MYBs have been described in rice, the KGs involved in the rice response to both abiotic and biotic stresses remain elusive. In this investigation, we selected 183 TFs and constructed a PPI network of these TFs using the STRING-v11.5 database, in which known and predicted protein interactions are listed32. Direct and indirect connections between proteins/genes are captured by the PPI network53, and the Cytoscape-v3.9.1 software was used to explore these connections. Cytoscape is an open-source tool for data integration and visualization of molecular interaction networks33,54,55. A PPI network describes the relationships between two or more proteins and is essential for understanding cellular processes56. To identify the KGs, five CytoHubba algorithms (degree, closeness, bottleneck, MNC, and MCC) were applied, and 12 KGs were selected (Table 1; Fig. 1). KGs are highly connected within gene interaction networks and play a crucial role in biological networks34. CytoHubba algorithms have previously been used to detect six and 15 KGs in rice57, 12 KGs in maize29, 30 KGs in Arabidopsis58, 10 KGs in mustard28, and 11 KGs in cotton31.
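The idea of ranking network nodes by several topological measures and keeping those that score highly in all of them can be illustrated with a small sketch. It is not the CytoHubba implementation: it covers only two of the five measures named above (degree and the classic closeness centrality, whose definition may differ from CytoHubba's), the consensus rule (intersection of the top-k lists) is an assumption, and the toy network with nodes A to F is hypothetical.

```python
from collections import deque


def degree_scores(graph):
    """Degree: number of direct interaction partners of each node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness_scores(graph):
    """Classic closeness: reachable nodes / sum of shortest-path lengths."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search on the unweighted graph
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """Names of the k highest-scoring nodes (ties broken by name)."""
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return set(ranked[:k])


def consensus_hubs(graph, k):
    """Nodes that appear in the top-k list of every measure used here."""
    return top_k(degree_scores(graph), k) & top_k(closeness_scores(graph), k)


# Hypothetical undirected PPI network stored as an adjacency mapping.
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "E"},
    "D": {"A"},
    "E": {"C", "F"},
    "F": {"E"},
}
print(sorted(consensus_hubs(toy_network, k=2)))
```

Requiring agreement between measures is one conservative way to reduce the influence of any single ranking; other combination rules (for example, a union or a rank sum) would give different candidate sets.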

The 12 KGs were unevenly distributed across the chromosomes, and their distribution was not directly related to chromosome length or size (Fig. 2A). All chromosomes showed good association with one another, which suggests that there are no genome assembly problems. We also analyzed the evolutionary relationships among the KGs and divided them into three subfamilies (Fig. 3A), suggesting that the KGs have been highly conserved during plant evolution. Conserved domain architectures help reveal functional significance; the results showed that the 12 KGs are involved in DNA binding (Fig. 3B), which provides a roadmap for understanding protein function during evolution. In rice, MYB TFs function as key DNA-binding regulators that modulate the expression of stress-responsive genes, enhancing tolerance to both abiotic and biotic stresses through interaction with hormonal and transcriptional networks59. Determining the conserved domain groups is crucial for understanding evolutionary trends and the functional conservation of genes60,61. A total of 65 conserved motifs were found in the 12 KGs (Fig. 3C). KGs in the same group had similar motif arrangement patterns, indicating that KGs within the same group have retained similar functions during evolution62,63. The motifs ranged from 13 to 49 amino acid residues in length, which is close to the 29 to 50 residues reported for the predicted motifs of NHX1 genes64. Additionally, we analyzed the gene structure of the 12 KGs, and the results revealed that members of the same group had similar structural patterns (Fig. 3D). Previous reports indicate that genes within the same subfamily have more similar gene structures, whereas gene structures vary considerably between subgroups, and that gene structure is a crucial determinant of evolutionary relationships65,66,67.

To shed light on the mechanisms regulating stress tolerance, it is essential to understand the relationship between MYBs and phytohormones and their roles in plant stress responses68,69,70,71,72,73. Notably, multiple KGs, such as OsMYB124, OsMYBR11, OsMYBR17, OsMYBR62, OsMYBR63, OsMYBR67, and OsMYBR72, harbor cis-elements responsive to ABA, MeJA, and auxin (Aux), suggesting their involvement in cross-talk between different hormone signaling pathways. For instance, OsMYB124, which carries elements from all five hormone categories, appears to be a central regulator potentially integrating ABA, MeJA, Aux, GA, and SA signaling. Such multi-hormonal responsiveness may position it as a key transcriptional hub in stress adaptation. Similarly, the presence of auxin- and GA-responsive elements in OsMYB91 suggests possible roles in growth regulation under fluctuating environmental conditions74. Interestingly, OsMYBR62 and OsMYBR67 also contain all five types of hormone-responsive elements, which could indicate functional redundancy or specialization under distinct stress or developmental contexts75. The presence of SA elements in only a subset of KGs (e.g., OsMYBR5, OsMYBR21, OsMYBR62, OsMYBR67) may reflect their more specific involvement in pathogen response pathways76. ABA and MYB TFs enhance stress resistance in rice through complex signal transduction networks and gene regulation mechanisms77. OsMYB2 is up-regulated under salt, cold, and drought stress and enhances stress resistance in rice through an ABA-mediated signaling pathway12. Overexpression of OsMYB48-1 significantly enhances drought and salt tolerance in rice by regulating ABA signaling and xylem biosynthesis13. Expression of OsMYB4 in transgenic Arabidopsis enhances cold and drought tolerance by regulating the ABA signaling pathway78. GA and MYB TFs also act synergistically. GA-induced TFs such as GA-MYB can bind directly to gibberellic acid-responsive elements (GARE) to regulate gene expression, thereby affecting seed germination. This synergy is important in regulating rice resistance79. Cross-regulation by salicylic acid can mitigate OsMYB-R1-mediated defense responses under drought and fungal stress80. Overall, the distribution and combination of cis-elements imply a layered regulatory mechanism in which individual MYB genes may be differentially modulated by various hormonal cues. This highlights the importance of considering hormone interactions in future functional validation studies of these KGs.

Collinearity analysis helps identify regions of genomic rearrangement, hidden connections, and evolutionary relationships across species. Our results showed that the KGs have collinear relationships with genes in other plant species (Fig. 5A-D). The functions of MYB TFs are partly conserved and partly divergent among plant species. For example, OsMYB3R-2 in rice is involved in regulating tolerance to drought, cold, and salt stress18. This suggests that MYB TFs may regulate growth and stress responses through similar mechanisms in different plants. MYB TFs in rice and other plants (such as Arabidopsis and Triticum aestivum) are evolutionarily related. For example, TaMYB72 from wheat promotes flowering in rice, while certain MYB transcripts in rice have unique functions in regulating seed development and quality81. By comparing the collinearity of MYB genes among plants under stress conditions, we can infer how these plants adapted to different stress environments during evolution. For example, the response pattern of MYB genes in desert Populus to drought stress differs from that of other plants, which may be related to its evolutionary adaptation to extremely arid environments82.

The GMQE value (which ranges from 0 to 1) is a crucial metric in protein structure modeling, and a higher GMQE value indicates better model quality and accuracy66,83,84. The GMQE values of the 12 KG models ranged from 0.60 to 0.77 (Table 2), suggesting that the models are of good quality and that confidence in the templates is high. Proteins with well-modeled structures (high GMQE values) are more likely to maintain their stability and function under stress conditions66. The Ramachandran plots showed that most amino acid residues had phi/psi angles in the allowed regions, and only a small proportion of residues had phi/psi angles in the disallowed regions (Fig. 6B). Generally, models with fewer than 60% of residues in the favored regions are considered to be of low quality66. The proportion of Ramachandran-favored residues ranged from 61.49% to 84.21% (Table 2), suggesting that the 12 KG models are acceptable. Ramachandran plots provide insights into protein conformational space and can help assess the impact of stress on protein stability. The QMEAN Z-score is a composite score that assesses the quality of a protein structure model. A QMEAN Z-score close to 0 indicates a model of higher quality, whereas scores of -4.0 or lower indicate low quality. The QMEAN Z-scores (Cβ atoms) of the KG models were above -4.0 for all genes except OsMYB124 (-4.47) (Fig. 6C), indicating acceptable structural quality for most models. Overall, the proposed homology models of the KGs appear accurate and reliable, and the modeled proteins are likely to maintain stability and function under stress conditions.
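The two cut-offs mentioned above (at least 60% of residues in Ramachandran-favored regions, and a QMEAN Z-score above -4.0) can be applied as a simple screening rule. The sketch below is only an illustration of that rule: the function name is hypothetical, and the example inputs pair values quoted in the text without implying that they belong to the same gene model.

```python
def model_quality_flags(favored_pct, qmean_z):
    """Return a list of quality warnings for one homology model.

    Thresholds follow the text: below 60% Ramachandran-favored residues,
    or a QMEAN Z-score of -4.0 or lower, is treated as low quality.
    """
    flags = []
    if favored_pct < 60.0:
        flags.append("Ramachandran favored residues below 60%")
    if qmean_z <= -4.0:
        flags.append("QMEAN Z-score at or below -4.0")
    return flags


# Illustrative inputs only: 61.49% is the lower end of the reported range
# and -4.47 is the score reported for OsMYB124; pairing them is an assumption.
print(model_quality_flags(favored_pct=61.49, qmean_z=-4.47))
```

An empty list means that the model passes both checks; any flagged model would warrant closer inspection or remodeling with a different template.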

Expressolog tree analysis can be used to study the functional and evolutionary relationships of MYB TFs in rice and other plant species (Fig. 7). It infers the function of homologous genes by comparing their expression patterns across species85 and can identify genes with conserved functions in response to stress conditions86. GO enrichment analysis is a widely used method for identifying significantly enriched and overrepresented functions, classes, and pathways30. The GO functional enrichment analysis revealed that the 12 KGs are significantly enriched in a large number of biological processes and molecular functions (Fig. 8A–C). The identification of 12 KGs belonging to the MYB and MYB-related transcription factor families highlights the central role these regulators play in orchestrating a wide range of biological processes in plants. As core hubs within the gene regulatory networks, these MYB family members demonstrate significant regulatory influence over essential cellular and metabolic pathways. Their involvement in the regulation of macromolecule biosynthesis, primary metabolism, and nitrogen compound metabolism underscores their importance in maintaining cellular homeostasis and supporting growth and development. Furthermore, their role in regulating gene expression and broader cellular processes suggests that MYBs are not only pivotal in normal developmental programming but also serve as critical nodes for integrating environmental signals into transcriptional responses. This central positioning within regulatory networks reflects the evolutionary conservation and functional versatility of MYB transcription factors, reinforcing their potential as targets for genetic improvement of traits such as stress tolerance, nutrient efficiency, and metabolic reprogramming in crops87.

The molecular functions associated with the identified MYB and MYB-related transcription factors further emphasize their pivotal role in transcriptional regulation. Their enrichment in activities such as DNA binding, sequence-specific DNA binding, and transcription cis-regulatory region binding confirms their core function as modulators of gene expression. These transcription factors recognize and bind specific DNA motifs within promoter regions, enabling precise regulation of downstream target genes involved in critical biological processes. The presence of both general DNA-binding activity and sequence-specific binding suggests functional diversity within the MYB family, allowing them to regulate a wide array of genes in response to developmental cues and environmental stimuli. Moreover, their ability to interact with cis-regulatory elements highlights their integral role in constructing dynamic gene regulatory networks that fine-tune plant growth, differentiation, and adaptation. These findings align with the established function of MYB proteins as key regulators in numerous physiological and stress-related pathways88. For cellular components, the GO enrichment pointed to the nucleus, which is consistent with previous studies29.
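The over-representation idea behind GO enrichment analysis can be illustrated with a one-sided hypergeometric test. The text does not state which tool or statistical test was used for the enrichment reported here, so this is only a generic sketch; the function name and all counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the GO term
    sample:     number of genes in the study set
    hits:       study-set genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: a background of 183 genes, 40 of which carry the
# term, and a study set of 12 genes, 8 of which carry the term.
p = enrichment_p_value(population=183, annotated=40, sample=12, hits=8)
print(f"p = {p:.3g}")
```

In practice, one p-value is computed per GO term and the resulting set is corrected for multiple testing before terms are reported as enriched.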

The expression results under both abiotic and biotic stresses indicated that the 12 KGs were significantly expressed and may have roles in the rice response to these stresses. These results are consistent with previous studies in which the expression levels of 15 KGs in Brassica oleracea, 15 KGs in Oryza sativa, 21 KGs in Ziziphus jujuba, 26 KGs in Arachis hypogaea, 10 KGs in Triticum aestivum, and 4 KGs in Raphanus sativus were significantly elevated in response to abiotic and biotic stresses89,90,91,92,93,94.

In both heat and salt stress treatments, the expression of the 12 KGs was significantly upregulated, with stronger induction observed at the 0.5 h sampling time for both stresses (Figs. 11 and 12). These findings suggest that the 12 KGs are involved in the early responses to both heat and salt stress, with certain genes exhibiting notably higher expression levels in response to each stress. In rice, six key genes were identified by Mas-ud et al.30, and these key genes were highly expressed in response to heat stress. Fourteen genes were upregulated in rice roots in response to salt stress95. Together, the results indicate that these 12 KGs are stress-induced MYB TFs that are likely involved in the ability of rice to withstand abiotic and biotic stresses, and they may serve as important markers for breeding heat- and salt-tolerant rice varieties.

Conclusion

This study was undertaken to identify KGs and characterize their functional roles using bioinformatics and molecular biology approaches. We selected 183 MYB TFs in rice and identified 12 KGs using five CytoHubba algorithms: degree, closeness, bottleneck, MNC, and MCC. The PPI network analysis indicated that the 183 TFs are interconnected, that the KGs are the most closely interrelated among them, and that these genes have many conserved roles in abiotic and biotic stresses. The collinearity analysis showed that the rice KGs are closely related to genes in other plants, which helps in understanding the genomic landscape, gene functions, and evolutionary relationships among species. The protein structure results suggest that future investigations of the 12 KGs will provide a better understanding of the molecular mechanisms involved in rice stress responses. Expressolog tree analysis indicated that the 12 KGs might have diverse functions in rice responses to development and stress conditions. The enrichment of KGs in transcriptional regulation-related processes and nuclear functions underscores their potential involvement in modulating essential cellular and biosynthetic pathways. The expression results revealed that the 12 KGs are stress-induced TFs that play critical functions in rice responses to both abiotic and biotic stresses. The qRT-PCR results confirmed that these 12 KGs are heat- and salt-inducible and play crucial roles in rice responses to heat and salt stress. The integration of genome-wide meta-data and bioinformatics analysis has paved the way for a comprehensive understanding of the KGs that drive the stress response in rice. Building on these findings, researchers can deepen their understanding, accelerate the development of stress-tolerant varieties, and help ensure global food security in a changing climate.