Introduction

The increasing global population poses a significant challenge to food security as the demand for food rises in tandem. The pandemic, climate change, and growing economic inequalities are key factors contributing to the rise in hunger and food insecurity1,2. The UN predicts that by 2030, over 600 million people globally will be affected by hunger2. Despite the wide variety of edible plant species available, only 15 crop plants supply 90% of the global population’s food energy intake3. Rice, maize, and wheat alone account for two-thirds of this amount, serving as staples for over 4 billion people4. Because these cereal crops are cultivated over vast areas, their stability can be negatively affected by various factors, including abiotic factors such as drought, salinity, and temperature, and biotic factors such as insects, fungi, and bacteria, creating a risk of significant yield losses5,6. Abiotic stresses can reduce crop yield by up to 60% compared to the record yield of the corresponding crops7, while a global survey has shown that yield losses related to biotic stressors average between 17 and 23% on five major crop species8. Current abiotic stressors, intensified by changing climate conditions, increase the risk of biotic stress-induced yield loss. As plants become more stressed by environmental factors, their ability to resist or recover from biotic threats weakens, making them more vulnerable to damage and further reducing crop yields9. Therefore, the development and adoption of resilient crop varieties is vital to overcome the detrimental effects of changing climatic conditions, pests, and diseases.

Wheat holds great significance as a staple crop globally, serving as a primary source of nourishment for a substantial portion of the world’s population5. Global wheat production in 2024 is estimated to be over 797 million metric tons10,11, making wheat the second most consumed crop globally, by 2.5 billion people in 89 countries12,13. Persistent threats such as the Hessian fly, wheat stem sawfly, and wheat rusts are only a few of the challenges in wheat cultivation14. Additionally, fungal diseases, including rusts and mildew, and pests such as aphids significantly diminish the yield of wheat cultivars15. Addressing these issues requires continuous research efforts for the development of effective management strategies to secure global wheat production. Fusarium Head Blight (FHB or scab), a fungal disease affecting wheat flowers and caused primarily by the Ascomycete fungus Fusarium graminearum, is a serious concern for the production of wheat and other small grains16,17. In recent years, Fusarium head blight epidemics have become more severe in many wheat-producing regions worldwide, primarily due to the combined effects of global warming and changes in wheat farming practices and agricultural production methods, which have created more favorable conditions for the pathogen to thrive and spread18,19. The fungus can survive in crop residue, such as infected wheat stubble, and produce spores that may be spread by wind or rain; when conditions are favorable, the spores land on the wheat heads during flowering17,20. The initial infection occurs when the spores germinate on the wheat spikes and penetrate the flowering structures, including the florets21,22. The fungus then colonizes the developing grain, causing several detrimental effects. FHB not only reduces crop yield but also poses a threat to food safety and health23,24,25. The fungus produces mycotoxins, particularly deoxynivalenol (DON), also known as vomitoxin, which can contaminate the grain and pose health risks to humans and animals. This makes it crucial to integrate resilient crop varieties as a management strategy19,26,27.

Studies have revealed that FHB resistance in wheat is a complex trait controlled by multiple genes. More than 500 quantitative trait loci (QTL) associated with FHB resistance have been reported and mapped in the wheat genome28,29,30, and the most notable QTLs for FHB resistance in bread wheat are located on chromosomes 3BS (Fhb1) and 6BS (Fhb2), accounting for up to 46% and 9% of the effect on resistance, respectively29,31,32,33. FHB resistance is a polygenic trait34, and in wheat five mechanisms controlled by those QTLs are described as contributing to it35. These five mechanisms are resistance to pathogen infection (resistance component type I), resistance to the spread of FHB within the spike (type II)29,36, insensitivity of wheat lines to DON accumulation and their capability to degrade the toxin (type III)37, resistance to kernel infection (type IV)38, and tolerance to yield loss (type V)37,38,39,40.

Within this complex FHB response in wheat, resistance to DON plays a critical role as a major resistance factor that is closely associated with the Qfhs.ndsu-3BS QTL41. This QTL, derived from the Chinese spring wheat cultivar Sumai 3, imparts resistance to FHB and has been thoroughly studied16,33,42,43,44,45,46. The genes Fhb1, Fhb2, Fhb3, Fhb6, and Fhb7 provide Type II resistance, while Fhb4 and Fhb5 confer Type I resistance against scab47. Combining multiple FHB resistance genes has been shown to be more effective and reliable in enhancing disease resistance than carrying a single FHB resistance gene; for instance, stacking Fhb1 and Fhb2 resulted in enhanced resistance against FHB47. Recently, two inhibitors of the Fhb1 gene, the major contributor to FHB resistance, have been reported in wheat48, creating a challenge for breeders because they can limit the effectiveness of Fhb149.

Traditional breeding methods, coupled with molecular markers including restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), diversity arrays technology (DArT), amplified fragment length polymorphism (AFLP), simple sequence repeat (SSR), and single nucleotide polymorphism (SNP) have been used to identify FHB-resistant wheat cultivars28,33,50,51,52. However, the challenges arise from several complexities associated with genetic traits, marker development, inhibitors and the genetic background of the breeding materials in pyramiding multiple resistance genes to enhance overall resistance53,54.

In plants, gene regulation plays an important role in maintaining cellular homeostasis during environmental stress. A key component of this regulation is noncoding RNAs (ncRNAs), RNA molecules that do not encode proteins but play essential regulatory roles in cells. Two prominent classes of ncRNAs are microRNAs (miRNAs) and long noncoding RNAs (lncRNAs). MicroRNAs are small, endogenous molecules, typically around 22 nucleotides long, that regulate gene expression through translational repression or degradation of the targeted mRNA55. MiRNAs are involved in numerous biological processes, including development, cell differentiation, metabolism, and stress response56,57. LncRNAs are noncoding transcripts typically longer than 200 nucleotides, sharing characteristics with mRNA but lacking protein-coding ability. In plants, lncRNAs primarily regulate gene expression through transcriptional regulation and alternative splicing58,59,60,61. The conservation of lncRNAs among species is poor compared to the high conservation of plant miRNAs. One way lncRNAs indirectly regulate gene expression is by interacting with miRNAs: lncRNAs act as “competitive endogenous” molecules by mimicking the mRNA targets of miRNAs. This interaction sequesters miRNAs, preventing them from binding to their target mRNAs and thereby indirectly upregulating the expression of those target genes62,63,64.

Through their intricate involvement in gene regulation, ncRNAs contribute to the plant’s ability to recognize and respond to Fusarium infection. Recent research has highlighted the crucial role of both miRNAs and lncRNAs in modulating plant immune responses, particularly in the context of fungal diseases65,66,67. A study investigating wheat’s response to powdery mildew and stripe rust identified 246 differentially expressed long intergenic ncRNAs (lincRNAs), with over 1,300 small nuclear ribonucleic protein (SnRNP) motifs linked to spliceosome activity. Several of these lincRNAs contained more than 10 SnRNP motifs, suggesting that they are involved in regulating alternative splicing and the expression of immune-related genes in response to fungal infection. These findings underscore the potential of lncRNAs in controlling complex defense mechanisms against fungal pathogens, making them promising candidates for improving wheat’s resistance to fungal diseases68. In another study, 24 miRNAs were found to be differentially expressed between F. graminearum infected and mock-treated wheat spikes at 30 and/or 50 h post-inoculation (hpi), suggesting their involvement in the plant’s response to the pathogen. Among these, six miRNAs were conserved across plants (e.g., miR159, miR160, miR166, miR171), while the remaining miRNAs were specific to wheat. Most of the differentially expressed miRNAs were downregulated at 30 hpi and upregulated at 50 hpi, indicating their dynamic role during different stages of infection66. Duan et al. identified over 4,130 lncRNAs that respond to Fusarium graminearum in wheat, with the majority being activated at 12 h post-inoculation, suggesting that the early stage of infection is critical for lncRNA-mediated regulation of defense responses. These lncRNAs help trigger innate immune responses, including pathogen-associated molecular pattern (PAMP)-triggered immunity, and activate key pathways such as plant-pathogen interactions and ROS production. These findings highlight the importance of early lncRNA expression in wheat’s defense against FHB, which could pave the way for developing new disease-resistant germplasm and more effective, biosafe fungicides, such as nucleic acid-based treatments65. In this study, lncRNAs and miRNAs were identified from two different wheat cultivars, Vida and Hank, using transcriptome data obtained from plants under normal conditions and upon Fusarium graminearum infection. A lncRNA-miRNA network was constructed with the aim of unraveling the potential effect of noncoding RNAs on fungal pathogen response in wheat.

Results

Tiered LncRNA identification approach yields 2,156 distinct clusters based on 90% sequence identity

RNA sequencing and small RNA sequencing data were produced from the two wheat varieties, Hank and Vida, with and without Fusarium infection (Control and Infected, hereafter). A tiered approach involving a stepwise elimination of sequences that did not meet pre-established criteria was applied to identify a refined set of lncRNA candidate transcripts. Mapping of RNA-sequencing data to IWGSC RefSeq Genome v2.1 yielded 78,028, 73,388, 82,669 and 71,208 assembled and non-redundant transcripts for Hank control, Hank infected, Vida control and Vida infected samples, respectively. After all elimination steps, which included discarding transcripts with open reading frame (ORF) size larger than 100 nucleotides, calculating coding potential using CPC2 and CPAT tools69,70, and comparing the remaining transcripts to high-confidence coding sequences obtained from the reference Triticum aestivum Chinese Spring genome, the final identification process revealed the putative lncRNAs (Table S1). This resulted in 1037, 877, 1156 and 932 putative lncRNA transcripts for Hank control, Hank infected, Vida control and Vida infected samples, respectively (Fig. 1A). Putative lncRNAs were compared to lncRNAs from CANTATA (v3.0) and PLncDB (v2.0) databases. For Hank control, 255 and 184 lncRNAs matched to lncRNAs from CANTATA and PLncDB databases, respectively. Of these, 122 matched to lncRNAs from both databases. Similarly, for Hank infected samples, 202 and 152 lncRNAs matched to the respective databases, 94 of which had matches in both databases. For Vida control, 272 and 202 lncRNAs matched to lncRNAs from CANTATA and PLncDB databases, respectively, 125 common to both databases. And for Vida infected samples, 223 and 159 lncRNAs matched to lncRNAs from the databases, with 109 being common to both, suggesting that lncRNAs are only moderately conserved.
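The stepwise elimination described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the forward-strand-only ORF scan, the 200-nucleotide minimum length, and the `coding_calls` dictionary (standing in for precomputed CPC2/CPAT verdicts) are all assumptions made for illustration, and the comparison against high-confidence coding sequences is omitted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf_nt(seq: str) -> int:
    """Length (nt, stop codon included) of the longest ATG..stop ORF
    in the three forward frames. Reverse strand is ignored (simplification)."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best

def tiered_filter(transcripts, coding_calls, max_orf_nt=100, min_len_nt=200):
    """Keep transcripts that pass every tier; thresholds are illustrative."""
    kept = {}
    for tid, seq in transcripts.items():
        if len(seq) < min_len_nt:                    # tier 1: length (assumed)
            continue
        if longest_orf_nt(seq) > max_orf_nt:         # tier 2: ORF size
            continue
        if coding_calls.get(tid, "coding") != "noncoding":  # tier 3: hypothetical tool verdict
            continue
        kept[tid] = seq
    return kept

# Toy transcripts (made up, not data from this study)
toy = {
    "t1": "ATG" + "AAA" * 40 + "TAA" + "C" * 100,  # 126-nt ORF, rejected
    "t2": "C" * 250,                               # passes all tiers
    "t3": "C" * 250,                               # called coding, rejected
    "t4": "C" * 150,                               # too short, rejected
}
calls = {"t1": "noncoding", "t2": "noncoding", "t3": "coding", "t4": "noncoding"}
putative = tiered_filter(toy, calls)
```

Ordering the cheap sequence-level checks before the coding-potential verdicts mirrors the tiered logic: each tier only sees what the previous one let through.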

Since conservation patterns of lncRNAs at the nucleotide level are not yet well established, the predicted lncRNAs were grouped into clusters that share a high level of nucleotide identity to enable comparisons. LncRNAs from the same cluster may represent ‘homologous’ lncRNAs in different varieties or species. All 4,002 putative lncRNAs were clustered using the CD-HIT tool with a threshold of 90% sequence similarity to identify homologous lncRNA sequences among the different conditions and varieties. This clustering process yielded a total of 2,156 distinct clusters (Fig. 1B) (Table S1).
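As a rough illustration of how shared clusters could be tallied from a clustering result, the sketch below parses text in the CD-HIT `.clstr` layout and counts clusters by the set of samples they contain. The sequence-ID convention (`sample|transcript`), the sample labels, and the function name are hypothetical and not taken from the study.

```python
import re
from collections import Counter

# Member lines in a .clstr file look like: "0<TAB>900nt, >seqid... *"
MEMBER = re.compile(r">(\S+?)\.\.\.")

def clusters_by_sample(clstr_text: str, sep: str = "|") -> Counter:
    """Count clusters keyed by the frozenset of samples represented in them.
    Assumes each sequence ID starts with its sample label followed by `sep`."""
    clusters = []
    for line in clstr_text.splitlines():
        if line.startswith(">Cluster"):
            clusters.append(set())
        else:
            m = MEMBER.search(line)
            if m and clusters:
                clusters[-1].add(m.group(1).split(sep)[0])
    return Counter(frozenset(c) for c in clusters)

# Toy input (made up): one cluster shared by two samples, one singleton
example = "\n".join([
    ">Cluster 0",
    "0\t900nt, >HankC|tx1... *",
    "1\t880nt, >VidaI|tx7... at +/93.10%",
    ">Cluster 1",
    "0\t450nt, >VidaC|tx2... *",
])
shared = clusters_by_sample(example)
```

Keying the counter by a frozenset of sample labels gives the membership pattern of each cluster directly, which is the quantity an intersection plot such as Fig. 1B summarizes.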

Fig. 1

(A) X-axis represents the total number of transcripts obtained for each sample after mapping RNA-seq data to the reference genome. Dark bars represent the number of eliminated transcripts at each tier of the stepwise filtering process whereas the light bars indicate the remaining number of transcripts, which were identified as putative lncRNA transcripts. (B) The number of shared clusters of lncRNA transcripts across four samples. Each cluster represents a group of lncRNA transcripts that exhibit a 90% sequence similarity. Yellow squares correspond to the Vida variety, and orange squares represent the Hank variety, highlighting the presence of shared lncRNA clusters among these two varieties.

Novel and known miRNA families unique to Fusarium-infected wheat samples have a biased chromosomal distribution across different samples

The miRNA prediction by mirMachine from small RNA sequencing data yielded many known and novel miRNA sequences, with a marked increase in infected samples. A total of 41 known miRNA families were identified across all four samples, with variable representation in control and infected groups (Table S2). miRNA families common to all samples and conditions included well-known families such as miR156, miR166, miR167 and miR393 (Fig. 2). The control sample of Vida had 14 known miRNA families, while the control sample of Hank had 17. In contrast, the infected samples displayed a broader diversity, with 35 known miRNA families in Vida infected and 37 in Hank infected. Notably, 14 of those miRNA families were exclusive to the infected samples, highlighting their potential significance in the response to FHB infection. Our analysis did not reveal any miRNA families that were present in control samples but absent from infected samples (Fig. 2).
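The presence/absence comparison behind these counts reduces to set operations. The sketch below is a minimal illustration: the function name is hypothetical and the family memberships are toy values chosen for the example, not the contents of Table S2.

```python
def exclusive_families(control: dict, infected: dict) -> dict:
    """Compare miRNA family sets between control and infected samples.
    `control` and `infected` map a variety name to a set of family names."""
    in_control = set().union(*control.values())
    in_infected = set().union(*infected.values())
    return {
        "infected_only": in_infected - in_control,
        "control_only": in_control - in_infected,
        # families present in every sample under every condition
        "core": set.intersection(*control.values(), *infected.values()),
    }

# Toy memberships (illustrative only)
control = {
    "Vida": {"miR156", "miR166"},
    "Hank": {"miR156", "miR166", "miR167"},
}
infected = {
    "Vida": {"miR156", "miR166", "miR398"},
    "Hank": {"miR156", "miR166", "miR167", "miR1432"},
}
summary = exclusive_families(control, infected)
```

An empty `control_only` set corresponds to the observation that no family was restricted to control samples.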

Fig. 2

Known miRNA families identified in each sample. Yellow squares represent the presence of miRNA family in a Vida variety sample, orange squares represent the presence of the identified miRNA family in a Hank variety sample.

In addition to the known miRNA families, our analysis suggested a high number of novel miRNA sequences, underscoring the importance of these previously unrecognized regulatory elements in the context of our study. Notably, the control sample of Vida variety yielded 73 novel mature miRNA sequences, while the control sample of Hank variety contributed 112 novel mature miRNA sequences to our dataset. However, the most remarkable discovery came from the infected samples, which exhibited a substantial increase in novel miRNA sequences. A total of 694 novel mature miRNA sequences were identified in the Vida infected sample, while Hank infected sample yielded 810 novel mature miRNA sequences. Figure 3 shows the total number of novel and known miRNA sequences in each sample and condition.

Fig. 3

Number of total miRNA sequences, including novel and known miRNA sequences, shared among each sample.

LncRNA sequences can carry precursors for miRNA production71, a phenomenon emerging as a potentially new layer of non-coding RNA regulation of gene expression. To identify such lncRNAs, putative lncRNA sequences were subjected to homology-based miRNA prediction by mirMachine, with zero mismatches as previously described72. Six, three, two and two miRNA precursor lncRNA transcripts were identified from Vida control, Vida infected, Hank control and Hank infected samples, respectively (Table S2). miR1122 was the only miRNA derived from lncRNA transcripts that was common to all samples, while miR1130 was present in all samples except Hank infected. Moreover, the miR156, miR5181, and miR9863 families were identified exclusively in the Vida control sample, while miR5200 was identified only in the Hank infected sample.

The chromosomal distribution of the identified putative known and novel miRNA sequences was determined. Chromosomes 3 and 7 were identified as the main sources of miRNA families across the wheat samples in general. However, for the Vida cultivar specifically, chromosome 5D was the primary source of miRNA sequences. In contrast, lncRNAs showed a more even distribution across chromosomes, with chromosomes 3, 5, and 7 equally contributing to the identified sequences. The number of miRNA sequences identified from each chromosome for each sample is given in Fig. 4.

Fig. 4

(A) The distribution of identified putative miRNA sequences across chromosomes. (B) The number of putative lncRNA sequences targeted by identified miRNA sequences from each chromosome.

Target prediction provides cues into disease response through target and target mimicry networks

Target analysis was conducted for the identified miRNA sequences obtained from sRNA sequencing data using high-confidence coding sequences of the IWGSC v2.1 RefSeq genome. The psRNATarget analysis yielded 2,022 high-confidence (HC) coding sequence (CDS) targets for Hank control, 10,639 for Hank infected, 1,667 for Vida control, and 8,871 for Vida infected (Table S3). The identified target coding sequences were subjected to a BLAST search against all Triticum aestivum UniProt proteins (177,740 sequences). As a result, 4,146, 20,800, 3,430 and 17,400 protein matches were identified as the targets of identified miRNA sequences in Hank control, Hank infected, Vida control and Vida infected samples, respectively. Gene Ontology (GO) enrichment for the target proteins indicated that the top biological process (BP) terms were generally related to broad categories of regulation and structural organization for all samples. For Vida control, the top five BP terms included GO terms specifically for development, which were not observed in any other sample. GO terms related to ‘response to stress’ usually ranked higher in infected samples. ‘RNA metabolic process’ ranked fifth in Hank infected, much higher than in all other samples, whereas in Vida infected ‘multi-organism process’ similarly ranked fifth. The ‘regulation of RNA metabolic process’ and ‘macromolecule modification’ terms also ranked higher in Hank infected compared to other samples. Interestingly, for BP, Hank infected appeared to be enriched in different terms than all other samples. For molecular function (MF), transcription and replication/repair related terms were more enriched in Vida samples, whereas for Hank samples energy-related terms appeared in higher ranks. Receptor and signal transduction activities were also prominent among the enriched MF terms in Hank infected. These results indicate different aspects of regulation in response to infection in Vida and Hank.

Furthermore, the miRNA-lncRNA interaction was explored by identifying the lncRNA transcripts targeted by the miRNA sequences (Table S3). This analysis provides valuable insights into the regulatory landscape, as lncRNAs often act as miRNA targets, functioning analogously to mRNA transcripts and thereby influencing miRNA activity through target mimicry. By combining the miRNA-CDS target and miRNA-lncRNA target analyses, we aimed to create an integrated miRNA-mRNA-lncRNA network that contributes to a more comprehensive understanding of gene regulation. The miRNA sequences that have both lncRNA and CDS targets were identified. Among those, the miR1130 family was shown to target two CDS targets and one lncRNA target in both Vida control and infected samples, whereas miR1432 was found to have five common CDS targets and one lncRNA target, identified only in the Vida and Hank infected samples (Table S4).

The complexity of miRNA interactions was investigated through two key aspects: the number of miRNAs with multiple predicted targets and the number of predicted targets affected by either one or multiple miRNAs (Fig. 5). The mean count of miRNA families exhibiting multiple predicted CDS targets was assessed and categorized into seven bins for simplicity, revealing a prevalent trend where most miRNA families displayed an inclination to target between 2 and 50 CDS. Notably, only a few instances were observed where miRNA families either focused on a single target or exhibited more than 50 predicted targets (Fig. 5A). In a complementary analysis (Fig. 5B), the average number of predicted CDS targets was investigated, distinguishing targets that were singled out by a lone miRNA family versus those targeted by multiple miRNA families. In stark contrast to miRNAs with multiple targets, individual CDS targets predominantly appeared to be under the influence of a single miRNA family.

Most miRNA families were found to target one lncRNA transcript (Fig. 5C). Subsequently, the average number of predicted lncRNA targets was explored, classifying them based on whether they were targeted by only one or by multiple miRNA families. Once again, individual lncRNA targets were predominantly targeted by a single miRNA family, consistent with our findings for coding sequences (Fig. 5D). This pattern of miRNA target specificity underscores the regulatory roles played by miRNAs in modulating gene expression at both the coding and non-coding transcript levels. At the same time, certain CDS or lncRNAs were targeted by multiple miRNAs, indicating the existence of complex regulatory networks.
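The two views of the network summarized in Fig. 5 (targets per miRNA family, and miRNA families per target) can be derived from a plain list of predicted miRNA-target pairs. The sketch below is illustrative only: the function name is hypothetical, the pairs are toy values, and the seven bins used in the figure are not reproduced because their boundaries are not stated here.

```python
from collections import defaultdict

def degree_summary(pairs):
    """Summarize a bipartite miRNA-target network given (family, target) pairs."""
    targets_of = defaultdict(set)   # family -> set of targets
    families_of = defaultdict(set)  # target -> set of families
    for family, target in pairs:
        targets_of[family].add(target)
        families_of[target].add(family)
    return {
        "families_single_target": sum(len(t) == 1 for t in targets_of.values()),
        "families_multi_target": sum(len(t) > 1 for t in targets_of.values()),
        "targets_single_family": sum(len(f) == 1 for f in families_of.values()),
        "targets_multi_family": sum(len(f) > 1 for f in families_of.values()),
    }

# Toy pairs (made up; identifiers such as "cds1" and "lnc1" are placeholders)
pairs = [
    ("miR1432", "cds1"),
    ("miR1432", "cds2"),
    ("miR398", "cds3"),
    ("miR1130", "cds2"),
    ("miR1130", "lnc1"),
]
stats = degree_summary(pairs)
```

Using sets on both sides means duplicate predictions for the same pair are counted once, so the summary reflects distinct interactions.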

Fig. 5

(A) The average number of miRNA families with multiple predicted coding sequence (CDS) targets. The number of targets was categorized into 7 bins of specific sizes for simplicity. Most miRNA families demonstrated a tendency to target between 2 and 50 CDS, with only a few instances of miRNA families either focusing on a single target or having more than 50 predicted targets. (B) The average number of predicted CDS targets, classified as being targeted by either one or multiple miRNA families. In contrast to miRNAs with multiple targets, each individual target predominantly appeared to be targeted by a single miRNA family. (C) The average number of miRNA families with multiple putative lncRNA targets. As with the CDS targets, the number of lncRNA targets was distributed into 7 bins of specific sizes. The majority of the miRNA families were identified to target a single lncRNA transcript. (D) The average number of predicted lncRNA targets, categorized based on whether they were targeted by only one or multiple miRNA families. Individual lncRNA targets were predominantly targeted by a single miRNA family, consistent with our observations for coding sequences.

Discussion

Wheat production faces formidable challenges, notably from biotic stresses such as FHB, which threaten both yield and quality23. Addressing these challenges necessitates a deep understanding of the molecular mechanisms underlying wheat-pathogen interactions. Exploring the noncoding genome, particularly the intricate interplay of lncRNAs, miRNAs, and mRNAs, is crucial in developing resilient wheat cultivars capable of withstanding biotic stresses and ensuring global food security in the face of evolving agricultural challenges73. In our study, the non-coding RNAs in the Vida and Hank wheat varieties following Fusarium infection were identified. Using RNA sequencing data obtained from infected and control samples, a comprehensive analysis was conducted to identify lncRNAs and to investigate their potential roles in the wheat-Fusarium interaction. Additionally, miRNAs were identified from small RNA sequencing data, enriching our understanding of the intricate regulatory landscape governing the susceptible response of these wheat varieties to FHB.

The significant increase in both known and novel miRNA sequences in Fusarium-infected samples is remarkable and highlights the potential role of miRNAs in the infection-induced regulatory response74. The absence of miRNA families present in control samples but absent from infected samples suggests that Fusarium infection may trigger a more expansive miRNA regulatory network, rather than downregulating miRNA activity present under normal conditions. A 2019 study by Tripathi et al. examined miRNA expression in Arabidopsis thaliana under different environmental conditions and showed that the number and expression levels of most miRNAs decreased under glasshouse conditions, which are typically benign and controlled, compared to plants grown in the field75. The findings suggested that plants in stable conditions require fewer miRNAs for regulation and survival. This implies that in harsh field conditions, where plants face more stress, they need a more expansive miRNA regulatory network to manage the various environmental challenges. This supports the findings in our study that plants under stress, whether from environmental or pathogen-related factors, trigger a more complex miRNA response to ensure proper regulation of their defense mechanisms and overall survival. However, these findings require further validation to confirm the functional roles of these miRNAs in pathogen response and to fully understand their impact on wheat’s defense strategies.

The assembly of the RNASeq data yielded a comprehensive set of transcripts across all four samples. Subsequent refinement, using stringent criteria for lncRNA, led to the identification of a subset of putative lncRNA transcripts. Recognizing the non-conserved nature of plant lncRNAs at the sequence level, which was also observed by comparisons against wheat lncRNAs from CANTATA (v3.0) and PLncDB (v2.0) databases, a clustering approach was applied, grouping transcripts with high sequence similarity. Notably, the presence of clusters common across all four samples suggests a core set of lncRNAs that may play conserved roles in the wheat varieties76. The identification of specific lncRNA clusters exclusively in the Fusarium-infected samples may suggest a distinct regulatory signature that emerges during biotic stress77,78.

The prediction of the miR1432 family exclusively in the infected samples, which targets one of the common lncRNA transcripts, highlights a potential role for miR1432 in regulating these lncRNA clusters during pathogen response. Previous observations on the high expression levels of Tae-miR1432 in response to wheat yellow rust infection suggest an exclusive role for this miRNA family in biotic stress responses79. That study identified calcium ion-binding protein family members as the target of Arabidopsis thaliana miRNA ata-miR1432, and our analysis identified EF-hand domain proteins as the target of Tae-miR1432 in both the Hank and Vida cultivars upon FHB infection. EF-hand domains are structural motifs found in calcium ion-binding protein family members80. The involvement of miR1432 and calcium ion-binding proteins suggests that the miRNA and its target may contribute to the plant’s vulnerability to the wheat yellow rust pathogen, highlighting the importance of better understanding plant susceptibility for developing targets to improve wheat resistance79. Given the involvement of similar biotic stress pathways in wheat’s defense against yellow rust and Fusarium head blight, the identification of miR1432 in the sRNASeq data of infected samples suggests a potentially conserved regulatory role across different wheat-pathogen interactions. Additionally, the identification of the 4D hexose transporter gene as a target of miR1432 connects this regulatory network to hexose transport, a process known to impact pathogen growth. The involvement of hexose transporters, such as the Lr67 gene in wheat, emphasizes the broader importance of this pathway in plant defense, linking miRNA regulation, sugar transport, and pathogen resistance81,82,83.

Another miRNA identified exclusively in infected samples is miR398, a highly conserved and well-characterized plant miRNA that plays a major role in plant stress response and development84. miR398 is involved in maintaining reactive oxygen species (ROS) balance: its downregulation has been shown to be associated with decreased ROS accumulation85,86, and overexpression of miR398 in tomato has been shown to compromise plant resistance against the fungus Botrytis cinerea by negatively regulating the expression of antioxidant genes87. A study on wheat has shown elevated levels of miR398 in the roots in the early stages of Fusarium culmorum infection, which decrease over time. In the leaves, however, the early levels of miR398 are lower and increase over time88. The Vida and Hank wheat cultivars, which are not resistant to FHB, showed expression of miR398 only upon Fusarium infection, supporting the well-characterized role of miR398 in biotic stress88. The manipulation of miR398 could therefore serve as a promising strategy for improving Fusarium resistance in wheat.

A few putative lncRNA transcripts were predicted to carry miRNA precursors, giving rise to the biogenesis of miRNA sequences and contributing to the post-transcriptional regulation of gene expression, suggesting a dual functionality for lncRNAs. These lncRNAs were found to harbor precursor sequences for miR156, miR1122, miR1130, miR5181, miR5200, and miR9863 families, some of which are highly conserved and previously validated wheat miRNAs89,90,91. The identification of highly conserved and previously validated wheat miRNAs within these lncRNAs suggests their evolutionary importance and hints at conserved regulatory roles.

Our miRNA-CDS target analysis provided valuable insights into the regulatory landscape governed by miRNAs in the wheat transcriptome. The findings reveal a hierarchical specificity in miRNA targeting, with most miRNA families regulating a limited number of protein-coding transcripts, while individual protein-coding transcripts are predominantly targeted by a single miRNA family. This suggests an imbalance, favoring a high number of transcript targets relative to the number of miRNA sequences. In contrast, lncRNA-miRNA interactions exhibit greater specificity, as most lncRNAs are targeted by a single miRNA family, indicating distinct regulatory dynamics between miRNAs and different RNA classes. Additionally, instances where multiple miRNA families target the same CDS transcript hint at potential cooperative or combinatorial regulation, emphasizing the complexity of these networks. This crosstalk or synergistic regulatory roles among miRNAs may represent an adaptive strategy to fine-tune gene expression during stress responses92. The identification of miRNA teams with overlapping or complementary functions marks them as promising candidates for further investigation into their roles in biotic stress regulation and wheat-pathogen interactions93.

The chromosomal distribution of regulatory elements highlights key genomic regions involved in wheat’s response to Fusarium infection. Chromosomes 3 and 7 have been identified as the major contributors to miRNA families across all samples, suggesting that these regions may serve as hubs for regulatory activity. In the Vida infected sample, chromosome 5D also stands out as a significant source of miRNA sequences, indicating a potentially unique and stress-specific regulatory role. This distribution suggests that certain chromosomal regions may play a pivotal role in the miRNA-mediated responses to biotic stress. The uniform chromosomal distribution of lncRNA sequences across the wheat genome indicates their widespread involvement in diverse cellular processes. However, the specific enrichment of miRNA precursor lncRNAs on chromosomes 3, 5, and 7 suggests that these regions are particularly important in miRNA biogenesis. The overlap in chromosomal origins of miRNAs and their precursor lncRNAs implies a coordinated regulation between coding and non-coding elements, reinforcing the idea of spatially organized regulatory networks in wheat. This coordinated activity underscores the potential for these genomic regions to serve as focal points for future studies on the molecular mechanisms underlying wheat-pathogen interactions.

The CDS target analysis of miRNAs revealed a rich landscape of gene regulation in wheat, particularly during Fusarium infection. Among the identified CDS targets, domains such as NB-ARC, F-box, and leucine-rich repeat-containing N-terminal domains stand out as prominent, reflecting their roles within the miRNA-mediated regulatory network (Table S3). These domains are well known for their involvement in plant defense and stress response pathways, suggesting that miRNAs may play a critical role in modulating key components of these mechanisms94,95,96. The enrichment of these domains among targeted CDSs highlights their role in pathogen response, emphasizing the potential of miRNAs to regulate stress-response pathways during adaptation to biotic stress.

Interestingly, a subset of CDSs emerged as exclusive targets in Fusarium-infected wheat samples, revealing a unique regulatory signature associated with pathogenic stress. Among these targets, the inclusion of YTH, Pum HD domain, RBR-type E3 ubiquitin transferase, SAC domain, SAP domain, and Jumonji C (JmjC) highlights the diversity of regulatory elements engaged in the plant’s defense against Fusarium. The identification of the Pum gene family among the targeted CDSs adds another layer of significance to our findings. The targeting of Pumilio (Pum) RNA-binding proteins, which are known for their pivotal roles in stress response and growth, suggests a finely tuned post-transcriptional regulatory mechanism during stress conditions97. Notably, the documented involvement of the Pum genes in kernel development in maize points to a potential link between stress response and kernel development in wheat. Given that FHB impacts both grain yield and protein content98, this connection may provide insights into the broader effects of FHB on yield and quality. These findings underscore the importance of exploring the interplay between stress-related regulatory networks and developmental processes to better understand wheat’s adaptive responses to biotic stress.

Histone methylation and demethylation are pivotal processes governing chromatin formation and gene expression regulation, with Jumonji C (JmjC) domain-containing proteins playing a crucial role as demethylases in epigenetic modifications99. In our analysis, wheat JmjC genes, exclusively targeted in Fusarium-infected samples, emerge as potential regulators in the intricate landscape of stress response. Their role in modulating chromatin dynamics suggests a direct influence on gene expression during biotic stress. Furthermore, the parallel with rice, where JMJ704 acts as a universal switch controlling genes in the bacterial blight resistance pathway, underscores the potential universality of JmjC proteins as key elements of diverse gene networks in response to biotic stress100. These findings suggest that the wheat JmjC genes could serve as integral components of the epigenetic machinery, modulating defense and/or stress adaptation mechanisms during Fusarium infection, and warrant further functional studies to explore their regulatory capacities in wheat’s stress responses.

Conclusion

In summary, our study sheds light on the intricate regulatory networks of miRNAs, lncRNAs, and mRNAs in wheat’s response to Fusarium infection. The identification of unique miRNA families, specific lncRNA clusters, and targeted CDS transcripts highlights the complexity and specificity of biotic stress regulation. The enrichment of miRNA precursors and the coordinated activity on chromosomes 3, 5, and 7 underscore the spatial organization of regulatory elements in the wheat genome. The discovery of exclusive targets, such as Pum and JmjC domain-containing genes, further emphasizes the multifaceted regulatory strategies employed during stress responses, linking epigenetic modifications, post-transcriptional regulation, and defense mechanisms.

These findings provide a comprehensive framework for understanding wheat-pathogen interactions at the molecular level, opening avenues for functional validation studies. By elucidating the roles of these regulatory networks, this research lays the groundwork for understanding stress adaptation and pathogen response mechanisms in wheat.

Methods

Fusarium cultures

Cultures of Fusarium graminearum were maintained on potato dextrose agar and soft nutrient agar (SNA) slants stored at 4 °C, and as spores in 30% glycerol at −30 °C and −80 °C. SNA is a minimal salt-sugar medium (KH2PO4 1 g/l; KNO3 1 g/l; MgSO4·7H2O 0.25 g/l; KCl 0.25 g/l; glucose 0.2 g/l; sucrose 0.2 g/l; agar 20 g/l)49,101.

RNA-sequencing and small RNA-sequencing data and transcriptome assembly

Two wheat varieties with low to moderate resistance against FHB, Vida and Hank, were used in this study. Hank is an easy-threshing, high-yielding hard red spring wheat that was bred for Hessian fly tolerance102, whereas Vida is a high-yielding hard red spring wheat with moderate resistance to leaf and stripe rust103. Both varieties were selected to focus on identifying noncoding RNAs involved in the plant’s response to fungal infection, providing a baseline response for understanding susceptibility and potential mechanisms for improving resistance, rather than contrasting resistance phenotypes. Plants were grown to the formation of the second internode, at which point FHB fungi were introduced to all plants except the controls. Infected and uninfected tissues were collected 3–4 days after the introduction of the fungi and stored at −80 °C104.

Total RNA was isolated from three biological replicates of control and infected tissues taken from both varieties. Cell walls were broken by grinding 0.2 g of tissue in liquid nitrogen and then in 1.4 ml Trizol (Invitrogen; Carlsbad, California, USA), which blocks RNase activity. Chloroform (400 µl) was added to the ground samples, which were then centrifuged at 11,500 rpm and 4 °C for 15 min105. The upper liquid phase containing the RNA molecules was taken and mixed with 500 µl isopropanol, and a second centrifugation was applied for 10 min to precipitate the RNA molecules. The pellet was washed gently with 75% ethanol, followed by a final centrifugation step that precipitated the RNAs. The pellet was then re-suspended in 30 µl nuclease-free water and RNA concentrations were analyzed. To remove any gDNA contamination within the RNA samples, the isolated RNAs were treated with DNase and later purified by ethanol precipitation. First, DNase treatment was performed using an Amplification Grade DNase I kit (Sigma-Aldrich, USA) according to the manufacturer’s instructions. Then, ethanol precipitation was performed using sodium acetate, 100% ethanol and 75% ethanol, with centrifugation between each treatment step.

Demultiplexed RNA-Seq and sRNA-Seq data were obtained using the Illumina bcl2fastq software. The CutAdapt tool (version 4.0) was used to trim adapter sequences and low-quality bases from the RNA reads (adapter_min_overlap = 10 -m 30 -q 20 --max-n 0.01)106. The Triticum aestivum cv. Chinese Spring (hexaploid, AABBDD) IWGSC RefSeq v2.1 genome assembly and annotations were downloaded from the Wheat URGI website (http://wheat-urgi.versailles.inra.fr/Seq-Repository, accessed on 18 April 2023). RNA-seq data from the Vida control, Vida infected, Hank control and Hank infected samples were mapped to the IWGSC RefSeq v2.1 genome using the HISAT2 alignment tool (version 2.2.1) with default parameters107,108. The SAM files produced by HISAT2 were converted into BAM files using samtools (version 1.3.1)109, and transcript sequences were extracted from the processed BAM files using the StringTie tool (version 2.2.1)110,111.
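
The read-processing steps above can be sketched as a short Python helper that only assembles the command lines. This is an illustrative sketch, not the authors' script: the file names, adapter sequence and single-end layout are hypothetical, mapping adapter_min_overlap to CutAdapt's -O option is an assumption, and so is the use of coordinate sorting and of reference-guided assembly (-G).

```python
import shlex
from pathlib import Path


def build_commands(sample, reads, adapter, index_prefix, annotation, outdir="."):
    """Return argument lists for trimming, alignment, BAM conversion and assembly.

    Only the numeric option values come from the Methods text; every file
    name, the adapter and the single-end layout are hypothetical.
    """
    out = Path(outdir)
    trimmed = out / f"{sample}.trimmed.fastq.gz"
    sam = out / f"{sample}.sam"
    bam = out / f"{sample}.sorted.bam"
    gtf = out / f"{sample}.gtf"
    return [
        # Adapter and quality trimming; -O (minimum adapter overlap) is assumed
        # to correspond to "adapter_min_overlap = 10".
        ["cutadapt", "-a", adapter, "-O", "10", "-m", "30", "-q", "20",
         "--max-n", "0.01", "-o", str(trimmed), str(reads)],
        # Spliced alignment against a prebuilt HISAT2 index, default parameters.
        ["hisat2", "-x", str(index_prefix), "-U", str(trimmed), "-S", str(sam)],
        # SAM to coordinate-sorted BAM (sorting is an assumption of this sketch).
        ["samtools", "sort", "-o", str(bam), str(sam)],
        # Transcript assembly; guiding with the reference annotation is assumed.
        ["stringtie", str(bam), "-G", str(annotation), "-o", str(gtf)],
    ]


if __name__ == "__main__":
    for cmd in build_commands("vida_control", "vida_control.fastq.gz",
                              "AGATCGGAAGAGC", "iwgsc_v2.1", "iwgsc_v2.1.gff3"):
        print(shlex.join(cmd))
```

Building the commands as lists keeps the parameters inspectable before anything is executed; each list could then be passed to subprocess.run for every sample.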

In silico long noncoding RNA identification

Potential lncRNAs were computationally identified by sequentially eliminating sequences with known characteristics of other types of RNAs, such as coding mRNAs and non-coding small RNAs, as described earlier71,112. Briefly, as the first step, all known Triticum aestivum tRNAs (5290 sequences), snoRNAs (3435 sequences), snRNAs (1301 sequences) and rRNAs (289,109 sequences) were obtained from RNAcentral v22 (The RNAcentral Consortium, 2018) and compared to the transcripts using BLAST. A custom Python 3 script was used to eliminate sequences matching T. aestivum tRNAs, snoRNAs, snRNAs, and rRNAs, resulting in 16,155 remaining RNA sequences. The obtained sequences were subjected to BLAST analysis against all four RNA-seq datasets using standalone BLAST 2.13.0+ (blastn tool, e-value 1E-10)113. Transcripts that exhibited a query coverage or subject coverage of more than 10% with a known RNA were removed, along with transcripts shorter than 200 nucleotides, using a custom Python script. Subsequently, TransDecoder v5.5.0 was used to calculate the potential sizes of ORFs within the transcript sequences, and transcripts with ORFs larger than 100 nucleotides were discarded. As mentioned in our previous paper114, this threshold aligns with the widely accepted understanding that lncRNAs generally lack long ORFs, with most containing short ORFs115,116,117. The protein-coding potential of the transcripts was predicted using CPC2 (version 1.0.1) and CPAT (version 3.0.4). For CPAT training, a hexamer table and logit model were created using T. aestivum non-coding sequences downloaded from NONCODEv6 (http://www.noncode.org/download.php), T. aestivum Chinese Spring reference annotation v2.1 high-confidence coding sequences, and the aforementioned small RNA sequences, which comprised a total of 16,155 clean known Triticum RNAs. Noncoding predictions from CPAT were accepted if their probability was below a cutoff of 0.00001.
To identify any remaining similarity to coding sequences, homology searches were conducted between candidate lncRNAs and Chinese Spring reference annotation v2.1 high-confidence coding sequences using BLAST 2.13.0+ (blastn tool, e-value 1E-10), and transcript sequences that exhibited a query coverage of more than 20% with a coding sequence were eliminated. As the final step, a custom Python 3 script was used to consolidate the outputs of these individual steps into a hierarchical elimination process. The outputs of this script were considered ‘clean’ putative lncRNAs. A sample script incorporating all individual steps is provided as Supplementary File S1. For analyses that assume and explore conservation, which can be considered as ‘homology’, putative lncRNAs were clustered using CD-HIT118,119 at a sequence identity threshold of 90% (-c 0.9 -n 8 -d 0).
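
The hierarchical elimination described above can be illustrated with a minimal Python sketch. It is not Supplementary File S1: the Transcript record and its field names are hypothetical, and the per-transcript values are assumed to have been collected beforehand from the BLAST, TransDecoder, CPC2 and CPAT outputs. Only the thresholds are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    """Hypothetical per-transcript summary of the upstream tool outputs."""
    name: str
    length: int                 # transcript length (nt)
    known_rna_coverage: float   # best coverage vs. tRNA/snoRNA/snRNA/rRNA (%)
    longest_orf: int            # longest predicted ORF (nt)
    cpc2_noncoding: bool        # True if CPC2 labels the transcript noncoding
    cpat_probability: float     # CPAT coding probability
    cds_query_coverage: float   # best query coverage vs. HC coding sequences (%)


def first_failed_filter(t: Transcript) -> Optional[str]:
    """Return the first filter a transcript fails, or None if it passes all."""
    checks = [
        (t.known_rna_coverage > 10.0, "matches a known small/structural RNA"),
        (t.length < 200, "shorter than 200 nt"),
        (t.longest_orf > 100, "ORF longer than 100 nt"),
        (not t.cpc2_noncoding, "CPC2 predicts coding"),
        (t.cpat_probability >= 0.00001, "CPAT probability not below cutoff"),
        (t.cds_query_coverage > 20.0, "more than 20% coverage with a CDS"),
    ]
    for failed, reason in checks:
        if failed:
            return reason
    return None


def putative_lncrnas(transcripts):
    """Keep only the transcripts that survive every elimination step."""
    return [t for t in transcripts if first_failed_filter(t) is None]
```

Returning the reason for rejection, rather than a bare boolean, makes it straightforward to tabulate how many transcripts each step removes.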

To assess conservation of lncRNAs, predicted lncRNAs were compared against T. aestivum lncRNAs from PLncDB v2.0 (https://www.tobaccodb.org/plncdb/)120 and T. aestivum, T. dicoccoides and T. turgidum lncRNAs from CANTATA db3.0 (http://yeti.amu.edu.pl/CANTATA/)78,121 using blastn (e-value 1E-10). Hits with at least 70% sequence identity and at least 70% lncRNA coverage were considered significant matches to the lncRNAs from the databases.
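
As a hedged illustration of the identity and coverage thresholds, the sketch below filters blastn hits in the standard 12-column tabular format (-outfmt 6). It assumes that coverage refers to the query lncRNA and is computed from a single alignment rather than from merged alignments; the function name and the query_lengths dictionary are hypothetical.

```python
def conserved_lncrnas(blast_lines, query_lengths,
                      min_identity=70.0, min_coverage=70.0):
    """Return query IDs with at least one hit passing both thresholds.

    blast_lines: iterable of tab-separated lines in BLAST -outfmt 6 order
    (qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore). query_lengths maps each query ID to its length in nt.
    """
    conserved = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        qseqid = fields[0]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        # Simplification: coverage of the query by this single alignment.
        coverage = 100.0 * (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if pident >= min_identity and coverage >= min_coverage:
            conserved.add(qseqid)
    return conserved
```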

In silico microRNA identification

Small RNA sequencing data from control and infected samples of the Vida and Hank varieties were used for miRNA identification. sRNA-seq reads were filtered for adapter sequences and low-quality bases using CutAdapt (adapter_min_overlap = 10 -m 18 -M 34 -q 20 --max-n 0)106. Unique sRNA-seq reads with at least 10 counts were processed further for de novo miRNA identification using mirMachine in the novel miRNA identification mode (--sRNAseq -lmax 24 -lmin 19 -rpm 0)72.
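
The read-collapsing step can be sketched in a few lines of Python. This is a minimal illustration, assuming the trimmed reads are already available as plain sequence strings; the function name is hypothetical and the sketch does not reproduce mirMachine's own preprocessing.

```python
from collections import Counter


def collapse_reads(reads, min_count=10):
    """Collapse identical reads and keep unique sequences seen >= min_count times."""
    counts = Counter(reads)
    return {seq: n for seq, n in counts.items() if n >= min_count}
```

For example, a sequence observed 12 times is retained with its count, whereas one observed 9 times is dropped before miRNA prediction.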

miRNA precursor prediction from putative lncRNA transcripts

Due to well-known links between miRNAs and lncRNAs, potential miRNA precursor sequences within lncRNAs were sought using the mirMachine homology pipeline, with putative lncRNA sequences as input and 10,414 Viridiplantae mature miRNA sequences retrieved from miRBase (version 22.1) as queries, allowing zero mismatches71,72. A custom Python script was then used to determine miRNA and lncRNA target multiplicity, that is, the number of miRNA sequences targeting each lncRNA transcript and the number of lncRNA transcripts targeted by each miRNA.
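
Target multiplicity as defined above can be computed from a list of (miRNA, lncRNA) pairs. The sketch below is a hypothetical stand-in for the custom script, which is not reproduced in the text; the input format and function name are assumptions.

```python
from collections import defaultdict


def target_multiplicity(pairs):
    """Count distinct partners on both sides of miRNA-lncRNA interactions.

    pairs: iterable of (mirna_id, lncrna_id) tuples; duplicates are ignored.
    Returns (mirnas_per_lncrna, lncrnas_per_mirna) as plain dictionaries.
    """
    mirnas_by_lncrna = defaultdict(set)
    lncrnas_by_mirna = defaultdict(set)
    for mirna, lncrna in pairs:
        mirnas_by_lncrna[lncrna].add(mirna)
        lncrnas_by_mirna[mirna].add(lncrna)
    mirnas_per_lncrna = {k: len(v) for k, v in mirnas_by_lncrna.items()}
    lncrnas_per_mirna = {k: len(v) for k, v in lncrnas_by_mirna.items()}
    return mirnas_per_lncrna, lncrnas_per_mirna
```

Using sets means that a pair reported more than once is counted a single time.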

Identification of lncRNA and mRNA targets of identified miRNAs

Putative mature miRNA sequences were subjected to target analysis using the psRNATarget tool122. This analysis used the identified lncRNA sequences as targets for miRNA binding prediction, with a maximum unpaired energy (Max UPE) of 25 and an expectation value (Expectation) of 3. For mRNA target analysis of the identified miRNAs, the Chinese Spring RefSeq v2.1 HC CDS dataset was used (available at http://wheat-urgi.versailles.inra.fr/Seq-Repository, accessed on 18 September 2023). This dataset was supplied as the target sequences with user-defined options in the psRNATarget tool, again with a maximum UPE of 25 and an expectation value of 3. To bridge the gap between coding sequences and protein sequences, the identified target transcript sequences were aligned with BLAST (blastx, e-value 1E-05) against the protein sequences of the Chinese Spring RefSeq v2.1 genome (available at http://wheat-urgi.versailles.inra.fr/Seq-Repository, accessed on 18 September 2023). The identified target proteins were then annotated using a database of Triticum aestivum proteins obtained from UniProt (available at https://www.uniprot.org/uniprotkb?query=triticum+aestivum, accessed on 20 September 2023). Additionally, Gene Ontology (GO) annotations for the target proteins were assigned using the eggNOG mapper server (http://eggnog-mapper.embl.de/), followed by GO enrichment using the agriGO server (https://systemsbiology.cau.edu.cn/agriGOv2/)123,124.
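
To show how the two thresholds and a comparison between conditions might be applied to tabulated predictions, a minimal sketch follows. psRNATarget applies the Expectation and UPE limits itself, so the filter merely mirrors them; the dictionary keys (mirna, target, expectation, upe) and function names are hypothetical and do not correspond to the tool's actual output columns.

```python
def passes_thresholds(pred, max_expectation=3.0, max_upe=25.0):
    """True if a predicted miRNA-target pair meets both cutoffs."""
    return pred["expectation"] <= max_expectation and pred["upe"] <= max_upe


def target_set(predictions):
    """Distinct target IDs among predictions that meet the cutoffs."""
    return {p["target"] for p in predictions if passes_thresholds(p)}


def infected_exclusive_targets(infected, control):
    """Targets predicted in the infected sample but absent from the control."""
    return target_set(infected) - target_set(control)
```

A set difference of this kind is one simple way to list targets that appear only in infected samples for a given variety.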