Introduction

Orthosiphon aristatus (Blume) Miq., commonly known as kidney tea or “cat’s whiskers”, is a perennial herb of the Lamiaceae family. It is widespread in Southeast Asia and Australia1. In China, O. aristatus serves as both a traditional Chinese medicine and an ethnic herbal remedy, especially among minority groups such as the Dai, Miao, Wa, and Lahu2. The plant is widely used to treat diabetes, urinary tract infections, and other ailments3. Recent studies have highlighted that polyherbal supplements4 and nanoformulations containing O. aristatus show promise as treatments for diabetes5,6. Extracts of O. aristatus exhibit a spectrum of pharmacological effects, including anti-diabetic and anti-urolithiasis activity, protection against urinary tract infections, and promotion of tissue repair7,8. Phytochemical analyses have identified various constituents in O. aristatus, including flavonoids, phenolic acids, fatty acids, and terpenoids9,10. Studies of partitioned extracts suggest that bioactivity varies with the chemical composition of each extract. Among the identified compounds, flavonoids stand out as key active substances with diuretic, antioxidant, and anti-nephritic properties11,12. In particular, sinensetin and eupatorin are two flavonoids that have been extensively studied for their wide range of pharmacological activities, including anti-inflammatory, vasodilatory, anti-proliferative, hepatoprotective, analgesic, and anti-diabetic effects13,14,15,16. Despite their importance, flavonoids are present at low levels in the medicinal part of O. aristatus, and increasing their yield remains challenging.

Numerous research efforts have been directed towards regulating the quality of O. aristatus and increasing its antioxidant content17,18,19,20. Notably, the development of tissue culture techniques has enabled an increase in the production of active secondary metabolites in this plant21. However, two major gaps remain. First, the biosynthetic processes and transcriptional regulatory mechanisms of secondary metabolites in different tissues of kidney tea remain poorly understood. High-quality transcriptome data are essential not only to improve genomic resources but also to facilitate genetic and molecular breeding strategies for metabolic regulation of medicinal plants22,23,24. Second, most studies have focused on the leaves of O. aristatus, while the potential of its roots is often overlooked and the roots are typically discarded during processing. Contrary to this practice, emerging studies suggest that the non-medicinal parts of O. aristatus possess considerable potential. For example, the roots have been shown to contain higher concentrations of total polyphenols and flavonoids than the leaves25. In addition, roots and stems exhibit superior free radical scavenging abilities compared to leaf tissues. The morphology of roots and stems appears to influence the accumulation of bioactive compounds such as rosmarinic acid and flavonoids26. It is therefore increasingly important to elucidate the differences in the expression of key enzyme genes among the three tissues (roots, stems, and leaves). Such research would significantly advance our understanding of the molecular mechanisms that control tissue-specific synthesis of flavonoids.

Comparative transcriptome analysis is a widely employed strategy to identify crucial enzyme genes involved in the biosynthetic pathways of active ingredients in traditional Chinese medicine27,28. By analyzing transcriptome data, researchers have uncovered extensive genetic information about medicinal herbs and elucidated key metabolic pathways for their important active ingredients29. Recently, Du and colleagues characterized the chloroplast genome of O. aristatus and identified closely related candidate species that may serve as promising sources of renal function-improving drugs30. Despite these advances, the biosynthetic regulatory mechanisms in O. aristatus, particularly those for flavonoids, remain elusive.

In this study, we quantified four flavonoids in three tissues of O. aristatus: leaves (L), stems (S), and roots (R). RNA sequencing (RNA-seq) was used to sequence the kidney tea transcriptome, annotate functional genes, and identify key enzyme genes involved in flavonoid biosynthesis. Differentially expressed gene (DEG) analysis was performed across these tissues. We also performed Gene Ontology (GO) functional annotation, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, and characterization of transcription factor (TF) families in O. aristatus. In addition, quantitative real-time PCR (qRT-PCR) was used to validate the expression levels of DEGs associated with flavonoid biosynthesis. This approach aims to elucidate the molecular mechanisms underlying flavonoid biosynthesis and to provide genetic insights for the future development and exploitation of O. aristatus.

Results

Quantitative analysis and tissue-specific distribution of flavonoids in O. aristatus

Four flavonoids and rosmarinic acid were isolated and structurally characterized, and their standard curves were plotted (see Supporting Information, Table S1 and Figs. S1 and S2). Quantification of the four flavonoids revealed significant differences in their concentrations across the three organs (Fig. 1). Notably, the roots contained higher levels of GQU, EUP, and SEN than the stems. We therefore used transcriptome sequencing to investigate the molecular basis of these differences.

Fig. 1

(A) Root, stem, and leaf tissues of O. aristatus. (B) Content of four flavonoids: 5’-glucose-quercetin (GQU), eupatorin (EUP), 3’-hydroxy-5,6,7,4’-tetramethoxyflavone (TMF), and sinensetin (SEN). Different letters (a–c) on the bar plots indicate significant differences at p < 0.05 (one-way analysis of variance with Duncan’s multiple-range test).

Illumina sequencing and de novo assembly of the O. aristatus transcriptome

Comparative transcriptome analysis of kidney tea roots, stems, and leaves was conducted via RNA sequencing (Tables 1 and 2). After sequencing, adapter trimming, and removal of low-quality reads, a high-quality library was obtained, with clean reads ranging from 42.28 M to 42.58 M and a Q30 value of 93.22% (Table S2). The GC content of all samples fell within 42.03–42.32%. These high-quality reads are suitable for downstream analyses. Principal component analysis (PCA) showed that the three tissues were clearly separated and that the three biological replicates of each tissue clustered together, indicating good repeatability among replicates (Fig. S3).

Following transcript trimming and the elimination of redundant sequences, a final set of 138,294 Unigenes was obtained, with an average length of 1251 base pairs and an N50 of 1866 base pairs. The length distribution of the Unigenes is illustrated in Fig. S4. These Unigenes form the basis for the subsequent functional annotation and differentially expressed gene (DEG) analysis.
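The N50 quoted above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The following is a minimal Python sketch of that calculation. The function name `n50` and the toy lengths are hypothetical and are not taken from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of the assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of all bases now covered
            return length


# Toy lengths (hypothetical values, for illustration only)
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(toy_lengths))
```

Because a few long sequences can cover half of the bases, N50 is usually larger than the mean length, which matches the pattern reported here (N50 of 1866 bp against a mean of 1251 bp).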

Table 1 Summary statistics of transcriptome sequencing data for root, leaf, and stem of O. aristatus.
Fig. 2

(A) Venn diagram illustrating the annotation of Unigenes in the top 5 databases. (B) Similarity comparison of O. aristatus Unigenes with those of other species.

Table 2 Annotation of O. aristatus Unigenes in seven public databases.

Unigene annotation and functional classification across databases

The statistics for the best BLAST hits of Unigenes in various databases are presented in Table S3. The 138,294 Unigenes were aligned against seven major functional databases: KEGG, GO, NR, NT, SwissProt, Pfam, and KOG. Specifically, the NR database annotated 107,244 Unigenes (77.55%), the NT database 83,122 (60.11%), the KOG database 85,464 (61.80%), and the KEGG database 84,763 (61.29%). In total, 110,439 Unigenes (79.86%) were functionally annotated in at least one database, and 45,415 (32.84%) were commonly annotated across the databases. A detailed comparison of Unigenes annotated in five of the databases is shown in Fig. 2A.
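The annotation rates above are each count divided by the 138,294 Unigenes. As a quick arithmetic check, the short Python sketch below recomputes them from the counts given in the text. The dictionary keys and the helper name `percent` are our own labels.

```python
TOTAL_UNIGENES = 138_294

# Counts as reported in the text; keys are illustrative labels only
annotated = {
    "NR": 107_244,
    "NT": 83_122,
    "KOG": 85_464,
    "KEGG": 84_763,
    "at least one database": 110_439,
    "commonly annotated": 45_415,
}


def percent(count, total=TOTAL_UNIGENES):
    """Annotation rate as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


for name, count in annotated.items():
    print(f"{name}: {count} Unigenes ({percent(count):.2f}%)")
```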

The Unigene sequences of kidney tea were compared against the NR database. The species with the highest share of annotations was Salvia splendens, a member of the Lamiaceae family, accounting for 44.17%, followed by Sesamum indicum (20.6%) and Handroanthus impetiginosus (17.18%) (Fig. 2B). This indicates high sequence homology between kidney tea and Salvia splendens, a fellow member of the Lamiaceae.

A classification of the 84,763 Unigenes was performed using GO annotation (Fig. 3), categorizing them into three major classes: biological process (BP), cellular component (CC), and molecular function (MF). Based on sequence homology, the categories that represent the most abundant annotations in BP, CC, and MF are as follows: Biological Process (BP): “Cellular anatomical entity” (75,858 Unigenes); Cellular Component (CC): “Metabolism” (44,740 Unigenes); Molecular Function (MF): “Interspecies interaction between organisms” (407 Unigenes).

Fig. 3

Gene Ontology (GO) annotation results for O. aristatus Unigenes.

KEGG pathway analysis of unigenes in O. aristatus

KEGG analysis annotated a total of 84,763 Unigenes (61.29%) to metabolic pathways, which fall into 5 major classes and 19 subcategories. The Unigenes were distributed across the major classes as follows: Cellular Processes, 3,689; Environmental Information Processing, 5,868; Metabolism, 44,740; Genetic Information Processing, 17,044; and Organismal Systems, 3,714. Within these pathways, 2,236 Unigenes were further annotated to 15 secondary metabolite biosynthesis pathways. The most highly represented of these included phenylpropanoid biosynthesis (ko00940, 362 Unigenes), flavonoid biosynthesis (ko00941, 97 Unigenes), isoflavonoid biosynthesis (ko00943, 42 Unigenes), and flavone and flavonol biosynthesis (ko00944, 26 Unigenes). Based on the transcriptome data and KEGG pathway analysis, the biosynthesis pathways of flavonoid components such as naringin, prunin, and kaempferol were inferred. Specifically, the phenylpropanoid biosynthesis pathway (ko00940) and the flavonoid biosynthesis pathway (ko00941) involved a total of 66 Unigenes encoding 8 key metabolic enzymes (Fig. 4, Table S4).

Fig. 4

(A) KEGG annotation results for O. aristatus Unigenes. (B) Other metabolic pathways.

Differential gene expression analysis

Fig. 5

(A) Venn diagram and (B) bar chart depicting the pairwise comparison of Differentially Expressed Genes (DEGs) in different tissues.

Fig. 6

DEGs significantly enriched in KEGG pathways. (A–C) Significantly enriched pathways among the DEGs identified in the comparisons of roots vs. stems (A), roots vs. leaves (B), and leaves vs. stems (C).

The DESeq software package was used to identify differentially expressed genes (DEGs) with a fold change > 2 and a q-value < 0.05 (Fig. 5A). A total of 5,897 DEGs were shared across all comparison groups (S vs. R, L vs. S, L vs. R) (Fig. 5B). The highest number of DEGs was observed for L vs. R (upregulated: 24,872; downregulated: 14,390) and the lowest for L vs. S (upregulated: 18,512; downregulated: 6,291). All DEGs underwent KEGG enrichment analysis to identify genes potentially associated with the differential distribution of flavonoids in kidney tea (Fig. 6). Between roots and leaves, the top 5 enriched pathways were Photosynthesis (250 genes), Flavonoid biosynthesis (223 genes), Porphyrin and chlorophyll metabolism (198 genes), Isoflavonoid biosynthesis (69 genes), and Flavone and flavonol biosynthesis (39 genes) (Fig. 6A). Between stems and roots, most DEGs were enriched in Photosynthesis (234 genes), Flavonoid biosynthesis (212 genes), Isoflavonoid biosynthesis (68 genes), Flavone and flavonol biosynthesis (40 genes), and Benzoxazinoid biosynthesis (26 genes) (Fig. 6B). Between leaves and stems, the top 5 enriched pathways were Ribosome, Photosynthesis-antenna proteins, Photosynthesis, Plant hormone signal transduction, and Phenylpropanoid biosynthesis (Fig. 6C). Phenylpropanoid and flavonoid biosynthesis were among the secondary metabolic pathways with the most DEGs between tissues, particularly between roots and aerial parts.
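The DEG call described above combines a fold-change cutoff with a q-value cutoff. The Python sketch below illustrates that filtering step only; it does not reproduce DESeq. The function name `classify_deg` is hypothetical, and treating a fold change below 1/2 as downregulation (a symmetric cutoff on the log2 scale) is our assumption rather than a detail stated in the text.

```python
import math


def classify_deg(fold_change, q_value, min_fold=2.0, max_q=0.05):
    """Label a gene 'up', 'down', or None.

    Assumes a symmetric cutoff: |log2(fold change)| must exceed
    log2(min_fold) and the q-value must be below max_q.
    """
    if fold_change <= 0 or q_value >= max_q:
        return None
    log2_fc = math.log2(fold_change)
    cutoff = math.log2(min_fold)
    if log2_fc > cutoff:
        return "up"
    if log2_fc < -cutoff:
        return "down"
    return None


# Hypothetical (gene ID, fold change, q-value) records for illustration
records = [
    ("gene_a", 4.0, 0.01),   # strong, significant increase
    ("gene_b", 0.2, 0.01),   # strong, significant decrease
    ("gene_c", 4.0, 0.20),   # large change but not significant
    ("gene_d", 1.5, 0.001),  # significant but below the fold cutoff
]
for gene_id, fc, q in records:
    print(gene_id, classify_deg(fc, q))
```

Working on the log2 scale keeps the up and down criteria mirror images of each other, which is why the sketch converts the fold change before comparing.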

Candidate genes involved in flavonoid biosynthetic pathway

Integrating transcriptomic data with KEGG pathway analysis, we delineated the flavonoid biosynthetic pathways operating in O. aristatus tissues. Across two primary pathways, phenylpropanoid biosynthesis (ko00940) and flavonoid biosynthesis (ko00941), 66 Unigenes were identified, each encoding one of eight key metabolic enzymes (see Table 3 and Supplementary Table S5). In the phenylpropanoid pathway, the annotated enzymes were phenylalanine ammonia-lyase (PAL; 8 Unigenes), 4-coumarate-CoA ligase (4CL; 20 Unigenes), and trans-cinnamate 4-monooxygenase (CYP73A; 5 Unigenes). Downstream, in the flavonoid pathway, the enzymes required for the synthesis of flavanones such as naringin and of the flavonol kaempferol comprised chalcone synthase (CHS; 5 Unigenes), chalcone isomerase (CHI; 14 Unigenes), flavonoid 3’-hydroxylase (F3H; 2 Unigenes), flavonol synthase (FLS; 7 Unigenes), and C12RT1 (5 Unigenes). These findings link genes to metabolites and highlight potential regulatory nodes that could be exploited biotechnologically to enhance secondary metabolite production in O. aristatus.

Table 3 Summary of flavonoid biosynthesis Unigenes in three O. aristatus tissues.

These enzymes are crucial for flavonoid production in O. aristatus. Using homology search and functional annotation, we identified 56 structural enzyme Unigenes potentially regulating flavonoid biosynthesis in the phenylpropanoid and flavonoid biosynthesis pathways (Fig. 7).

A total of 38 Unigenes encoding four structural enzymes in the p-coumaric acid pathway were identified. Expression analysis revealed that 4CL was expressed in all three tissues, while PAL and CYP73A were more highly expressed in the roots and CHS in the stems. Additionally, we identified 28 Unigenes encoding four enzymes involved in the flavonoid biosynthesis pathway. Among these, CHI was expressed in all three tissues, C12RT1 and F3H were highly expressed in the roots and stems, and FLS was expressed in the leaves. These expression patterns are broadly consistent with flavonoid accumulation, suggesting that these 28 Unigenes are important for flavonoid biosynthesis. Interestingly, no correlation was found between F3H expression and flavonoid accumulation. These results provide insights into the expression patterns and potential roles of these enzymes in the biosynthesis of flavonoids, particularly naringin, prunin, and kaempferol, in O. aristatus.

Fig. 7

Analysis of the Flavonoid Biosynthetic Pathway in O. aristatus.

Transcription factors (TFs) analysis of DEGs

A total of 5150 genes were annotated into 58 TF families. The MYB family was the largest, with 617 genes, followed by AP2-EREBP (464), bHLH (408), WRKY (352), NAC (241), and C2H2 (144) (Tables S6 and S7). TF prediction on the DEGs among tissues identified 4452, 4527, and 4316 TFs in R, S, and L, respectively (Fig. 8A). Among them, MYB (713) was the most dominant, followed by AP2-EREBP (565), C2H2 (bHLH, 520), and WRKY (432) (Tables S6 and S7). KEGG classification of all TFs identified 26 TFs involved in flavonoid metabolism, which were annotated into 9 TF families, with the largest number in the MYB family. Cluster analysis showed that their expression levels were higher in roots and leaves (Fig. 8B). Notably, the above-mentioned TF families were most abundant in L and R. These results show that TFs involved in flavonoid biosynthesis were expressed at significantly different levels in different tissues of O. aristatus (Fig. S5). The roots had the highest expression levels of these TFs, suggesting that they are the main site of flavonoid biosynthesis (Fig. 8C, D). The higher expression in roots of TFs involved in nitrogen metabolism and oxidative stress response suggests that these TFs play an important role in regulating flavonoid biosynthesis in response to environmental stresses.

Fig. 8

(A) Quantitative comparative Analysis of DEGs and TFs in different tissues. (B) Radar chart depicting pairwise comparisons of TFs across different tissues. (C) Classification of transcription factors involved in flavonoid biosynthesis. (D) The expression levels of each transcription factor involved in flavonoid biosynthesis.

Validation of DEGs via qRT-PCR analysis

To validate the kidney tea RNA-seq data, qRT-PCR analysis was conducted on 8 selected Unigenes encoding key enzymes involved in flavonoid biosynthesis (Fig. 9, Tables S8 and S9). The qRT-PCR expression levels of three Unigenes, namely CHI (CL6566.Contig3), CHS (CL12071.Contig1), and FLS (CL2421.Contig4), were consistent with the RNA-seq results, albeit with slight differences in fold change. Furthermore, qRT-PCR confirmed that the relative expression levels of 4CL, FLS, and C12RT1 were higher in leaves than in other tissues, while the expression levels of CYP73A and PAL were significantly higher in roots than in other tissues.

Fig. 9

Results of PCR Enrichment for Key Genes (A-H) and their FPKM Trend Curves.

Discussion

O. aristatus, a medicinal plant with a long history, is cultivated for its dried leaves. This plant is known to contain antioxidants such as flavonoids and rosmarinic acid, with flavonoids being key quality indicators31. These compounds exhibit both antioxidant and anti-inflammatory properties, which are beneficial in the treatment of chronic kidney disease32. The chemical composition of O. aristatus varies among its tissues33. Recent studies have revealed that the roots, previously considered non-medicinal, also possess significant medicinal potential25. In our study, we observed a differential distribution of flavonoid content across tissues, with roots exhibiting higher flavonoid abundance than stems and the highest accumulation of GQU (Fig. 1). These findings corroborate the robust antioxidant capacity of the roots reported in previous studies. Knowledge of this distribution pattern will help to clarify the regulatory mechanisms of flavonoids in O. aristatus.

The transcriptome, which represents the complete set of RNA transcripts in a cell at a specific physiological state or developmental stage, has emerged as a powerful tool for identifying differentially expressed genes (DEGs) and unraveling the molecular mechanisms underlying their expression34. Numerous studies have demonstrated that flavonoid content is significantly influenced by environmental factors such as light and temperature, as well as by growth conditions and water availability35,36. Additionally, the flavonoid content in plants is modulated by biosynthesis, degradation, reductive recovery from oxidized forms, and transport processes37. In this study, we aimed to elucidate the molecular mechanisms governing the spatial accumulation of flavonoids in O. aristatus. To this end, we sequenced the transcriptomes of three distinct tissues of O. aristatus, focusing on structural genes encoding enzymes and on regulatory transcription factors that are pivotal in the regulation of flavonoid biosynthesis. Our analysis revealed a total of 5,897 DEGs associated with metabolic pathways that were common across all three tissues of O. aristatus (Fig. 5).

Flavonoid synthesis is regulated in at least two ways: (1) at the level of flavonoid scaffold biosynthesis and (2) at the level of post-modification of flavonoid compounds38. Previous studies have suggested that in the flavonoid backbone biosynthesis pathway, the rate of metabolic flux is controlled by CHI and FLS39. Specifically, CHI catalyzes the conversion of chalcone to flavanone40, and FLS directs the synthesis of flavonol compounds41. Furthermore, C12RT1 is crucial for flavonoid glucosylation in post-modification processes42. In this study, we identified 14 Unigenes encoding CHI in the kidney tea transcriptome, of which 13 were found in roots and 10 were highly expressed there, and 7 Unigenes encoding FLS, 2 of which were highly expressed in roots. In addition, 3 of the 5 Unigenes encoding C12RT1 were abundant in roots. This is consistent with previous research41, suggesting that the genes involved in flavonoid synthesis may account for the observed differences in accumulation levels across the tissues of O. aristatus.

Transcription factors (TFs), usually DNA-binding proteins, regulate gene expression through a variety of mechanisms in response to environmental changes, developmental processes, stress, and pathogen attack. Flavonoid biosynthesis is regulated by single TFs or multi-TF complexes, including members of the MYB, bHLH, bZIP, and WD40 classes43. Zhai et al. showed that overexpression of FtMYB6 in Tartary buckwheat hairy roots significantly increased the accumulation of flavonols by activating FtF3H and FtFLS1 expression44, and Gharari et al. induced the overexpression of MYB7 and FNSП2 genes, which enhanced the production of three flavonoid compounds in hairy root cultures of Scutellaria bornmuelleri45. MYB and bHLH form complexes to regulate diverse metabolic and developmental processes in plants46,47. Li et al. found that co-expression of AcB2 (which belongs to the TT8 branch of the bHLH IIIf subgroup) with AcMYB1 in Arabidopsis thaliana protoplasts dramatically increased the expression of AcANS and AcF3H1 compared with expression of AcMYB1 alone47. Consistent with previous studies45,47, our results indicate that MYB and bHLH are more abundant in root tissues than in other tissues, suggesting that TFs involved in flavonoid synthesis may also account for the observed differences in accumulation levels across the tissues of O. aristatus.

Conclusions

In this study, we performed transcriptome sequencing of the O. aristatus root, leaf and stem tissues using the Illumina sequencing platform. This generated a substantial number of Unigenes, which provided a solid genetic basis for the identification of key genes and secondary metabolic pathways in this species. We focused on identifying the genes and Differentially Expressed Genes (DEGs) involved in the flavonoid biosynthesis pathway. Notably, we observed that high expression levels of certain genes, including CHI, C12RT1, F3H, and FLS, may be the primary contributors to the differences in flavonoid content between kidney tea root and other tissues. To further validate our findings, we performed qRT-PCR analysis, confirming the expression patterns of key enzyme genes related to flavonoid biosynthesis. In conclusion, our study identifies DEGs in root tissues that are crucial for flavonoid biosynthesis, potentially enhancing our understanding of the molecular basis of this pathway in kidney tea and contributing to the broader knowledge of plant secondary metabolism.

Methods

Plant materials and processing

The materials (roots, leaves, and stems) of O. aristatus were collected from Xindongtou Village, Yangchun City, Guangdong Province (2194’88.84” N, 11210’9.65” E, 460 m), and identified by Professor Bangxing Han (West Anhui University). Voucher specimens were deposited in the Natural Medicinal Chemistry Laboratory at West Anhui University (voucher number: 22CY0015). The plant samples were washed with ultra-pure water, and the leaf, root, and stem tissues were separated. Half of the material was quickly frozen in liquid nitrogen and stored at −80 ℃. The other half was dried to constant weight at 55 ℃ for determination of rosmarinic acid and the four flavonoid compounds.

Measurement of the content of four flavonoids

For ultrasonic extraction, we precisely weighed 2 g of O. aristatus powder and combined it with 25 mL of 75% ethanol (v/v) in a 25 mL volumetric flask. The mixture was sonicated for 120 min and then allowed to cool to ambient temperature. After cooling, the solution was filtered through a 0.22 μm Millipore filter and brought to the required volume for high-performance liquid chromatography (HPLC) analysis. To ensure accuracy, this procedure was performed in triplicate. HPLC analysis used an Agilent 1100 system (Agilent Technologies, Palo Alto, CA) equipped with a Sepax RP-C18 column (Amethyst C18-H, 4.6 mm × 250 mm, 5 μm particle size; Sepax Technologies, Suzhou, China) maintained at 30 °C. The mobile phase consisted of 0.1% acetic acid (v/v) in water and methanol. UV detection was performed at 360 nm. The injection volume was 10 µL and the flow rate was 1 mL/min. The precision, accuracy, specificity, system suitability, peak purity, and robustness of the HPLC method were verified using previously described procedures. The contents (mg g-1) of the four flavonoids, 5’-glucose-quercetin (GQU), eupatorin (EUP), 3’-hydroxy-5,6,7,4’-tetramethoxyflavone (TMF), and sinensetin (SEN), were calculated from the standard curves of the corresponding reference standards. One-way analysis of variance was performed in GraphPad Prism 9 (www.graphpad.com) to compare the contents of rosmarinic acid and the four flavonoid compounds among the three tissues.
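Quantification against a standard curve can be sketched as follows: a straight line is fitted to the peak areas of standards of known concentration, the sample concentration is read back from its peak area, and the result is scaled to mg per g of powder. This is a minimal Python illustration under stated assumptions. It assumes a linear detector response, uses the 25 mL extract volume and 2 g sample mass given in the text, and all standard concentrations, peak areas, and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def content_mg_per_g(peak_area, slope, intercept, extract_volume_ml, sample_mass_g):
    """Convert a sample peak area into content (mg per g of dried powder)."""
    conc_mg_per_ml = (peak_area - intercept) / slope  # invert the calibration line
    return conc_mg_per_ml * extract_volume_ml / sample_mass_g


# Hypothetical calibration standards: concentration (mg/mL) vs. peak area
std_conc = [0.01, 0.02, 0.04, 0.08]
std_area = [100.0, 200.0, 400.0, 800.0]
slope, intercept = fit_line(std_conc, std_area)

# Hypothetical sample peak area; 25 mL extract and 2 g powder as in the text
print(content_mg_per_g(300.0, slope, intercept, extract_volume_ml=25.0, sample_mass_g=2.0))
```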

RNA extraction, cDNA library preparation and RNA-seq analysis

Total RNA was extracted from each tissue according to the instructions of the plant RNA kit (Agilent RNA 6000 Nano Kit), and the concentration, purity, and integrity of the RNA were assessed with an Agilent 2100 and a NanoDrop. Total RNA that met the sequencing requirements was used to construct cDNA libraries according to the MAKSST48 method. Paired-end cDNA sequencing was performed on an Illumina NovaSeq™ 6000 (LC Sciences, USA) according to the protocol recommended by our supplier, Wuhan Huada Company.

De novo assembly, unigenes annotation and functional classification

Raw data were filtered with SOAPnuke49 (v1.4.0) to obtain clean reads. Trinity50 (v2.0.6) was then used for de novo assembly, and Tgicl51 was used to cluster the assembled transcripts into Unigenes. NCBI BLAST (v2.2.23) was used to align the Unigenes against the nucleotide sequence database (NT) and the protein databases (NR, KEGG, KOG, SwissProt, Pfam), and Blast2GO (v2.5.0)52 was used to obtain GO annotations for the Unigenes.

Differentially expressed unigenes analysis

Differential expression analysis was used to compare gene expression among tissues. Bowtie2 (v2.2.5)53 and RSEM54 (v1.2.8) were run with default parameters to estimate gene expression levels in each sample. Differential gene analysis was performed using DESeq55 with thresholds of fold change ≥ 2 and adjusted P-value ≤ 0.001. Venn diagrams of the DEGs from pairwise organ comparisons were drawn with ClusterVenn. The DEGs were functionally classified according to the GO and KEGG annotation results and official classifications. KEGG enrichment analysis was performed with the phyper function (https://en.wikipedia.org/wiki/Hypergeometric_distribution) in R, and GO enrichment analysis with the TermFinder package (https://metacpan.org/pod/GO::TermFinder). With a Q value ≤ 0.05 as the threshold, candidate genes meeting this condition were defined as significantly enriched.
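The hypergeometric enrichment test referred to above asks how likely it is to see at least k pathway genes among n DEGs when K of the N annotated genes belong to that pathway. The Python sketch below computes that upper-tail probability with exact integer arithmetic. The function name and the toy numbers are hypothetical. In R, the corresponding call is commonly written as `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`; whether the authors used exactly this form is an assumption. Multiple-testing correction to Q values is not shown.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: annotated background genes, K: background genes in the pathway,
    n: DEGs drawn, k: DEGs that fall in the pathway.
    """
    upper = min(n, K)
    total = comb(N, n)  # all ways to choose n genes from the background
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example (hypothetical numbers): 20 genes, 5 in the pathway,
# 5 DEGs, 3 of which fall in the pathway
p_value = hypergeom_enrichment_p(k=3, n=5, K=5, N=20)
print(f"{p_value:.4f}")
```

A small p-value means the pathway holds more DEGs than random sampling would be expected to produce, which is the sense in which a pathway is called enriched.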

Analysis of transcription factor (TF)

The open reading frame (ORF) of each Unigene in the transcriptome of O. aristatus root, leaf, and stem tissues was detected with getorf (EMBOSS:6.5.7.0)56. Each ORF was then mapped to the plant TF database (PlantTFDB) by hmmsearch (v3.0)57 using BLASTX (E-value ≤ 1e-5).

qRT-PCR Analysis

Total RNA was extracted from the three tissues and reverse-transcribed into cDNA using a PrimeScript Reverse Transcriptase kit (Takara, Japan). qRT-PCR was performed using PerfectStart® Green qPCR SuperMix (TransGen Biotech, Beijing, China) on an ABI QuantStudio 1 real-time PCR instrument (Thermo Fisher Scientific, Waltham, MA, USA). Genes encoding key enzymes involved in flavonoid biosynthesis in O. aristatus were selected to verify the sequencing data by qRT-PCR, with β-tubulin as the internal reference gene58. Primer Premier (v5.0) was used to design primers (Table S10). The experiment was repeated three times.
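The text does not state how relative expression was computed from the qRT-PCR data. One widely used option, shown here purely as an assumption, is the 2^(-ΔΔCt) method, in which the target gene's Ct is normalized to the reference gene (β-tubulin in this study) and then to a calibrator tissue. The Python sketch below uses hypothetical Ct values and assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^(-ddCt) method (assumes ~100% PCR efficiency).

    ct_target / ct_reference: Ct values in the tissue of interest.
    ct_target_cal / ct_reference_cal: Ct values in the calibrator tissue.
    """
    delta_ct_sample = ct_target - ct_reference          # normalize to reference gene
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: a target gene in root, with leaf as the calibrator
fold = relative_expression(
    ct_target=22.0, ct_reference=18.0,           # root
    ct_target_cal=25.0, ct_reference_cal=18.0,   # leaf
)
print(fold)
```

With these toy values the target reaches threshold three cycles earlier in root than in leaf after normalization, which corresponds to an eightfold higher relative expression.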