Abstract
TCP (TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR) is a plant-specific transcription factor family that is closely associated with plant growth and development. To date, few studies have been conducted on the TCP family of soybean. To investigate the function of the TCP transcription factor family in soybean, we characterized GmTCP670 using overexpression and CRISPR/Cas9-edited lines. GmTCP670 had a full length of 1128 bp and encoded 375 amino acids, with no introns or untranslated regions. ExPASy prediction showed that the isoelectric point (pI) of GmTCP670 was 8.14, and its molecular weight (Mw) was 42.49 kDa. The flowering time of the GmTCP670 overexpression transgenic line (GmTCP670-OE) was delayed compared with that of the control Dongnong 50 (DN50) and the CRISPR/Cas9-edited line (GmTCP670-CC). The leaves of the GmTCP670-OE plants were larger, and the leaf type changed from the pointed leaves of the recipient parent Dongnong 50 to rounded leaves. The GmTCP670-OE plants were taller than both the parental line and the GmTCP670-CC lines, with maturity occurring 13 days later. The total amino acid, total essential amino acid, and arginine contents in the GmTCP670-CC line obtained through gene editing were higher than those in the DN50 and GmTCP670-OE lines. The GmTCP670 transcription factor recognized GGACCC elements, bound to them, and activated reporter gene expression. In addition, the GmTCP670 transcription factor interacted with the protein encoded by Glyma.18G297100 and activated reporter gene expression. This study provides basic theoretical data for future studies on the regulation of GmTCP670 transcription factors in soybean development and improvement, as well as on the underlying molecular mechanisms.
Introduction
Transcription factors are proteins that typically possess DNA-binding domains, transcription regulatory domains, oligomerization sites, and nuclear localization signals1,2. To date, several transcription factors have been identified. The TCP (TEOSINTE BRANCHED1/CYCLOIDEA/PROLIFERATING CELL FACTOR) transcription factor family was named after the first letters of three genes: TEOSINTE BRANCHED1 (TB1) from maize, CYCLOIDEA (CYC) from snapdragon, and PROLIFERATING CELL FACTORS 1 and 2 (PCF1 and PCF2) from rice3,4. It is a family of plant-specific transcription factors whose members contain conserved domains and are structurally similar5,6. TCP transcription factors are key regulators of plant growth and development and simultaneously regulate the expression of other genes7,8,9,10,11,12.
TCP transcription factors are involved in various biological processes in plants. Several recent studies have indicated that TCP genes play a role in the abiotic stress response13,14,15. Overexpression of OsTCP19 enhances plant tolerance to NaCl, whereas overexpression of OsTCP14 increases plant sensitivity to cold stress16,17. StTCP-silenced potato plants exhibit increased sensitivity to disease, suggesting that the TCP transcription factor family may play an important role in plant disease defense18. TCP transcription factors in grapes are associated with fruit development, and their expression can be inhibited by drought and water stress19. In Citrus, the expression levels of seven TCP transcription factors are significantly altered in response to drought stress20. TCP transcription factors in cassava are involved in distinct signaling pathways that contribute to the response to cold and drought stress21. Twenty-three CnTCP genes have been identified in Chrysanthemum nankingense, three of which exhibit a rapid response to cold stress22. In addition, cis-acting element analysis has shown that the PavTCP family in cherry has many light-responsive elements, and after shading treatment, the fruit color is significantly lighter, indicating that TCP transcription factors are related to shading stress23. These results suggest that the TCP transcription factor family is closely associated with plant stress responses.
Soybean originated in China and is cultivated globally. It is an important and valuable protein source for humans24. In soybean, TCP transcription factors play an important role in growth and development25,26. GmTCP and GmNLP regulate soybean root development and nodulation under various nitrogen concentrations27. GmTCP40 can bind to the GmAP1a promoter and activate its transcription, thereby promoting soybean flowering under long-day conditions28. In addition, the FLOWERING LOCUS T (FT) homologs GmFT2a and GmFT5a play a role in floral transition, whereas GmFT1a and GmFT4 suppress soybean flowering. Furthermore, GmFT7 is strongly expressed during floral transition in soybean29.
Several studies have demonstrated that TCP genes play crucial roles in the response of soybean to stress. The expression profile results indicated that GmTCPs respond to various abiotic stresses, including drought, salt, and heat, as well as to hormone treatments, such as ABA, BR, SA, and MeJA30. However, little is known about the functions and types of the TCP family in soybean.
GmTCP670 is a candidate gene related to α-subunit deletion. It was identified by map-based cloning within the fine-mapping interval of the gene responsible for deletion of the α-subunit of the sensitized protein. Among the 14 candidate genes in this interval, it was the only one annotated as a TCP family transcription factor, with the serial number Glyma20g28670 (hereafter referred to as GmTCP670). Studies have shown that the TCP transcription factor gene GmTCP670 is a 7S α-subunit deletion-related gene that may be involved in the recombination and quality improvement of hypoallergenic soybean protein caused by α-subunit deletion of the sensitized protein.
The epigenetic regulation of methylation modifications is a prominent area of research at the forefront of life science. Here, we focused on the GmTCP670 gene to verify its function and analyze its molecular mechanisms. We aimed to clarify the role of the soybean TCP family transcription factor gene GmTCP670 in subunit loss and the improvement of protein nutritional quality in ‘alpha-subunit deletion hypoallergenic soybean,’ as well as to elucidate the molecular mechanisms underlying its function. Using the soybean variety Dongnong50 (DN50), GmTCP670-OE lines, and GmTCP670-CC lines as materials, a high-throughput epigenetic analysis (whole-genome bisulfite sequencing [WGBS]) and a transcriptome analysis (RNA sequencing [RNA-seq]) were performed. In addition, a transcriptome-epigenetic association analysis of differentially expressed genes (DEGs) was performed, and the relationship between epigenetic modifications and regulation of GmTCP670 gene expression was verified. This study aimed to provide a multi-omics, comprehensive, and systematic theoretical basis for analyzing the molecular mechanisms underlying improvements in hypoallergenic soybean protein quality.
Results
Bioinformatics analysis of GmTCP670
Phytozome analysis revealed that GmTCP670 is 1128 bp in full length, has no introns or untranslated regions, and encodes 375 amino acids. ExPASy predicted that the pI of GmTCP670 is 8.14 and its molecular weight is 42.49 kDa. The secondary and tertiary structures of GmTCP670 are shown in Fig. 1A and B, respectively. The primary elements of the secondary structure of GmTCP670 include α-helices and random coils. Furthermore, analysis using the TMHMM Server v. 2.0 indicated that the protein lacked a transmembrane domain, classifying it as a non-transmembrane protein (Fig. 1C). ProtScale analysis suggested that the protein was hydrophilic (Fig. 1D), and SignalP 5.0 analysis revealed no potential signal peptide sequence, suggesting that the protein was non-secretory (Fig. 1E). NetPhos 3.1 analysis suggested that the protein might be phosphorylated (Fig. 1F), and ePlant analysis of expression patterns indicated that GmTCP670 had the highest expression level in flowers (Fig. 1G). Moreover, subcellular localization prediction suggested that GmTCP670 might be localized in the nucleus. Consistent with this prediction, the fluorescence of the control GFP protein was distributed throughout the cell, whereas that of the GmTCP670-GFP fusion protein was located exclusively in the nucleus (Figure S1), indicating that the GmTCP670 protein is localized in the nucleus.
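The reported gene and protein lengths are mutually consistent. A minimal Python sketch of the arithmetic is given here; the function name is ours, and it assumes that the 1128-bp figure includes the stop codon, which the text does not state explicitly.

```python
def encoded_residues(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of the given length.

    Assumption (not stated in the text): the CDS length counts the stop codon.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


print(encoded_residues(1128))  # 1128 / 3 = 376 codons, minus the stop codon
```

Under that assumption, 1128 bp corresponds to 376 codons and therefore 375 residues, matching the reported protein length.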
A phylogenetic tree was constructed using full-length amino acid sequences of the TCP transcription factor family from Arabidopsis and soybean (Fig. 2). TCP families were categorized into three groups: two groups for Class I TCP genes and one group for Class II CYC/TB1-type TCP genes. According to the markers in the phylogenetic tree, GmTCP670 is classified as a Class II CYC/TB1-type TCP gene. GmTCP670 contains an ‘R domain’ that is associated with arginine enrichment. Therefore, we speculated that GmTCP670 might be related to arginine metabolism.
Bioinformatics analysis of GmTCP670. (A) Secondary structure of GmTCP670. (B) Tertiary structure of GmTCP670. (C) Prediction of transmembrane structural domains of GmTCP670. (D) Prediction of the hydrophobic regions of GmTCP670. (E) Prediction of signal peptide of GmTCP670. (F) Prediction of the phosphorylation sites of GmTCP670. (G) Prediction of expression patterns of GmTCP670.
Identification of transgene-positive plants
To investigate whether GmTCP670 plays a role in regulating plant development, the CDS of GmTCP670 was placed under the control of the cauliflower mosaic virus (CaMV) 35S promoter. This construct was then introduced into soybean to generate stable GmTCP670-OE transgenic soybean lines (Fig. 3A). We obtained five independent homozygous GmTCP670-OE lines: OE14, OE15, OE19, OE21, and OE22 (Fig. 3B and C). OE14, OE15, and OE19 plants showed a round leaf phenotype, whereas OE21 and OE22 plants showed a phenotype similar to that of wild-type (WT) plants. However, no differences were observed in seed storage proteins among the lines (Fig. 3D). Thus, OE14, OE15, and OE19 lines were used for further studies.
We used CRISPR/Cas9-mediated genome editing tools to knock out the GmTCP670 gene in soybeans. Two guide RNAs were designed to specifically target GmTCP670 (Fig. 4A). The genomic locus of GmTCP670 containing the target sites was amplified by PCR using specific primers (Table S1). Sequencing analysis revealed that six independent T0 lines exhibited CRISPR/Cas9-mediated fragment deletions, and the results of PCR detection for the bar and Cas9 genes were positive (Fig. 4B, C). The subunit components of the seeds from DN47, DN50, and T3 seeds from GmTCP670-CC transgenic plants were visualized using SDS-PAGE (Fig. 4D). However, there were no significant changes in the subunit composition of the GmTCP670-CC seeds. We did not observe significant differences in plant growth and development between the five homozygous mutants and WT plants. The homozygous mutants CC01, CC02, and CC04, which had a deleted fragment length of 388 bp, were designated as GmTCP670-CC lines and were used for further analysis.
Construction and analysis of GmTCP670-OE lines. (A) Diagram of the recombinant expression vector, 35S-GmTCP670 construct. LB, left T-DNA border; RB, right T-DNA border; 35S, cauliflower mosaic virus 35S promoter; Tnos, nopaline synthase terminator. (B) PCR detection of the bar gene in transgenic T0 soybean plants. M, DNA Marker DL 2000; +, 35S-GmTCP670 recombinant plasmid; −, WT DN50 soybean plant lines; 1–5, transgenic plants. (C) LibertyLink strip analysis of the T0 transgenic plants. 1–5, transgenic plants. (D) SDS-PAGE analysis of protein extracts from mature T3 seeds harvested from GmTCP670-OE positive plants. Original gels are presented in Figs. S2 and S3.
Target site in the GmTCP670 gene and results obtained from mutagenesis of GmTCP670 using CRISPR/Cas9 technology. (A) Gene structure of GmTCP670 at the two target sites. (B) Detailed sequences of the target site of GmTCP670 in T0 lines. The black and red dashed lines represent DNA fragment omission and deletion, respectively; d, deletion; s, substitution. (C) PCR detection of transgenic T0 soybean plant bar gene and cas9 gene. M, DNA Marker DL 2000; +, pCBSG015-GmTCP670 recombinant plasmid; −, WT DN50 soybean plant lines; 1–6, transgenic plants. (D) SDS-PAGE analysis of protein extracts from mature T3 seeds harvested from GmTCP670-CC plants. Original gels are presented in Figs. S4 and S5.
Assessment of the agronomic traits of transgenic GmTCP670-CC and GmTCP670-OE lines
The agronomic traits of transgenic plant lines GmTCP670-CC and GmTCP670-OE are shown in Fig. 5. During the same growth period, the GmTCP670-OE lines were taller than the DN50 and GmTCP670-CC lines (Fig. 5A). Although the DN50 and GmTCP670-CC lines had already progressed to the flowering stage, the GmTCP670-OE lines had not yet bloomed (Fig. 5B). This significant delay in flowering time for GmTCP670-OE resulted in a significant delay in maturity, approximately 13 days later than that of the other lines. Additionally, the leaf morphology of GmTCP670-OE transitioned from the sharp leaf type of WT to the round leaf type, with a leaf area significantly larger than those of DN50 and GmTCP670-CC. No significant differences were observed between DN50 and GmTCP670-CC (Fig. 5C). Furthermore, there were no significant differences in plant height, pod number per plant, or 100-seed weight between GmTCP670-CC and DN50. However, GmTCP670-OE showed significantly greater plant height and 100-seed weight than DN50, whereas the number of grains per plant was significantly lower than that of DN50 (Fig. 5D). In addition, the protein content in the mutant lines GmTCP670-CC and GmTCP670-OE was significantly lower than that in DN50, whereas the oil content was significantly higher than that in DN50 (Fig. 5E).
Assessment of agronomic traits of knockout (GmTCP670-CC) and overexpressing (GmTCP670-OE) lines. (A) Plants during the vegetative growth stage. (B) Comparison of flowering and maturity status between GmTCP670-CC and GmTCP670-OE plants. (C) Representative pictures showing trifoliate leaf shapes and comparison of leaf areas. Bar = 5 cm. (D) Plant height, pod number per plant, seed number per plant, and 100-seed weight. (E) Seed protein and oil contents of T4-generation GmTCP670-CC, GmTCP670-OE, and DN50 plants grown under field conditions. Significant differences are indicated by different letters following Duncan’s multiple range test (P < 0.05).
Overexpression and knockout of GmTCP670 affect the amino acid content and quality of soybean seeds
The free and total amino acid contents of mature T4 seeds from the mutant GmTCP670-CC and GmTCP670-OE were evaluated (Table 1). Except for methionine (Met), which was present at a lower level than in DN50, the concentration of other amino acids in GmTCP670-CC was higher than that in both DN50 and GmTCP670-OE, with a significant difference (P < 0.05) compared with GmTCP670-OE. Specifically, methionine content in GmTCP670-CC was 2.27% lower than that in DN50, whereas cysteine (Cys) was 7.84% higher than that in DN50. This resulted in a sulfur-containing amino acid (Met and Cys) content that was 3.16% higher than that in DN50 and 16.67% higher than that in GmTCP670-OE.
The free essential amino acid content of GmTCP670-CC was significantly lower than that of DN50 but significantly higher than that of GmTCP670-OE. The total free amino acids in GmTCP670-CC were significantly higher than those in DN50 and GmTCP670-OE (P < 0.05), by 18.55% and 83.95%, respectively. The levels of free methionine, alanine, and arginine in GmTCP670-CC were significantly higher than those in DN50 and GmTCP670-OE. The free methionine content was 1.74-fold higher than that of DN50 and 2.26-fold higher than that of GmTCP670-OE, and free arginine content was 43.15% and 211.33% higher than that of DN50 and GmTCP670-OE, respectively. The contents of free phenylalanine, aspartic acid, and glycine in GmTCP670-CC were similar to those in DN50 but significantly higher than those in GmTCP670-OE (P < 0.05).
Transcriptome comparative analysis of overexpression and knockout of GmTCP670
A comparative analysis of differentially expressed genes (DEGs; q-value < 0.05, |log2 fold change| > 1) in the transcriptomes of GmTCP670-OE, GmTCP670-CC, and DN50 is presented in Fig. 6A. A total of 196 DEGs were identified between GmTCP670-CC and DN50, comprising 45 upregulated and 151 downregulated genes. Between GmTCP670-CC and GmTCP670-OE, 2317 DEGs were found, with 1163 upregulated and 1154 downregulated genes. Additionally, there were 7514 DEGs between GmTCP670-OE and DN50, including 4250 upregulated and 3264 downregulated genes. To validate the RNA-seq results, three significantly upregulated and three significantly downregulated genes in GmTCP670-CC and GmTCP670-OE were selected for mRNA quantification by qRT-PCR (Figure S6). The qRT-PCR results were consistent with those obtained from RNA-seq.
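The DEG thresholds stated above can be expressed as a simple filter. The following Python sketch is illustrative only: the record type, function name, and example rows are hypothetical and do not reproduce the pipeline or data used in this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneResult:
    gene_id: str
    log2_fold_change: float
    q_value: float


def classify(result: GeneResult, q_cutoff: float = 0.05,
             lfc_cutoff: float = 1.0) -> Optional[str]:
    """Return 'up' or 'down' for a DEG, or None if the gene fails a threshold."""
    if result.q_value >= q_cutoff:
        return None  # not significant
    if abs(result.log2_fold_change) <= lfc_cutoff:
        return None  # change is smaller than twofold
    return "up" if result.log2_fold_change > 0 else "down"


# Made-up rows for illustration; these are not values from the study.
results = [
    GeneResult("gene_a", 2.3, 0.001),
    GeneResult("gene_b", -1.7, 0.020),
    GeneResult("gene_c", 0.8, 0.001),   # fold change too small
    GeneResult("gene_d", -3.1, 0.200),  # q-value too large
]
calls = {r.gene_id: classify(r) for r in results}
print(calls)
```

Both conditions must hold, so a gene with a large fold change but a q-value at or above 0.05 is not counted as a DEG.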
There were 24 common DEGs identified between GmTCP670-CC vs. DN50 and GmTCP670-CC vs. GmTCP670-OE, 149 common DEGs between GmTCP670-OE vs. DN50 and GmTCP670-CC vs. DN50, and 1997 common DEGs between GmTCP670-CC vs. GmTCP670-OE and GmTCP670-OE vs. DN50 (Fig. 6B). Among the three comparison groups, six common DEGs were distributed on different chromosomes: Glyma.01G142400, Glyma.07G163300, Glyma.09G027000, Glyma.13G289100, Glyma.15G244200, and Glyma.16G142200 (Fig. 6C). The enriched GO terms and KEGG pathways were also identified. The top 10 GO entries, along with the number of differentially expressed genes in each category, were plotted and analyzed in a bar chart (Figure S7). KEGG pathway analysis indicated that several important pathways, such as “Vitamin B6 metabolism,” “Biosynthesis of flavonoids and flavonols,” and “Linoleic acid metabolism” were enriched (Figure S8).
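Shared DEGs of this kind are set intersections between comparison groups. The sketch below uses placeholder gene IDs and comparison labels (all hypothetical) and assumes that each pairwise overlap includes the genes common to all three groups; the text does not specify how the counts in Fig. 6B were partitioned.

```python
from itertools import combinations


def overlaps(deg_sets):
    """Return pairwise intersections and the intersection of all DEG sets."""
    pairwise = {
        (a, b): deg_sets[a] & deg_sets[b]
        for a, b in combinations(sorted(deg_sets), 2)
    }
    shared_by_all = set.intersection(*deg_sets.values())
    return pairwise, shared_by_all


# Placeholder gene IDs for illustration; not data from the study.
deg_sets = {
    "CC_vs_DN50": {"g1", "g2", "g3"},
    "CC_vs_OE": {"g2", "g3", "g4", "g5"},
    "OE_vs_DN50": {"g3", "g4", "g5", "g6"},
}
pairwise, shared_by_all = overlaps(deg_sets)
for pair, genes in pairwise.items():
    print(pair, sorted(genes))
print("all three:", sorted(shared_by_all))
```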
Association analysis of transcriptome and methylation of GmTCP670-CC, GmTCP670-OE, and DN50
Based on the association analysis of the transcriptome and methylation data, genes related to growth and development, amino acid metabolism, and Arg metabolism that showed both differential expression in the transcriptome and changes in methylation modification were screened, as shown in Table 2.
In the group GmTCP670-CC-vs-DN50, two genes, Glyma.01G178800 and Glyma.14G010300, were identified and annotated to be involved in amino acid transport. This indicated that the amino acid transport genes exhibited higher activity, resulting in elevated levels of certain amino acids in GmTCP670-CC. In group GmTCP670-OE vs. DN50, four genes were identified related to growth and development. Among these, three genes were associated with leaf development (Glyma.10G067200, Glyma.11G008500, and Glyma.19G192700), whereas one gene was related to growth and transcriptional regulation (Glyma.11G110700). Additionally, four genes were related to amino acid metabolism, including two associated with the vacuolar amino acid transporter (Glyma.01G178800 and Glyma.20G181900) and two associated with the amino acid hydrolase ILR1 like 4 (Glyma.04G247600 and Glyma.18G132000). Furthermore, three genes were associated with arginine, annotated as having arginine transmembrane transporter activity (Glyma.03G083600), arginine N-methyltransferase activity (Glyma.09G000300), and bifunctional arginine biosynthesis protein activity (Glyma.10G044300).
In the comparison between GmTCP670-CC and GmTCP670-OE, three genes associated with growth and development were identified. These included two genes involved in growth regulation and transcriptional regulation (Glyma.01G234400 and Glyma.11G008500) and one gene involved in growth regulation and root development (Glyma.17G232600). Additionally, four genes related to amino acid metabolism were identified, comprising one gene associated with the transport of neutral and acidic amino acids (Glyma.04G209100) and three genes involved in amino acid transmembrane transport protein activity (Glyma.06G088200, Glyma.06G090200, and Glyma.09G118900). Furthermore, three genes related to arginine were identified, all of which were also found in the comparison between GmTCP670-OE and DN50.
Interaction between GmTCP670 transcription factor and GGACCC element
The TCP domain region frequently binds to specific DNA regulatory elements, suggesting that TCP factors function as part of a regulatory complex to regulate gene expression. To verify the interaction between GmTCP670 and GGNCCC elements, we used a yeast one-hybrid (Y1H) system. The GmTCP670 transcription factor gene was cloned into the yeast vector pGADT7 (Fig. 7A). Three repeats of the GGNCCC core sequence were placed upstream of the Pmin promoter and ligated into the pHIS2 vector (Fig. 7B). The pHIS2 plasmid carrying the target element GGNCCC and the pGADT7 plasmid carrying the target gene GmTCP670 were co-transformed into Y187 yeast. The transformed yeast cells were cultured on SD/-His/-Trp and SD/-His/-Trp/-Leu media supplemented with varying concentrations of 3-AT. The control co-transformants (pHIS2-GGNCCC + pGADT7 and pHIS2 + pGADT7-GmTCP670) failed to grow on SD/-His/-Trp/-Leu medium supplemented with 300 mM 3-AT.
Therefore, SD/-His/-Trp/-Leu medium supplemented with 300 mM 3-AT was used to further investigate the interaction between GmTCP670 and the element. As shown in Fig. 7C, the yeast strains co-expressing pHIS2-GGNCCC and pGADT7-GmTCP670, along with the positive control pHIS2-P53 + pGAD53m, exhibited normal growth on SD/-His/-Trp/-Leu + 300 mM 3-AT. However, the growth of the control groups pHIS2-GGNCCC + pGADT7 and pHIS2 + pGADT7-GmTCP670 was inhibited on SD/-His/-Trp/-Leu + 300 mM 3-AT medium. These results indicate that the GmTCP670 transcription factor interacts with the GGNCCC element and confirm that it can bind to the GGACCC element.
Y2H screening of GmTCP670 interacting proteins
Y2H analysis was conducted to determine whether GmTCP670 exhibited transcriptional activation activity in yeast cells using expression constructs and reporter constructs (Fig. 9A). These results demonstrated that both full-length GmTCP670 and GmTCP670-II activated transcription in yeast. In contrast, GmTCP670-I (AAs 97–249) and GmTCP670-III (AAs 97–375) were unable to activate transcription (Fig. 9B). Consequently, GmTCP670-III was used to screen the libraries.
Fifty-eight blue colonies were characterized by sequence analysis using BLAST. Sixteen candidate genes encoding proteins that might interact with GmTCP670 are listed in Fig. 8. Homology analysis revealed that these candidate proteins were associated with seed storage proteins, domain-containing proteins, and phosphate dehydrogenase.
Interaction of GmTCP670 with GmLOB40 in yeast
To confirm which protein interacts with GmTCP670-III, we selected four genes (Gy1, Gy4, Gy5, and LOB40) for further investigation; these were predicted to encode 11S seed storage proteins and a lateral organ boundary (LOB) domain-containing protein. Full-length cDNAs of the four genes were cloned into the pGADT7 vector (Fig. 9C). We analyzed whether these proteins could activate transcription and interact with GmTCP670 in yeast cells. Our results showed that none of the four proteins activated transcription in yeast cells; however, GmLOB40 interacted with GmTCP670-III (Fig. 9D), whereas the other three candidate proteins did not interact with GmTCP670-III in yeast.
Interaction between GmTCP670 and GmLOB40 in yeast cells. (A) Diagram of pGBKT7 + GmTCP670/GmTCP670-I/GmTCP670-II/GmTCP670-III bait vectors. (B) Transcription activation assays of full-length GmTCP670, GmTCP670-I, GmTCP670-II, and GmTCP670-III. (C) Reverse transcription polymerase chain reaction cloning of GmLOB40 (Glyma.18G297100), GmGy4 (Glyma.10G037100), GmGy5 (Glyma.13G123500), and GmGy1 (Glyma.03G163500) full-length cDNAs. Original gel is presented in Figure S9. (D) GmTCP670-III interacted with GmLOB40 in yeast cells.
Discussion
The TCP transcription factor family plays a vital role in plant development and stress responses31,32,33. Through systematic gene function verification analysis, this study demonstrated that the soybean GmTCP670 transcription factor influences multiple aspects of soybean growth and development. Overexpression of the GmTCP670 transcription factor significantly altered leaf type, plant height, flowering time, and growth period, all of which are closely related to soybean yield. This study provides valuable genetic resources for the molecular design breeding of high-quality, high-yield soybean varieties. The results of CRISPR/Cas9 gene editing in this study revealed that knocking out large fragments of the GmTCP670 gene improved the amino acid quality of T4 seeds compared with the control DN50 and the overexpression transgenic lines. This indicates that the differential expression of the GmTCP670 gene affects the amino acid quality of soybean proteins. Thus, the GmTCP670 transcription factor is a promising genetic resource for enhancing the amino acid quality of soybean protein.
Epigenetic modifications of nucleic acids, such as DNA and RNA, particularly through processes such as methylation, play a crucial role in regulating cellular processes across all kingdoms of life. Third-generation sequencing technologies have identified numerous previously unknown modifications in diverse organisms34,35. In this study, a comprehensive analysis combining transcriptome and whole-genome methylation sequencing was conducted. The findings from different combinations of GmTCP670-OE, GmTCP670-CC, and the control DN50 were compared, and various combinations of transcriptome expression and methylation modifications were screened. Eighteen DEGs related to growth and development, amino acid synthesis, and arginine synthesis were identified, which are important for analyzing the molecular mechanism of GmTCP670 transcription factor expression at the epigenomic level. In particular, the gene Glyma.10G044300, which is associated with arginine synthesis and metabolism, exhibited significant differential expression in previous studies of the α-subunit deletion hypoallergenic soybean mutant. This study further confirms the presence of methylation modifications. Methylation modification may be an important molecular means to improve the quality of hypoallergenic soybean proteins. Using the transcriptome and genome-wide high-throughput methylation data obtained in this study, we conducted a comprehensive and systematic analysis of the molecular mechanisms by which the GmTCP670 transcription factor regulates soybean development and quality improvement. This will significantly contribute to an overall understanding of the molecular mechanisms involved in enhancing the quality of hypoallergenic soybean proteins.
Several studies have demonstrated that the TCP transcription factor can directly bind to the promoters of many genes via the GGNCCC/GGGNCC motif36. In this study, we used a Y1H system to investigate the relationship between the GmTCP670 and GGNCCC elements. The N in the GGNCCC element could be C, G, A, or T; therefore, we used four elements (GGCCCC, GGGCCC, GGACCC, and GGTCCC) combined with GmTCP670 to confirm the element type that could be recognized and bound by GmTCP670. According to the Y1H results (Fig. 7C), GGACCC was directly bound by GmTCP670.
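The four candidate elements follow directly from substituting each base for the N in GGNCCC. A minimal Python sketch of that expansion is shown here; the function name is ours and is only an illustration, not part of the experimental workflow.

```python
def expand_n(motif: str) -> list[str]:
    """Replace every N in a DNA motif with each of the four bases."""
    variants = [""]
    for symbol in motif:
        # N stands for any base; every other symbol is kept as written.
        choices = "CGAT" if symbol == "N" else symbol
        variants = [prefix + base for prefix in variants for base in choices]
    return variants


elements = expand_n("GGNCCC")
print(elements)  # ['GGCCCC', 'GGGCCC', 'GGACCC', 'GGTCCC']
```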
Protein-protein interactions are ubiquitous in every cellular process, including DNA replication, transcription, splicing, and translation. Identifying and characterizing protein interactions is essential for understanding these processes at both molecular and biophysical levels. Among the various techniques developed to study protein-protein interactions, the Y2H system has proven to be a powerful tool for investigating molecular interactions37. In this study, we screened for proteins that interact with the soybean GmTCP670 transcription factor using a Y2H system. We selected four proteins, Gy1 (Glyma.03G163500), Gy4 (Glyma.10G037100), Gy5 (Glyma.13G123500), and LOB40 (Glyma.18G297100), for interaction validation. Three genes, Gy1, Gy4, and Gy5, encode 11S seed storage proteins. Glyma.18G297100 encodes an LOB domain-containing protein. LOB domain genes belong to a gene family that encodes plant-specific transcription factors that play crucial roles in plant growth and development38. LOB genes regulate leaf and root development in plants39. Seed storage proteins (SSPs) are essential for germination and early seedling growth. Glyma.18G297100 could interact with GmTCP670, suggesting that GmTCP670 affects soybean growth and development. These findings will contribute significantly to a comprehensive analysis of the molecular mechanisms underlying soybean improvement.
Materials and methods
Experimental materials
For this study, the soybean variety Dongnong50 (DN50), which served as both the transgenic receptor and control material, was provided by Inner Mongolia Agricultural University, China.
Bioinformatics analysis of GmTCP670
We used Phytozome (https://phytozome-next.jgi.doe.gov/) to analyze gene and protein sequences. The secondary and tertiary structures of proteins were predicted using SOPMA (https://npsa.lyon.inserm.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html) and Swiss-Model (https://www.swissmodel.expasy.org/), respectively. We also used ExPASy (https://web.expasy.org/compute_pi/) for protein molecular weight (Mw) and isoelectric point (pI) prediction, ProtScale (https://web.expasy.org/protscale/) for protein hydrophobicity analysis, TMHMM Server v. 2.0 (https://services.healthtech.dtu.dk/service.php?TMHMM-2.0) for transmembrane structural domain prediction, SignalP 5.0 (https://services.healthtech.dtu.dk/services/SignalP-5.0/) for signal peptide prediction, NetPhos 3.1 (https://services.healthtech.dtu.dk/services/NetPhos-3.1/) for phosphorylation site prediction, and ePlant (https://bar.utoronto.ca/eplant_soybean/) for expression pattern prediction. CELLO (http://cello.life.nctu.edu.tw/) was used to predict the subcellular localization of proteins. A phylogenetic tree of the TCP proteins of Glycine max and Arabidopsis thaliana, including GmTCP670, was constructed using MEGA11.
Subcellular localization of GmTCP670 protein
The coding region of GmTCP670 was cloned into the pCAMBIA1302 vector to produce 35S::GmTCP670::GFP, whereas the empty vector 35S::GFP served as a control. The green fluorescent protein (GFP)-fused proteins were transiently expressed in Arabidopsis protoplasts. Transfected cells were observed under a confocal laser scanning microscope (Leica TCS SP2, Germany).
Generation of Transgenic plants
The full-length coding sequence (CDS) of GmTCP670 was amplified by polymerase chain reaction (PCR) from a complementary DNA (cDNA) library of the soybean cultivar Dongnong 47 (DN47). This sequence, which was identical to the CDS cloned from the Williams 82 cultivar, was subsequently cloned into an overexpression vector. CRISPR/Cas9 gene knockout constructs were generated using the pCBSG015 (Basta) vector, which encodes the Cas9 protein. We designed two target sequences (5′-CCGCCATGCTTGAACACGATGCA-3′ and 5′-ATGGTGGTGATGCTTCCCGAGGG-3′) and cloned them into the pCBSG015 (Basta) vector. The constructs were individually introduced into Agrobacterium tumefaciens strain EHA105 using the freeze-thaw method. The EHA105 recombinant strain was used to transform the DN50 cultivar background through soybean embryo cotyledonary node transformation, following a previously described method40.
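The two 23-nt target sequences can be checked for a protospacer-adjacent motif (PAM). The Python sketch below assumes the canonical NGG PAM of Streptococcus pyogenes Cas9; the text does not state which Cas9 variant pCBSG015 carries, so this is an assumption, and the function names are ours.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def pam_strand(target: str):
    """Classify a 23-nt target (20-nt protospacer plus 3-nt PAM).

    Returns '+' if the sequence ends in NGG as written, '-' if it starts
    with CCN (NGG on the opposite strand), and None otherwise.
    Assumes an NGG PAM, which the text does not state.
    """
    if len(target) != 23:
        return None
    if target.endswith("GG"):
        return "+"
    if target.startswith("CC"):
        return "-"
    return None


# The two target sequences given in the text.
targets = ["CCGCCATGCTTGAACACGATGCA", "ATGGTGGTGATGCTTCCCGAGGG"]
strands = [pam_strand(t) for t in targets]
print(strands)  # ['-', '+']
```

Under the NGG assumption, the first sequence is written from the strand opposite its protospacer (its reverse complement ends in GG), and the second carries its PAM at the 3′ end as written.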
Identification of transformants and mutations
The expression cassette contains the marker gene bar, which encodes phosphinothricin acetyltransferase (PAT) and confers resistance to the herbicide glufosinate ammonium. To detect transgenic plants, the LibertyLink strip was used to measure PAT protein content according to the manufacturer’s instructions (Envirologix Inc., Portland, ME, USA). Genomic DNA was extracted from the leaves of transgenic plants using the cetyltrimethylammonium bromide (CTAB) method. The bar or cas9 gene was amplified to confirm T-DNA insertion, and the target gene fragment was amplified using the following primers: 5′-GCAAGCCACACCCTTCTTGA-3′ and 5′-CCACCAAAGTTAACGGCCCC-3′. The PCR products were then sequenced to verify gene editing, and only transformed plants with successful gene editing were used for subsequent experiments.
Analysis of subunit composition of soybean seed proteins
Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was used to analyze the protein composition of the seeds. A small portion of dry seeds, removed manually with care to avoid damaging the embryonic axis, was used as a soy flour sample. Total seed proteins were extracted from the soy flour using SDS sample buffer (2% SDS, 5% 2-mercaptoethanol, 10% glycerol, 5 M urea, and 62.5 mM Tris-HCl), followed by centrifugation at 15,000 × g. A 10 µL aliquot of the supernatant was separated via SDS-PAGE using 4.5% stacking and 12.5% separating polyacrylamide gels and stained with Coomassie Brilliant Blue R-250 dye. The gels were scanned using a SHARP JX-330 scanner (Amersham Biosciences, Canada).
Characterization of agronomic traits
T4 GmTCP670-OE lines, GmTCP670-CC lines, and the soybean variety ‘DN50’ were grown in Inner Mongolia, China, under standard growth conditions. The experiment used a randomized complete block design with three replicates. Seeds were sown in rows 3 m long and 0.65 m apart, with 6 cm between plants. At full maturity, the seeds were harvested from the plants and air-dried. In each replicate, 10 DN50 plants and 10 individuals from each transgenic line were randomly sampled at the mature stage for phenotypic analysis. Plant height, pod number, seed number per plant, and 100-seed weight were measured. The experiments were performed in triplicate. The leaves were scanned using a flatbed scanner (CanoScan LiDE 200, Canon Inc., Japan), and the leaf area (cm²) was measured using ImageJ.
Analysis of total seed protein and oil contents
Seed protein content was determined using the Dumas method41 (Rapid N Cube, Elementar Analytical, France), with total nitrogen converted into protein content using the formula N × 6.25. The seed oil content was analyzed using a near-infrared grain analyzer (Infratec 1241 Analyzer; Foss, Denmark). The total oil and protein contents were expressed as a percentage of the dry weight.
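The nitrogen-to-protein conversion can be written as a one-line calculation. The following is a minimal sketch, assuming nitrogen is reported as a percentage of dry weight; the function name and the example reading are hypothetical and are not taken from the study.

```python
# Conversion factor stated in the text (N x 6.25).
NITROGEN_TO_PROTEIN = 6.25


def protein_percent(nitrogen_percent: float) -> float:
    """Convert total nitrogen (% of dry weight) into crude protein (% of dry weight)."""
    if nitrogen_percent < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_percent * NITROGEN_TO_PROTEIN


# Hypothetical instrument reading, used only for illustration.
example_protein = protein_percent(6.4)
print(f"{example_protein:.2f}% protein")
```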
Amino acid analysis
Total amino acids were extracted by hydrolyzing the seed meal in 6 M hydrochloric acid (HCl) for 22 h in sealed evacuated tubes, which were then placed in boiling water at 110 °C. An L-8800 amino acid analyzer (Hitachi, Japan) was used to estimate the amino acid composition of the hydrolysate.
Free amino acids were extracted from 5.00 g of the seed meal. The seeds were sampled according to a sample quartile method, fully dried, ground using a mill grinder, filtered through a 0.25-mm sieve, and thoroughly mixed. They were then finely homogenized in 30 mL sulfosalicylic acid (10 g per 100 mL) and disrupted ultrasonically for 30 min. The samples were centrifuged at 5000 × g for 5 min. The supernatant was filtered using a 22-µm GD/X sterile disposable syringe filter. An L-8800 amino acid analyzer (Hitachi) was used to analyze the filtrate. Amino acid concentration was calculated using the following formula: g/16 g N in the test protein divided by g/16 g N in the scoring pattern.
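The ratio in the formula above can be sketched as follows. This is an illustrative helper under stated assumptions: both inputs are expressed in g per 16 g N, and the function name and example values are hypothetical rather than measured data.

```python
def amino_acid_ratio(test_g_per_16g_n: float, pattern_g_per_16g_n: float) -> float:
    """Ratio of an amino acid in the test protein to the same amino acid
    in the scoring pattern, both expressed in g per 16 g N."""
    if pattern_g_per_16g_n <= 0:
        raise ValueError("scoring-pattern value must be positive")
    return test_g_per_16g_n / pattern_g_per_16g_n


# Hypothetical values for a single amino acid, for illustration only.
ratio = amino_acid_ratio(3.0, 6.0)
print(ratio)
```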
Sample preparation for RNA-seq and WGBS
Soybean seeds were collected from the control line DN50, the overexpression lines (GmTCP670-OE), and the knockout lines (GmTCP670-CC) 50 days after flowering. The seeds were stored in liquid nitrogen, with three biological replicates. Half of the samples from each group were treated with TRIzol reagent (Invitrogen, Carlsbad, CA, USA) to extract RNA for the RNA sequencing (RNA-seq) assay, whereas DNA was isolated from the remaining samples using a modified CTAB method for the whole-genome bisulfite sequencing (WGBS) assay42.
Transcriptome sequencing and data analyses
Nine seed samples were sent to OE Biotech Co., Ltd. (Shanghai, China) for sequencing and analysis. The libraries were sequenced on the Illumina HiSeq X Ten platform. After adapter clipping and quality filtering, the clean reads were aligned to the soybean DN50 reference transcriptome using HISAT43. Gene expression levels were quantified as reads per kilobase per million mapped reads (RPKM), based on the number of reads mapped to the reference sequence. Differential expression analysis was performed using DESeq44. Differentially expressed genes (DEGs) were screened using the criteria q-value < 0.05 and |log2 fold change| > 1.
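The expression measure and the screening thresholds described above can be illustrated with a short sketch. It uses the standard RPKM definition and the two cut-offs given in the text; the function names and example numbers are hypothetical, and the sketch does not reproduce the statistical model of the differential expression software.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene length per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_deg(q_value: float, log2_fold_change: float,
           q_cutoff: float = 0.05, lfc_cutoff: float = 1.0) -> bool:
    """Apply the screening criteria: q-value < 0.05 and |log2 fold change| > 1."""
    return q_value < q_cutoff and abs(log2_fold_change) > lfc_cutoff


# Hypothetical gene: 500 reads, 2000 bp long, 10 million mapped reads in the library.
print(rpkm(500, 2000, 10_000_000))
print(is_deg(q_value=0.01, log2_fold_change=-1.8))
```

Both thresholds are strict inequalities, so a gene with a log2 fold change of exactly 1 is not counted as a DEG.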
WGBS sequencing and data analysis
WGBS was performed by OE Biotech Co., Ltd. (Shanghai, China). DNA was fragmented by sonication using a Bioruptor (Diagenode, Brussels, Belgium) to a mean size of approximately 250 base pairs (bp). This was followed by blunt-ending, addition of dA to the 3ʹ-end, and ligation of methylated adaptors. The ligated DNA was bisulfite-converted using the EZ DNA Methylation-Gold kit (ZYMO, Tustin, CA, USA). DNA fragments of varying lengths were excised from a 2% agarose gel, purified, and amplified by PCR. Sequencing was performed using an Illumina HiSeq 4000 platform (Illumina, San Diego, CA, USA). The Bismark software (version 0.16.3) was used to align the reads to the reference genome. Fisher’s exact test was used to identify significant differentially methylated regions (DMRs), with screening criteria set at methylation differences ≥ 10% and a q-value ≤ 0.05. After the DMRs were identified, differentially methylated genes (DMGs) located in DMRs were characterized.
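To make the DMR screening step concrete, the sketch below implements a two-sided Fisher's exact test on a 2 × 2 table of methylated and unmethylated read counts, followed by the two thresholds given in the text. It is an illustration under stated assumptions, not the pipeline used in the study: the function names and counts are hypothetical, the methylation difference is assumed to be an absolute difference in methylation level, and the multiple-testing correction that turns p-values into q-values is not shown.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows are the two samples; columns are methylated and unmethylated counts.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))  # small tolerance for float ties
    return min(1.0, total)


def passes_dmr_filter(level_a: float, level_b: float, q_value: float,
                      min_diff: float = 0.10, max_q: float = 0.05) -> bool:
    """Screening criteria: methylation difference >= 10% and q-value <= 0.05."""
    return abs(level_a - level_b) >= min_diff and q_value <= max_q


# Hypothetical region: 18/20 reads methylated in one sample, 9/20 in the other.
p_value = fisher_exact_two_sided(18, 2, 9, 11)
print(p_value)
print(passes_dmr_filter(18 / 20, 9 / 20, q_value=0.01))
```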
Yeast one-hybrid (Y1H) assay
The Y1H assay was performed according to the procedure described by Clontech (Takara). The cDNA fragment of GmTCP670 was cloned into the pGADT7 vector using the double-enzyme digestion method. Three copies of the GGNCCC fragment were inserted into the bait vector (pHIS2) to create pHIS2-GGNCCC (3 × GGNCCC). The bait plasmids were transformed into the yeast strain Y187 and cultured on SD/-Trp/-Leu medium at 30 °C for three days. pHIS2-GGNCCC was used in the transcriptional activation test. pHIS2-P53 and pGAD53m were co-transformed into yeast cells as positive controls. Similarly, pHIS2 + pGADT7-GmTCP670 and pHIS2-GGNCCC + pGADT7 were co-transformed into yeast cells to serve as the experimental groups.
Single yeast colonies were randomly selected from each transformation plate and cultured on SD/-His/-Trp/-Leu medium with or without 3-amino-1,2,4-triazole (3AT; 0, 30, 60, 100, 150, 200, 300, and 400 mM) at 30 °C for three days. The concentration of 3AT used for screening positive colonies was determined from the self-activation test of the pHIS2-GGNCCC bait vector. Randomly selected pHIS2-GGNCCC + pGADT7-GmTCP670 colonies were spotted on SD/-Trp/-Leu/-Ura medium containing an appropriate concentration of 3AT and cultivated for 3–5 days.
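The degenerate bait element GGNCCC, in which N stands for any nucleotide, can be located in a DNA sequence with a simple pattern search. The sketch below is illustrative only: the function name is hypothetical, and the example insert uses three copies of GGACCC (the element named in the abstract) as one possible instance of 3 × GGNCCC, which is an assumption about the construct rather than a stated fact.

```python
import re

# Regex equivalents for the IUPAC codes used here; N matches any nucleotide.
IUPAC_TO_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_motif(sequence: str, motif: str = "GGNCCC") -> list:
    """Return (0-based start, matched site) for every occurrence of the motif,
    including overlapping occurrences."""
    core = "".join(IUPAC_TO_REGEX[base] for base in motif.upper())
    pattern = re.compile("(?=(" + core + "))")  # lookahead allows overlaps
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Assumed example insert: three copies of GGACCC.
bait_insert = "GGACCC" * 3
for start, site in find_motif(bait_insert):
    print(start, site)
```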
Transcription activation assays
Full-length GmTCP670 and three cDNA fragments (GmTCP670-I encoding AAs 97–249, GmTCP670-II encoding AAs 1–249, and GmTCP670-III encoding AAs 97–375) were amplified by PCR using appropriate primers (GmTCP670-F/R, GmTCP670-I-F/R, GmTCP670-II-F/R, and GmTCP670-III-F/R; see Supplementary Table 1). Transcription activation assays were conducted using the yeast strain Y2HGold (Clontech), which contains HIS3 and ADE2 reporter genes regulated by distinct GAL4-responsive promoter elements. Purified PCR products were cloned into the pGBKT7 vector. The fusion plasmids and the pGADT7 vector were transformed into yeast strain Y2HGold. Yeast cells were selected by culturing on SD/-Trp-Leu and SD/-Trp-Leu-His-Ade media. The pGBKT7-53 and pGADT7-T plasmids were introduced into yeast Y2HGold cells as positive controls, whereas yeast cells containing the pGBKT7-Lam and pGADT7-T plasmids served as negative controls.
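The amino acid ranges of the three truncated fragments map onto nucleotide positions of the coding sequence by simple arithmetic. The sketch below assumes 1-based, inclusive coordinates, no introns (as stated in the abstract), and a 375-residue protein; the function name is hypothetical, and the primers actually used are those listed in Supplementary Table 1.

```python
PROTEIN_LENGTH_AA = 375  # from the abstract (1128 bp CDS, stop codon included)

# Fragment boundaries in amino acids, as given in the text.
FRAGMENTS = {
    "GmTCP670-I": (97, 249),
    "GmTCP670-II": (1, 249),
    "GmTCP670-III": (97, 375),
}


def aa_range_to_cds_range(first_aa: int, last_aa: int) -> tuple:
    """Map a 1-based inclusive amino acid range to 1-based inclusive CDS positions."""
    if not 1 <= first_aa <= last_aa <= PROTEIN_LENGTH_AA:
        raise ValueError("amino acid range lies outside the protein")
    return 3 * (first_aa - 1) + 1, 3 * last_aa


for name, (first_aa, last_aa) in FRAGMENTS.items():
    start, end = aa_range_to_cds_range(first_aa, last_aa)
    print(name, start, end, end - start + 1, "bp")
```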
Yeast two-hybrid (Y2H) library assays
A cDNA library was constructed from α-subunit-deficient hypoallergenic soybean seeds at 18, 35, and 55 days after flowering using a Y2H library construction kit (Clontech). The screening for interacting proteins was performed according to the manufacturer’s instructions (Clontech). Transformants from the cDNA library were plated on SD/-Trp-Leu-His-Ade medium at 30 °C for 3–5 days. Yeast colonies were cultured on SD/-Trp-Leu-His-Ade medium containing X-α-Gal (20 µg/mL) and aureobasidin A (AbA; 125 mg/mL). Blue colonies were characterized using PCR and sequencing techniques.
Y2H experiments were performed as previously described45. Full-length cDNAs of the candidate interactor genes were amplified using PCR and subsequently cloned into the pGADT7 vector. The fusion plasmids and pGBKT7-GmTCP670-III were co-transformed into yeast strain Y2HGold. After selection at 30 °C, yeast colonies grown on SD/-Trp-Leu medium were transferred to SD/-Trp-Leu-His-Ade or SD/-Trp-Leu-His-Ade/X-α-Gal/AbA media. Yeast Y2HGold cells containing pGBKT7-53 and pGADT7-T served as positive controls, whereas the co-expression of pGBKT7-Lam and pGADT7-T was used as a negative control.
Statistical analysis
Data are presented as mean ± standard deviation. After conducting a one-way analysis of variance, a post-hoc Duncan’s multiple range test was performed to evaluate the differences among the group means using IBM SPSS Statistics software (version 22.0), with a significance level set at P < 0.05.
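As an illustration of the first step of this analysis, the sketch below computes group means with standard deviations and the one-way ANOVA F statistic in plain Python. The data and function name are hypothetical. The p-value and Duncan's multiple range test, which the study obtained from SPSS, are not implemented here.

```python
from statistics import mean, stdev


def one_way_anova_f(groups: list) -> tuple:
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three lines, three replicates each.
data = {"control": [1.0, 2.0, 3.0], "line_a": [2.0, 3.0, 4.0], "line_b": [5.0, 6.0, 7.0]}
for name, values in data.items():
    print(f"{name}: {mean(values):.2f} ± {stdev(values):.2f}")
print(one_way_anova_f(list(data.values())))
```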
Data availability
Data are contained within the article and Supplementary Materials.
References
Yang, M. et al. Systematic analysis and expression profiles of TCP gene family in Tartary buckwheat (Fagopyrum Tataricum (L.) Gaertn.) revealed the potential function of FtTCP15 and FtTCP18 in response to abiotic stress. BMC Genom. 23, 415 (2022).
Riechmann, J. L. et al. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science. 290, 105–110 (2000).
Luo, D., Carpenter, R., Vincent, C., Copsey, L. & Coen, E. Origin of floral asymmetry in Antirrhinum. Nature. 383, 794–799 (1996).
Doebley, J., Stec, A. & Hubbard, L. The evolution of apical dominance in maize. Nature. 386, 485–488 (1997).
Cubas, P., Lauter, N., Doebley, J. & Coen, E. The TCP domain: a motif found in proteins regulating plant growth and development. Plant. J. 18, 215–222 (1999).
Kosugi, S. & Ohashi, Y. DNA binding and dimerization specificity and potential targets for the TCP protein family. Plant. J. 30, 337–348 (2002).
Viola, I. L. & Gonzalez, D. H. TCP transcription factors in plant reproductive development: juggling multiple roles. Biomolecules 13, 750 (2023).
Li, X. et al. TCP7 interacts with nuclear Factor-Ys to promote flowering by directly regulating SOC1 in Arabidopsis. Plant. J. 108, 1493–1506 (2021).
Lucero, L. E., Manavella, P. A., Gras, D. E., Ariel, F. D. & Gonzalez, D. H. Class I and class II TCP transcription factors modulate SOC1-dependent flowering at multiple levels. Mol. Plant. 10, 1571–1574 (2017).
Camoirano, A., Alem, A. L., Gonzalez, D. H. & Viola, I. L. The N-terminal region located upstream of the TCP domain is responsible for the antagonistic action of the Arabidopsis thaliana TCP8 and TCP23 transcription factors on flowering time. Plant. Sci. 328, 111571 (2023).
Liu, J. et al. MicroRNA319-regulated TCPs interact with FBHs and PFT1 to activate CO transcription and control flowering time in Arabidopsis. PLoS Genet. 13, e1006833 (2017).
Jin, K. et al. TCP transcription factors involved in shoot development of Ma bamboo (Dendrocalamus latiflorus Munro). Front. Plant. Sci. 13, 884443 (2022).
Viola, I. L., Camoirano, A. & Gonzalez, D. H. Redox-dependent modulation of anthocyanin biosynthesis by the TCP transcription factor TCP15 during exposure to high light intensity conditions in Arabidopsis. Plant. Physiol. 170, 74–85 (2016).
Danisman, S. TCP transcription factors at the interface between environmental challenges and the plant’s growth responses. Front. Plant. Sci. 7, 1930–1942 (2016).
Guan, P. et al. Interacting TCP and NLP transcription factors control plant responses to nitrate availability. Proc. Natl. Acad. Sci. U. S. A. 114, 2419–2424 (2017).
Mukhopadhyay, P. & Tyagi, A. K. OsTCP19 influences developmental and abiotic stress signaling by modulating ABI4-mediated pathways. Sci. Rep. 5, 9998–10008 (2015).
Wang, S. T. et al. MicroRNA319 positively regulates cold tolerance by targeting OsPCF6 and OsTCP21 in rice (Oryza sativa L). PLoS One. 9, e91357 (2014).
Bao, S., Zhang, Z., Lian, Q., Sun, Q. & Zhang, R. Evolution and expression of genes encoding TCP transcription factors in Solanum tuberosum reveal the involvement of StTCP23 in plant defence. BMC Genet. 20, 91 (2019).
Leng, X. et al. Genome-wide identification and transcript analysis of TCP transcription factors in grapevine. BMC Genom. 20, 786 (2019).
Liu, D. H. et al. Genome-wide analysis of citrus TCP transcription factors and their responses to abiotic stresses. BMC Plant. Biol. 22, 325 (2022).
Lei, N. et al. Phylogeny and expression pattern analysis of TCP transcription factors in cassava seedlings exposed to cold and/or drought stress. Sci. Rep. 7, 10016 (2017).
Tian, C. et al. Characterization of the TCP gene family in Chrysanthemum Nankingense and the role of CnTCP4 in cold tolerance. Plants (Basel). 11, 936 (2022).
Chen, C., Zhang, Y., Chen, Y., Chen, H. & Gong, R. Sweet Cherry TCP gene family analysis reveals potential functions of PavTCP1, PavTCP2 and PavTCP3 in fruit light responses. BMC Genom. 25, 3 (2024).
Liu, K. S. Soybeans: Chemistry, Technology, and Utilization (Springer, 2012).
Wu, J. et al. LWD-TCP complex activates the morning gene CCA1 in Arabidopsis. Nat. Commun. 7, 13181 (2016).
Xia, Z. et al. QNE1 is a key flowering regulator determining the length of the vegetative period in soybean cultivars. Sci. China Life Sci. 65, 2472–2490 (2022).
Kim, Y. et al. GmTCP and GmNLP underlying nodulation character in soybean depending on nitrogen. Int. J. Mol. Sci. 24, 7750 (2023).
Zhang, L. et al. GmTCP40 promotes soybean flowering under long-day conditions by binding to the GmAP1a promoter and upregulating its expression. Biomolecules 14, 465 (2024).
Zhang, S., Singh, M. B. & Bhalla, P. L. Molecular characterization of a soybean FT homologue, GmFT7. Sci. Rep. 11, 3651 (2021).
Feng, Z. J. et al. Soybean TCP transcription factors: evolution, classification, protein interaction and stress and hormone responsiveness. Plant. Physiol. Biochem. 127, 129–142 (2018).
Liu, D. K. et al. Genome-wide analysis of the TCP gene family and their expression pattern in Cymbidium goeringii. Front. Plant. Sci. 13, 1068969 (2022).
Li, H. et al. Genome-wide identification and characterization of TCP gene family members in Melastoma candidum. Molecules. 27, 9036 (2022).
Zou, Q. et al. Genome-wide analysis of TCP transcription factors and their expression pattern analysis of Rose plants (Rosa chinensis). Curr. Issues Mol. Biol. 45, 6352–6364 (2023).
Beaulaurier, J., Schadt, E. E. & Fang, G. Deciphering bacterial epigenomes using modern sequencing technologies. Nat. Rev. Genet. 20, 157–172 (2019).
Van Dongen, J. et al. Genome-wide analysis of DNA methylation in buccal cells: a study of monozygotic twins and mQTLs. Epigenet Chromatin. 11, 1–14 (2018).
Wang, X. et al. PpTCP18 is upregulated by lncRNA5 and controls branch number in Peach (Prunus persica) through positive feedback regulation of Strigolactone biosynthesis. Hortic. Res. 10, 224 (2022).
Ferro, E. & Trabalzini, L. The yeast two-hybrid and related methods as powerful tools to study plant cell signalling. Plant. Mol. Biol. 83, 287–301 (2013).
Zhang, Y., Li, Z., Ma, B., Hou, Q. & Wan, X. Phylogeny and functions of LOB domain proteins in plants. Int. J. Mol. Sci. 21, 2278 (2020).
Ma, Y., Wang, F. & Guo, J. Rice OsAS2 gene, a member of LOB domain family, functions in the regulation of shoot differentiation and leaf development. J. Plant. Biol. 52, 374–381 (2009).
Liu, M. et al. Transgenic expression of ThIPK2 gene in soybean improves stress tolerance, oleic acid content and seed size. Plant. Cell. Tiss Org. 111, 277–289 (2012).
Friedman, M., Noma, A. T. & Wagner, J. R. Ion-exchange chromatography of sulfur amino acids on a single-column amino acid analyzer. Anal. Biochem. 98, 293–304 (1979).
Murray, M. & Thompson, W. F. Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res. 8, 4321–4326 (1980).
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods. 12, 357–360 (2015).
Anders, S. & Huber, W. Differential expression of RNA-Seq data at the gene level—the DESeq package. Heidelberg, Germany: European Molecular Biology Laboratory (EMBL). 10, f1000 (2012).
Wei, P. C. et al. Overexpression of AtDOF4.7, an Arabidopsis DOF family transcription factor, induces floral organ abscission deficiency in Arabidopsis. Plant. Physiol. 153, 1031–1045 (2010).
Funding
This study was supported by the National Natural Science Foundation of China (32060613).
Author information
Contributions
Q.W. conceived and designed the project. H.W. collected samples. Q.W. and R.L. performed the experiments. Q.W. and N.Z. analyzed the data. Q.W. wrote the manuscript. W.B. revised the manuscript. All authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wu, Q., Wang, H., Lin, R. et al. Identification and characterization of TCP transcription factor GmTCP670 associated with soybean development. Sci Rep 15, 19707 (2025). https://doi.org/10.1038/s41598-025-04257-0