Abstract
Co-transcriptional cleavage and transcription termination are closely related processes during mRNA maturation, yet their coordination remains poorly understood due to difficulties in detecting these transient events. Here, we applied single-molecule nascent RNA sequencing to simultaneously capture the cleavage status and readthrough distance on the same nascent RNA molecules and characterize 14 mutants of various pre-mRNA processing factors in Arabidopsis. Our results reveal diverse roles for these processing factors in coordinating cleavage and termination: core components of CPSF and CstF complex stimulate both cleavage and termination, facilitating access to exoribonuclease AtXRN3; mutations in nuclear poly(A) polymerase PAPS1 and AtXRN3 caused delayed termination with minimal effects on cleavage, suggesting their roles are further downstream; BORDER proteins facilitate termination while simultaneously inhibiting cleavage, suggesting a complex interplay between these two actions; the phosphatase SSU72 specifically promotes efficient termination without affecting cleavage. Our method also enables us to distinguish cleaved readthrough transcripts from full-length readthrough, and we found termination factor FPA specifically promotes termination of cleaved readthrough, suggesting FPA facilitates access of AtXRN3 to the 3’ cleavage product. Our comprehensive datasets reveal cleavage and termination are highly coordinated during pre-mRNA processing.
Introduction
Transcription termination is the final stage of gene transcription, during which RNA polymerase II (Pol II) is released from the DNA template1,2,3. This process is closely linked to pre-mRNA 3’ end processing, specifically cleavage and polyadenylation (CPA), which ensures the integrity and functionality of mRNA3,4. The highly conserved CPA machinery interacts with elongating Pol II to recognize the polyadenylation signal (PAS) on the nascent RNA and to cleave the RNA4,5,6,7,8. Key components of this machinery include the CPA specificity factor (CPSF), cleavage stimulation factor (CstF), and cleavage Factors I and II (CFI and CFII)4,6,8. The CPSF complex contains three functional modules: the polyadenylation specificity module, the cleavage module, and the phosphatase module4,9. In the polyadenylation specificity module, FY (the WDR33 homolog10,11) and CPSF30 recognize the canonical AAUAAA PAS4,7,8, while FIP1 binds adjacent U-rich sequences to enhance polyadenylation6,12. Additionally, the Arabidopsis RNA-binding protein FCA regulates mRNA 3’ end processing by interacting with FY11,13. In the cleavage module, CPSF100 connects the endonuclease CPSF73 to the polyadenylation specificity module9,14,15. Co-transcriptional cleavage by CPSF73 produces two distinct products: the 5’ cleavage product awaiting polyadenylation by poly(A) polymerase, and the uncapped 3’ cleavage product targeted for degradation16,17. Arabidopsis encodes three canonical nuclear poly(A) polymerases: PAPS1, PAPS2, and PAPS4, with PAPS2 and PAPS4 exhibiting functional redundancy18,19,20. Meanwhile, the CstF complex assists CPSF, with CstF64 recognizing U-rich and GU-rich sequences downstream of the poly(A) site, and CstF77 connecting CstF64 to the CPSF complex4,6.
The recruitment and function of CPA factors are closely associated with the phosphorylation state of the Pol II carboxy-terminal domain (CTD), particularly the phosphorylation of Ser2 (Ser2-P) and Ser5 (Ser5-P)1,2,3,21,22,23. These modifications influence the recruitment of factors such as the CstF complex24,25 and PCFS4 (a PCF11 homolog26,27). The exchange of pre-mRNA processing factors on the CTD, or conformational changes in Pol II, may promote termination28,29,30. The CPA machinery also includes a phosphatase module4,31,32. Although the composition of this module in plants remains incompletely defined, SSU72 has emerged as a key candidate in Arabidopsis, where it dephosphorylates Ser5-P and Ser7-P on the Pol II CTD2,21,23,33. Its interaction with the cleavage module allows SSU72 to integrate into the CPA machinery and couple CTD dephosphorylation to pre-mRNA 3’ end processing4,34. Interestingly, while SSU72 has been shown to inhibit histone pre-mRNA 3’ end processing in the absence of transcription, it does not impair CPSF-mediated cleavage in vitro35,36,37. Dephosphorylation of the Pol II CTD can promote its transition from monomers to homodimers5. Additionally, BDR proteins (BDR1, BDR2, and BDR3) promote 3’ end Pol II pausing, which may prevent interference with downstream genes, and they interact with the termination factor FPA that associates indirectly with the CPA machinery38,39,40,41. The CPA machinery cleaves the nascent RNA at the poly(A) site4,8,16, exposing an entry point for the 5’ → 3’ exoribonuclease (Rat1 in yeast42, XRN2 in humans43, and AtXRN3 in Arabidopsis44) to degrade the cleaved RNA products45. This degradation allows the exoribonuclease to catch up with elongating Pol II, potentially leading to its release from the DNA template and terminating transcription, as described by the “torpedo” model42,43,46,47,48.
Increasing evidence suggests that Pol II conformational changes and exoribonuclease-mediated termination are not mutually exclusive but may act together to ensure efficient transcription termination48,49,50.
Several next-generation sequencing (NGS) based methods have been applied to study transcription termination at the transcriptome-wide scale51,52, such as GRO-seq53, PRO-seq54, Bru-Seq55, NET-seq56, mNET-seq57, and TT-seq58. These methods often involve the isolation of labeled nascent RNA or the pull-down of Pol II-associated RNA by immunoprecipitation, which may lose cleaved and polyadenylated pre-mRNA fractions51. Moreover, the limitations of short-read sequencing make it challenging to capture both cleavage and termination information on the same read. Long-read sequencing platforms like Nanopore or PacBio enable single-molecule characterization of nascent RNA, allowing simultaneous analysis of splicing, cleavage, and transcription termination51,52,59. This advancement has led to the development of methods52,59,60, such as nano-COP in human and Drosophila cells61,62, long-read sequencing of nascent RNA in yeast and mouse cells63,64, POINT-nano and nascONT-seq in human cells65,66, and FLEP-seq from our group in Arabidopsis67,68. Arabidopsis is particularly suitable for studying the coordination between cleavage and termination because these events are usually only several hundred base pairs apart68. In this study, we applied FLEP-seq68 to study 14 mutants of pre-mRNA processing factors in Arabidopsis, generating a comprehensive dataset of over 300 million Nanopore long reads. This resource furthers our understanding of how pre-mRNA processing factors coordinate co-transcriptional cleavage and transcription termination.
Results
Single-molecule nascent RNA profiling of pre-mRNA processing mutants in Arabidopsis
To explore the mechanisms coordinating PAS-dependent co-transcriptional cleavage and transcription termination in protein-coding genes, we characterized mutants of representative pre-mRNA processing factors in Arabidopsis using full-length nascent RNA sequencing (Supplementary Fig. 1a). These factors include components of the CPSF and CstF complexes, nuclear poly(A) polymerases, termination factors PCFS4, FCA and FPA, the CTD phosphatase SSU72, the exoribonuclease AtXRN3, and the Pol II pausing regulators known as BORDER (BDR) proteins. Specifically, we analyzed 14 previously characterized mutants corresponding to these factors, including single mutants for fy, cpsf30, fip1, cpsf100, paps1, cstf64, cstf77, pcfs4, fca, fpa, ssu72, and atxrn3, along with the paps2 paps4 double mutant and the reported bdr1 bdr2 bdr3 triple mutant (Supplementary Fig. 1a and Supplementary Table 1). Mutations in these essential processing factors lead to severe growth and developmental defects in Arabidopsis7,11.
For the mutants described above, we applied FLEP-seq (full-length elongating and polyadenylated RNA sequencing) to uncover the role of these processing factors in transcription termination68,69. FLEP-seq involves the isolation of nuclear RNA, which is then depleted of ribosomal RNA and converted into double-stranded cDNA for Nanopore sequencing (Fig. 1a). During library preparation, a 3’ adapter is ligated to the RNA transcripts, followed by template-switching reverse transcription, to ensure accurate and complete capture of the transcript. Our method captures transcription termination intermediates by appending the 3’ adapter to their ends (Fig. 1b), allowing us to identify and distinguish full-length readthrough transcripts, 5’ cleavage products, 3’ cleavage products and poly(A) transcripts (Supplementary Fig. 1b). Most of the FLEP-seq libraries were constructed from 12-day-old seedlings, except for the cstf64 and cstf77 mutants, where limited fertility necessitated using 4-week-old aerial parts24,70. We also used a cpsf100 point mutant in the C24 background15, as T-DNA mutants are embryo-lethal in Arabidopsis71. In summary, we constructed 32 FLEP-seq libraries, representing two biological replicates for each of the 13 mutants and their controls, totaling over 300 million raw reads, including ~70 million termination intermediates (Table 1). We also re-analyzed FLEP-seq data from atxrn3 mutants in our published study68. Together, this single-molecule nascent RNA sequencing dataset provides a comprehensive resource for studying mRNA maturation.
a Schematic of the FLEP-seq protocol for isolating nuclear RNA from Arabidopsis for nanopore sequencing. b Schematic of the transcription termination intermediates captured by FLEP-seq. c Example of readthrough transcription at AT4G36500. The gene structure of AT4G36500 is shown at the top, with a zoomed-in region highlighting the range from 300 nt upstream of the poly(A) site to the longest readthrough distance. The readthrough transcript distribution and termination windows are compared among WT, paps1, and paps2 paps4. A boxplot shows the distribution of all readthrough distances, with the termination window marked with an orange dashed line. The box plots display the median (central line), second to third quartiles (box), and whiskers extending to the minima and maxima. Outliers are indicated as individual points. The combined p-values comparing mutant to WT are shown (p-value = 1.90 × 10−5 for paps1; p-value = 0.98 for paps2 paps4). d Comparison of the termination window lengths between WT and mutants is shown for genes with at least 15 readthrough transcripts. Genes with a significantly increased (red) or decreased (blue) TW length are highlighted (p-value < 0.001). For (c and d), statistical significance was assessed for each replicate using a two-sided Mann–Whitney U test, and the resulting p-values were combined using Fisher’s method. All experiments were performed with two replicates. Source data are provided as a Source Data file.
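The caption above states that per-replicate p-values were combined using Fisher’s method. As a minimal sketch of that step (not the authors’ code; the function name `fisher_combine` is hypothetical), the combined p-value can be computed in closed form, because the test statistic follows a chi-square distribution with an even number of degrees of freedom:

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-square distributed with 2k
    degrees of freedom under the null hypothesis (k = number of p-values).
    For even degrees of freedom the survival function has a closed form,
    so only the standard library is needed.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2

    # P(chi2_{2k} >= X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Illustrative values only (not taken from the paper's data):
combined = fisher_combine([0.01, 0.01])
```

With two replicates the formula reduces to p1·p2·(1 − ln(p1·p2)), so two replicate p-values of 0.01 combine to roughly 1.0 × 10−3.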
Key pre-mRNA processing factors have global impacts on transcription termination
Readthrough distance reflects transcription termination efficiency1,3,72. We previously established a method using Nanopore sequencing to quantify the termination window (TW) by calculating the median readthrough distances in protein-coding genes with a single poly(A) site68 (Fig. 1c and Supplementary Fig. 1c). The readthrough distance is the length from the 3’ end of readthrough transcripts to the gene’s poly(A) site. Our TW measurements demonstrated high consistency between biological replicates, with Pearson correlation coefficients (R) for wild type (WT) and mutants typically ranging from 0.64 to 0.94 (median R = 0.85) (Supplementary Fig. 2). We then used changes in TW to quantify differences in termination efficiency between mutants and their corresponding WT. Our analysis revealed that TW ranges from ~50 nt to over 1000 nt and showed how individual factors affect the overall TW distribution (Supplementary Fig. 3a, Supplementary Data 1). We used Kolmogorov–Smirnov (K-S) tests for statistical significance and Cliff’s Delta (d) for effect size to determine the impact of each mutation on the overall TW distribution. The fca single mutant and the paps2 paps4 double mutant exhibited patterns similar to the WT (p > 0.05) (Supplementary Fig. 3b). Mutants cpsf30 and fpa exhibited slight shifts toward longer TW (d < 0.147, p < 0.05) (Supplementary Fig. 3b). In contrast, fy, fip1, cpsf100, paps1, bdr1 bdr2 bdr3, ssu72, cstf64, cstf77, pcfs4, and atxrn3 showed more pronounced shifts toward longer TW (d > 0.147, p < 0.05) (Supplementary Fig. 3b). These results reveal differences in TW length across mutants, suggesting that the corresponding factors play important roles in regulating transcription termination.
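The two quantities used in this paragraph can be illustrated with a short sketch. It assumes the TW is taken as the median readthrough distance per gene, applies the 15-read minimum mentioned in the Fig. 1 caption, and uses the standard definition of Cliff’s Delta. The function names and the argument order (mutant first, so that positive d means longer TW in the mutant) are our assumptions, not the authors’ implementation.

```python
from statistics import median


def termination_window(readthrough_distances, min_reads=15):
    """Median readthrough distance (nt) for one gene.

    Returns None when the gene has fewer than `min_reads` readthrough
    transcripts (the 15-read cutoff follows the Fig. 1d caption).
    """
    if len(readthrough_distances) < min_reads:
        return None
    return median(readthrough_distances)


def cliffs_delta(mutant_values, wt_values):
    """Cliff's Delta: (#pairs mutant > WT - #pairs mutant < WT) / (n * m).

    Ranges from -1 to 1; values above 0 mean the mutant distribution is
    shifted toward larger values. This direct pairwise version is O(n * m)
    and is meant for illustration rather than genome-scale use.
    """
    if not mutant_values or not wt_values:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in mutant_values for y in wt_values if x > y)
    less = sum(1 for x in mutant_values for y in wt_values if x < y)
    return (greater - less) / (len(mutant_values) * len(wt_values))


# Illustrative per-gene TW values (nt), not taken from the paper's data:
wt_tw = [120, 150, 200]
mutant_tw = [200, 260, 310]
d = cliffs_delta(mutant_tw, wt_tw)
```

In this toy example eight of the nine mutant/WT pairs favor the mutant and one is tied, giving d = 8/9, well above the 0.147 threshold used in the text to separate slight from pronounced shifts.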
Next, we performed a genome-wide comparison of TW length for all detected genes between mutants and their WT. For this analysis, TW differences for each gene were calculated from the average TW of replicates (Fig. 1d). Many pre-mRNA processing factors globally promote transcription termination (Fig. 1c, d). In fy, cpsf30, and fip1 mutants, hundreds of protein-coding genes showed significant increases in TW length compared to WT (Fig. 1d). About 10% of these genes were co-regulated by FY, CPSF30, and FIP1 (Supplementary Fig. 4a, Supplementary Fig. 4b). In the CPSF complex, FIP1 recruits nuclear poly(A) polymerase to add poly(A) tails to 5’ cleavage products6. It is unclear whether the nuclear poly(A) polymerase regulates transcription termination. Our analysis found that paps1 mutant exhibited increased readthrough transcription in hundreds of genes, while paps2 paps4 double mutant did not (Fig. 1c, d), highlighting the crucial role of PAPS1 in promoting termination. There was a 26% overlap in genes with increased readthrough in both fip1 and paps1 mutants (Supplementary Fig. 4c), suggesting a potential interaction between FIP1 and PAPS1 in promoting transcription termination. In Arabidopsis, FCA interacts with FY to regulate pre-mRNA 3’ end processing11, but fca mutant did not show significant termination defects (Fig. 1d and Supplementary Fig. 3b). In addition, PCFS4 regulates FCA alternative polyadenylation and interacts with FY26. Consistent with the role of PCF11 in promoting termination in mammals27, pcfs4 mutant exhibited increased readthrough transcription (Fig. 1d and Supplementary Fig. 3b).
Mutations in the CPSF cleavage module, such as CPSF100, typically impair termination15,16. Consistent with this, our data showed a significant increase in TW length for 140 genes in cpsf100 mutant (Fig. 1d). The CPSF cleavage module also interacts with the CstF complex and SSU729,34. In the CstF complex, a missense mutation in CstF77 causes readthrough transcription on heat-inducible genes under heat stress73. In our analysis, the cstf64 and cstf77 T-DNA null mutants showed an overall increase in TW length distribution and significant changes in hundreds of protein-coding genes (Fig. 1d and Supplementary Fig. 3b). There was an extensive overlap of 88% between the genes with increased readthrough transcription in cstf64 and cstf77 mutants (Supplementary Fig. 5). The ssu72 mutant also exhibited considerable TW increases (Fig. 1d and Supplementary Fig. 3b). In addition, the deletion of BDR proteins induced readthrough transcription (Fig. 1d and Supplementary Fig. 3b), consistent with their established role in 3’ end Pol II pausing38. Overall, the number of protein-coding genes with increased TW was considerably higher than the number with decreased TW in these mutants (fy, cpsf30, fip1, cpsf100, fpa, paps1, bdr1 bdr2 bdr3, cstf64, cstf77, pcfs4, ssu72, and atxrn3) (Fig. 1d). These observed TW changes reflect the combined influences of multiple processes involved in termination. BDR proteins and SSU72 directly regulate Pol II, by slowing it down or by CTD dephosphorylation; CPSF100 and the CstF complex act in co-transcriptional cleavage; and AtXRN3 degrades the uncapped 3’ cleavage product to promote transcription termination. Nevertheless, the loss of any of these distinct mechanisms often leads to a broader TW (Fig. 1d). This suggests their collective importance in ensuring efficient termination, although further analyses are needed to identify the specific contributions of each factor.
Measuring co-transcriptional cleavage activity by single-molecule nascent RNA sequencing
The CPA machinery recognizes the PAS and cleaves pre-mRNA at the poly(A) site4,8. FLEP-seq allows tracking of termination intermediates at a single-molecule level, thus enabling us to characterize cleavage activity at the poly(A) site. We introduced the cleavage index (CI) to quantify cleavage activity, described as the proportion of 5’ cleavage products relative to the sum of full-length readthrough transcripts and 5’ cleavage products (Fig. 2a and Supplementary Data 2). This approach is conceptually similar to the measurement of cleavage efficiency proposed in nascONT-seq66, which calculates the ratio of cleaved reads at the poly(A) site to all reads spanning the poly(A) site. As an example, the gene PSBP-1 (AT1G06680) exhibited 206 full-length readthrough transcripts and 181 5’ cleavage products, resulting in a CI of 0.47 in the WT (Fig. 2b). Biological replicates demonstrated high consistency in CI measurements, with Pearson correlation coefficients (R) for WT and mutants typically ranging from 0.60 to 0.82 (median R = 0.70) (Supplementary Fig. 6).
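The CI definition above amounts to a simple ratio. The following minimal sketch (the function name is hypothetical) reproduces the PSBP-1 example from the text:

```python
def cleavage_index(n_full_length_readthrough, n_five_prime_cleavage):
    """Cleavage index: 5' cleavage products / (full-length readthrough
    transcripts + 5' cleavage products).

    Returns None when a gene has no reads in either class, because the
    ratio is undefined in that case.
    """
    total = n_full_length_readthrough + n_five_prime_cleavage
    if total == 0:
        return None
    return n_five_prime_cleavage / total


# PSBP-1 (AT1G06680) in WT, counts as given in the text:
ci_psbp1 = cleavage_index(206, 181)
print(round(ci_psbp1, 2))  # prints 0.47
```

Polyadenylated reads do not enter the calculation, so the CI reflects the balance between uncleaved and freshly cleaved nascent molecules only.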
a Schematic of the cleavage index (CI) calculation. Full-length readthrough transcripts (blue) have a 5’ end more than 50 nt upstream of the poly(A) site and a 3’ end extending beyond 50 nt downstream. The 5’ cleavage products (red) have a 5’ end more than 50 nt upstream of the poly(A) site and a 3’ end within 50 nt upstream or downstream of the poly(A) site. The CI is calculated by dividing the number (m) of 5’ cleavage products by the total number (n + m) of full-length readthrough transcripts and 5’ cleavage products. b Example with PSBP-1 (AT1G06680) in WT. The upper panel shows full-length readthrough transcripts, and the bottom panel shows 5’ cleavage products. The number of reads and the corresponding CI are shown. c Comparison of CI between mutants and their corresponding WT. Scatter plots comparing the CI in mutants (x-axis) to that in WT (y-axis) across protein-coding genes. Each dot represents a single gene. The Pearson correlation coefficient (R2) is shown. The effect size of the difference in CI between mutant and WT for each gene was quantified using Cliff’s Delta (d), calculated from the mean CI of the replicates. The significance of the overall shift in CI distribution between WT and each mutant was determined using a two-sided t-test on the CI values of all genes. Source data are provided as a Source Data file.
Next, we used the CI to assess the impact of various processing factors on co-transcriptional cleavage activity. Differences in CI between each mutant and its corresponding WT were quantified using Cliff’s Delta (d), calculated from the average CI of the replicates (Fig. 2c). Efficient assembly of the cleavage module requires a strong interaction between CPSF100 and the endonucleases AtCPSF73-I/II9,15,74,75. Research on cleavage activity in Arabidopsis has been limited due to the lethality caused by the loss of either AtCPSF73-I or AtCPSF73-II14,74. Although the loss of CPSF100 is also lethal in Arabidopsis71, a viable cpsf100 point mutant provides an opportunity for further investigation15. Consistent with its central role in cleavage, we observed a clear decrease in the overall CI (d = −0.35) in the cpsf100 mutant compared to WT (Fig. 2c). Similar reductions were observed in fy (d = −0.28), pcfs4 (d = −0.44), cstf64 (d = −0.44), and cstf77 (d = −0.50) mutants, indicating that FY, PCFS4, and the CstF complex promote co-transcriptional cleavage (Fig. 2c). In contrast, we observed an increase in the overall CI in the bdr1 bdr2 bdr3 triple mutant, suggesting that BDR proteins may inhibit cleavage activity (Fig. 2c). Given that BDR proteins regulate Pol II pausing at the 3’ end38, one possible explanation is an indirect influence on cleavage activity through feedback involving Pol II pausing. Analysis of the ssu72 mutant revealed a negligible change in overall CI (d = 0.02) (Fig. 2c). This finding contrasts with a previous report that SSU72 is required for 3’ end cleavage of pre-mRNA in vitro76. Although SSU72 physically interacts with the cleavage module34, our in vivo result in Arabidopsis is more in line with another previous report showing that the entire phosphatase module, including SSU72, is dispensable for cleavage activity75. 
The CI can also be influenced by the efficiency of poly(A) tail synthesis, which converts the 5’ cleavage products into polyadenylated transcripts. In the paps2 paps4 double mutant, we observed an increased CI (d = 0.12) (Fig. 2c). Considering that PAPS2 and PAPS4 are important polymerases responsible for poly(A) tail synthesis in Arabidopsis18, we speculate that the increased CI is related to a reduced efficiency in poly(A) tail synthesis. For several other mutants, no discernible CI differences were observed in cpsf30 (d = 0.06), fip1 (d = −0.04), fpa (d = 0.08), fca (d = 0.00), paps1 (d = 0.08) and atxrn3 (d = 0.13) compared to WT (Fig. 2c). Our results reveal complex, factor-specific effects on co-transcriptional cleavage, highlighting the diverse roles these factors play in pre-mRNA 3’ end formation.
Different impacts of pre-mRNA processing factors on full-length and cleaved readthrough transcripts
Co-transcriptional cleavage provides an entry point for exoribonuclease-mediated transcription termination16,43,48. To explore how pre-mRNA processing factors affect the readthrough distance of full-length and cleaved readthrough transcripts, we split all readthrough transcripts into full-length readthrough transcripts and 3’ cleavage products based on their 5’ end position (Fig. 3a, b). Reads with a 5’ end located more than 50 nt upstream of the poly(A) site and a 3’ end extending beyond 50 nt downstream of the poly(A) site were defined as full-length readthrough transcripts (Fig. 3a). This criterion accurately identifies different termination intermediates, though it limits the display of the 3’ end distribution of full-length readthrough transcripts to the region beginning 50 nt downstream of the poly(A) site (Fig. 3c). Because the FLEP-seq library preparation selects reads that are 200 nt or longer for Nanopore sequencing, 3’ cleavage products shorter than 200 nt may have been lost during selection, resulting in a peak in the 3’ end distribution ~200 nt downstream of the poly(A) site (Fig. 3c).
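The read classes described here and in the Fig. 2 caption can be summarized as a small decision rule. In the sketch below, positions are expressed relative to the poly(A) site (negative upstream, positive downstream) on the transcribed strand. The rules for full-length readthrough transcripts and 5’ cleavage products follow the text; the rule for 3’ cleavage products (5’ end within 50 nt of the poly(A) site) and the `has_poly_a` flag are our assumptions, since the text only states that the split is based on the 5’ end position.

```python
WINDOW = 50  # nt around the poly(A) site, as used in the text


def classify_read(five_prime, three_prime, has_poly_a=False, window=WINDOW):
    """Assign a nascent RNA read to a termination-intermediate class.

    five_prime, three_prime: read end positions relative to the poly(A)
    site in nt (upstream negative, downstream positive).
    has_poly_a: hypothetical flag for reads carrying a poly(A) tail.
    """
    if has_poly_a:
        return "poly(A) transcript"

    starts_upstream = five_prime < -window
    ends_downstream = three_prime > window
    ends_at_site = abs(three_prime) <= window

    if starts_upstream and ends_downstream:
        return "full-length readthrough"
    if starts_upstream and ends_at_site:
        return "5' cleavage product"
    # Assumed rule: a read that begins at the cleavage site and extends
    # downstream is treated as the 3' cleavage product.
    if abs(five_prime) <= window and ends_downstream:
        return "3' cleavage product"
    return "other"


# Illustrative reads (coordinates are made up):
labels = [
    classify_read(-800, 320),   # spans the poly(A) site
    classify_read(-800, 3),     # ends at the poly(A) site
    classify_read(2, 450),      # starts at the poly(A) site
]
```

Under these rules the readthrough distance of a full-length readthrough transcript or a 3’ cleavage product is simply its `three_prime` coordinate.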
a Schematic of the independent analysis of full-length readthrough transcripts and 3’ cleavage products. Full-length readthrough transcripts are shown in blue, and 3’ cleavage products are shown in orange. b Example with RBCS1A (AT1G67090), showing the various transcription intermediates, including full-length readthrough transcripts, 5’ cleavage products, 3’ cleavage products, and poly(A) transcripts in WT and mutants. Box plots show the distribution of distances from the 3’ end of reads to the gene’s poly(A) site for full-length readthrough transcripts (blue) and 3’ cleavage products (orange). The box plots display the median (central line), second to third quartiles (box), and whiskers extending to the minima and maxima. Outliers are indicated as individual points. The significance of the change in readthrough between each mutant and WT was assessed using a two-sided Mann-Whitney U test. For full-length transcripts, the resulting p-values were 1.5 × 10−4 (fpa) and 0.90 (xrn3); for 3’ cleavage products, the p-values were 0.44 (fpa) and 1.5 × 10−12 (xrn3). Source data are provided as a Source Data file. c Meta-profile of the 3’ end distribution of all readthrough transcripts, full-length readthrough transcripts, and 3’ cleavage products downstream of the poly(A) site in mutants and corresponding WT. Position 0 on the x-axis represents the poly(A) site. The WT are shown in blue, and mutants in red. The significance of the difference between WT and mutant distributions was determined by a two-sided K-S test. Source data are provided as a Source Data file.
Next, we compared the 3’ end distributions downstream of the poly(A) site across all readthrough transcripts, full-length readthrough transcripts, and 3’ cleavage products between mutants and their WT using the K-S test. No clear changes were observed in the 3’ end distributions of all readthrough transcripts for fca and paps2 paps4 mutants (p > 0.05), suggesting minimal influence on genome-wide transcription termination (Fig. 3c). This is consistent with the observation that there is no significant change in TW length (Fig. 1d and Supplementary Fig. 3b). However, mutants fy, cpsf30, fip1, cpsf100, and paps1 exhibited slight increases in readthrough transcription, as shown by a rightward shift in distributions (Fig. 1d, c). These findings, combined with their role in regulating the TW length of hundreds of protein-coding genes (Fig. 1d), suggest that the CPSF complex and PAPS1 are involved in gene-specific regulation of transcription termination. Loss of BDR proteins and SSU72 resulted in substantial genome-wide increases in full-length readthrough transcription, implying that these factors promote Pol II termination after transcription extends beyond the poly(A) site, and that their role in facilitating transcription termination does not depend on cleavage (Fig. 3c). However, it seems unlikely that the same logic applies to PCFS4 and the CstF complex, which promote cleavage. By contrast, AtXRN3 specifically promoted Pol II termination after co-transcriptional cleavage, preventing readthrough transcription of 3’ cleavage products (Fig. 3b, c and Supplementary Fig. 7). The fpa mutant exhibited a similar pattern to atxrn3, implying FPA may be involved in facilitating the degradation of cleavage products (Fig. 3b, c and Supplementary Fig. 7).
Collectively, these analyses highlight distinct, stage-specific contributions of pre-mRNA processing factors, from influencing elongation after passing through the poly(A) site to promoting the degradation of the 3’ cleavage product.
Interplay between co-transcriptional cleavage and transcription termination
Disrupting co-transcriptional cleavage typically impairs transcription termination more than exoribonuclease deficiency16,17,65,77. However, cleavage alone may not be sufficient to induce termination1,30, suggesting the involvement of additional regulatory mechanisms or interplay among termination factors. To investigate the interaction between co-transcriptional cleavage and transcription termination, we calculated the Spearman correlation between differences in TW length and changes in CI in protein-coding genes. In the cpsf100 (R = −0.27), fy (R = −0.09), cstf64 (R = −0.34), cstf77 (R = −0.32), and pcfs4 (R = −0.19) mutants, genes with reduced cleavage activity exhibited increased readthrough, indicating a negative correlation (Fig. 4a). This finding is consistent with previous observations of increased readthrough transcription in genes with decreased cleavage efficiency identified by nascONT-seq66. These findings support the idea that these factors indirectly promote exoribonuclease-mediated termination by facilitating co-transcriptional cleavage, which allows AtXRN3 to degrade 3’ cleavage products (Figs. 2c, 4a). In contrast, the atxrn3 mutant exhibited increased TW length without a corresponding decrease in CI (Fig. 4a), consistent with the accumulation of 3’ cleavage products rather than an increase in full-length readthrough (Fig. 3c). We observed possible feedback in the bdr1 bdr2 bdr3 mutant, where transcripts from the same gene locus exhibited two distinct behaviors: while most transcripts were effectively cleaved, a subset might have escaped cleavage and continued transcription along the DNA template (Fig. 3c and Supplementary Fig. 8). In the paps1 mutant, the overall increase in TW length was not accompanied by a corresponding change in CI (Fig. 4a), and no clear correlation between differences in TW and changes in CI was observed in the cpsf30, fip1, fpa, and fca mutants (Fig. 4a).
While factors like CPSF100, FY, CstF64, CstF77, and PCFS4 appear to influence termination efficiency partly through their impact on co-transcriptional cleavage, other factors likely play more specific roles to ensure efficient termination.
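The correlation analysis in this section relates, for each gene, the difference in TW length (mutant minus WT) to the log2 fold change of the CI (as plotted in Fig. 4a). The sketch below shows one way to compute a Spearman coefficient with only the standard library, by ranking both variables (ties receive average ranks) and taking the Pearson correlation of the ranks. It illustrates the statistic and is not the authors’ pipeline; all function names and the toy values are ours.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman correlation = Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with >= 2 points")
    return pearson(average_ranks(xs), average_ranks(ys))


def log2_fold_change(ci_mutant, ci_wt):
    """log2(CI_mutant / CI_WT); both values must be greater than 0."""
    return math.log2(ci_mutant / ci_wt)


# Toy per-gene values (not from the paper): larger TW increases paired
# with stronger CI decreases give a negative coefficient.
delta_tw = [10, 40, 80, 150]
ci_lfc = [log2_fold_change(m, w) for m, w in
          [(0.50, 0.50), (0.40, 0.50), (0.30, 0.50), (0.20, 0.50)]]
rho = spearman(delta_tw, ci_lfc)
```

A rank-based coefficient suits this comparison because TW differences and CI fold changes are on different scales and neither is expected to be normally distributed.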
a The 2D density plot shows the relationship between the difference in termination window length (x-axis) and the log2 fold change of cleavage index (y-axis) between mutants and WT. Each dot represents a gene. The Spearman correlation coefficient (R) is displayed. The significance of the correlation was assessed using a two-sided t-test on the correlation coefficient. Source data are provided as a Source Data file. b, c A representative example of the sharp transcription termination upstream of tRNA genes in WT (b) and mutants (c). The structure of the protein-coding gene (AT4G36195) and the downstream tandemly arranged tRNA gene (orange) are shown. The zoomed-in region highlights 200 nt upstream of the poly(A) site to the longest readthrough distance. Poly(A) transcripts are shown in purple with poly(A) tail in red. Full-length readthrough transcripts are shown in blue. The termination window is marked with an orange dashed line, and brackets provide the distance between the termination window and the tRNA. The box plots display the median (central line), second to third quartiles (box), and whiskers extending to the minima and maxima. Outliers are indicated as individual points. Data shown are from biological replicate 1 (rep1), with biological replicate 2 (rep2) presented in Supplementary Fig. 9a. Source data are provided as a Source Data file.
The CstF complex is required for the sharp transcription termination upstream of tRNA genes
We previously discovered an efficient transcription termination mechanism for protein-coding genes in Arabidopsis that depends on the presence of an adjacent downstream tRNA gene68. This process is characterized by a sharp termination pattern immediately upstream of the tRNA gene, suggesting these tRNA regions may influence elongation, potentially acting as barriers to readthrough. To determine whether pre-mRNA processing factors influence this sharp transcription termination, we analyzed various mutants in our study.
As previously reported, the gene AT4G36195 serves as a representative example68. When arranged in tandem with a downstream tRNA gene, the readthrough transcription of AT4G36195 sharply terminates upstream of the tRNA gene (Fig. 4b). The distance between the end of the TW and the downstream tRNA is approximately 60 nt (Fig. 4b and Supplementary Fig. 9). In most mutants, this sharp termination upstream of tRNA genes effectively prevented transcriptional interference (Fig. 4c and Supplementary Fig. 9). However, in the cstf64 and cstf77 mutants, this termination pattern was disrupted, resulting in increased full-length readthrough transcription (Fig. 4c and Supplementary Fig. 9). Furthermore, the polyadenylated transcripts of AT4G36195 showed a shift in poly(A) site usage from proximal to distal (Fig. 4c). These results indicate that the CstF complex is crucial for maintaining sharp transcription termination upstream of tRNA genes.
Discussion
Our comprehensive single-molecule nascent RNA dataset of 14 mutants reveals that pre-mRNA processing factors differentially influence the coordination between co-transcriptional cleavage and transcription termination. Effective termination is essential to prevent readthrough transcription, which can lead to transcriptional interference and chimeric transcript formation2,3,48,72,78. We found that mutations in the CPSF polyadenylation specificity module had limited effects on overall TW distribution and co-transcriptional cleavage activity. Given the known role of this module in recognizing the PAS6,7,8, we speculate that the observed limited effects in these mutants may be due to defects in PAS recognition. The limited 10% overlap in genes misregulated in fy, cpsf30, and fip1 mutants suggests these components might collaboratively regulate a specific subset of genes particularly sensitive to CPSF complex integrity. This contrasts with the ~88% overlap observed between cstf64 and cstf77 mutants, highlighting the CstF complex’s function as a more tightly integrated unit. Furthermore, the ~26% overlap in genes exhibiting increased readthrough in both fip1 and paps1 mutants points to a more complex relationship: FIP1 and PAPS1 share common targets but also possess distinct regulatory scopes. Although FIP1 bridges CPSF and poly(A) polymerases6, PAPS1’s role in termination may extend beyond its interaction with FIP1, potentially involving other processing factors. Despite interactions with the CPSF polyadenylation specificity module11, the fca mutant did not show significant termination defects. FCA also interacts indirectly with other termination factors, such as CPSF100 and the CstF complex24,41. A previous study has shown that intergenic region expression is slightly changed in fca, significantly increased in fpa, and dramatically increased in the fpa/fca double mutant compared to WT13. 
Furthermore, recent studies indicate that FCA links transcription termination to chromatin silencing, complicating its role in gene transcription regulation79,80.
We showed that CPSF100, CstF64, CstF77, PCFS4, SSU72, AtXRN3, and BDR proteins are key regulators of transcription termination on a genome-wide scale. CstF64, CstF77, PCFS4, SSU72, and BDR proteins broadly contribute to termination, while AtXRN3 relies on cleavage to promote Pol II termination. Notably, loss of FPA resulted in a pronounced increase in the readthrough of 3’ cleavage products, similar to observations in atxrn3. Previous studies showed that loss of FPA leads to chimeric RNA formation, and proteomics analyses demonstrated associations between FPA, the CPA machinery, and AtXRN340. These findings suggest that FPA plays a significant role in exoribonuclease-mediated termination. In addition, we calculated the CI as the ratio of 5’ cleavage products to the sum of full-length readthrough transcripts and 5’ cleavage products. Polyadenylated reads were excluded from this calculation to avoid signal dilution from polyadenylated transcripts, minimize artifacts from cytoplasmic contamination, and isolate co-transcriptional cleavage activity from downstream polyadenylation processes. CPSF100, FY, CstF64, CstF77, and PCFS4 enhance co-transcriptional cleavage activity, suggesting they may facilitate the initiation of exoribonuclease-mediated termination. In contrast, our findings demonstrate that BDR proteins inhibit co-transcriptional cleavage activity. BDR proteins promote 3’ end Pol II pausing38, whereas mutations in AtXRN3 do not affect 3’ end Pol II pausing81, indicating that 3’ end pausing occurs before AtXRN3-mediated termination and highlighting a connection between co-transcriptional cleavage and 3’ end pausing. These observations suggest that BDR proteins may restrict co-transcriptional cleavage through feedback mechanisms involving Pol II pausing.
Our analysis focused on genes with a single poly(A) site to reveal coordination mechanisms between co-transcriptional cleavage and transcription termination. However, ~70% of Arabidopsis genes undergo alternative polyadenylation (APA), which is regulated by CPA machinery82. For example, CPSF and CstF complexes regulate APA in stress responses10,70,73,83,84, while FY and FCA cooperatively regulate APA to control flowering time85,86. APA can indirectly influence termination efficiency by promoting the use of distal poly(A) sites, thereby extending transcription beyond the canonical TW. To avoid this complexity, our analysis focuses on the direct impact of CPA machinery on cleavage and termination. However, this restricts understanding of how APA contributes to termination dynamics, particularly for genes like FCA or the long noncoding transcripts COOLAIR. Fully resolving APA’s role will necessitate single-molecule resolution of poly(A) site usage. This remains technically challenging due to the low abundance of many APA isoforms and the inherent limitations of long-read sequencing. Many APA events occur at low frequencies, requiring ultra-deep sequencing, and nanopore reads spanning distal poly(A) sites often lack sufficient coverage for quantification. Future advances in real-time, single-molecule technologies will be crucial for studying how APA impacts the coordination between cleavage and termination at competing poly(A) sites.
Finally, while we used a series of Arabidopsis mutant alleles, some of these factors play essential roles in plant growth and development. Consequently, only weak alleles were available for key proteins like FY, CPSF100, PAPS1, and XRN3 (Supplementary Table 1). This limitation may lead to an underestimation of their actual impact on transcription termination and co-transcriptional cleavage, as these mutations might not completely disrupt protein function. In addition, the constitutive absence of these target factors in stable mutants might allow for compensatory adaptation. Combining emerging plant-optimized degron systems, which enable acute protein degradation87,88,89, with FLEP-seq will help distinguish direct from downstream effects, improving our understanding of the dynamic interplay between co-transcriptional cleavage and termination. Collectively, the comprehensive datasets on transcription termination from various pre-mRNA processing factor mutants provide a valuable resource for exploring the regulation of transcription termination and its role in mRNA maturation.
Methods
Plant materials and growth conditions
The Arabidopsis mutant lines used in this study were in the Col-0 or C24 ecotype background. Mutants in the Col-0 background included the T-DNA insertion mutants fy-5 (SALK_005697)10, fip1-3 (SALK_087117)12, fpa-9 (SALK_011615)68, cstf64-2 (SAIL_794_G11)24, cstf77-2 (GK_136D03)24, pcfs4-1 (SALK_102934)26, ssu72-2 (SALK_059245)33, and paps1-4 (WiscDsLox441G5)20, the point mutants cpsf3090 and fca-980, the double mutant paps2 paps4 (paps2-1: SALK_126395; paps4-1: SALK_007979)19, and the triple mutant bdr1 bdr2 bdr3 (bdr1-1, SALK_142108C; bdr2-1, WISCDSLOX352H03; and bdr3-1, SALK_059905C)38. In the C24 ecotype background, esp5-1 (referred to as cpsf100 in this study) and its marker transgenic plant (with construct amp311, designated as WT in the experiments with cpsf100) have been previously reported15. Primer sequences used for genotyping are listed in Supplementary Table 2.
For mutants cpsf30, fy, fip1, cpsf100, paps1, paps2 paps4, fca, fpa, pcfs4, ssu72, and bdr1 bdr2 bdr3, 12-day-old seedlings were used for FLEP-seq library construction. Seeds were stratified on 1/2 Murashige and Skoog (MS) plates at 4 °C for 2 days. Plates were then placed vertically, and seedlings were grown at 22 °C for 12 days (16 h light–8 h dark) before harvest. For mutants cstf64, cstf77, and the corresponding WT, 7-day-old seedlings were transferred from 1/2 MS plates to a soil-vermiculite mixture and grown at 22 °C (16 h light–8 h dark) for 4 weeks. The aerial parts were then harvested. All harvested samples were immediately frozen in liquid nitrogen and stored at −80 °C for subsequent experiments.
Nuclear RNA isolation and library construction
To isolate nuclear RNA, ~1 g of plant tissue was ground to a powder in liquid nitrogen and homogenized in ice-cold Honda buffer (0.44 M sucrose, 1.25% (w/v) Ficoll, 2.5% (w/v) dextran T40, 20 mM HEPES-KOH pH 7.4, 10 mM MgCl₂, 0.5% (w/v) Triton X-100, 1 mM DTT, 1× protease inhibitor, and 100 ng/μl tRNA)91. The homogenate was filtered through Miracloth, and the filtrate was centrifuged (2000 × g, 5 min, 4 °C) to pellet the nuclei. The nuclear pellet was then washed twice with Honda buffer, transferred to a 1.5 ml microcentrifuge tube, and re-pelleted by centrifugation (8000 × g, 1 min, 4 °C). After completely removing the supernatant, nuclear RNA was extracted using the RNAprep Pure Plant Plus Kit (TIANGEN, DP441) according to the manufacturer’s instructions.
For library construction, ribosomal RNA (rRNA) was depleted from nuclear RNA using pan-plant riboPOOLs probes (siTOOLs Biotech) and Dynabeads MyOne Streptavidin C1 beads (Thermo Fisher, no. 65001). The rRNA-depleted RNA was subsequently purified with the RNA Clean & Concentrator-5 kit (ZYMO, R1013). A 3’ adapter (5’-rAppCTGTAGGCACCATCAAT–NH₂-3’; NEB, S1315S) was ligated to the RNA in a 10-h reaction at 16 °C using T4 RNA ligase 2, truncated K227Q (NEB, M0242). Following the Oxford Nanopore Technologies (ONT) PCR-cDNA sequencing kit protocol (SQK-PCS109), first-strand cDNA was synthesized using Maxima H Minus Reverse Transcriptase (Thermo Fisher, EP0752) and a custom reverse transcription primer (100 nM, 5’-phos/ACTTGCCTGTCGCTCTATCTTCATTGATGGTGCCTACAG-3’)69. To minimize amplification bias, the optimal number of PCR cycles was determined before amplifying the library for 10–16 cycles with PrimeSTAR GXL DNA Polymerase (TaKaRa, R050A). The amplified cDNA library was purified with 0.8× AMPure XP beads (Beckman, A63880), and the final concentration was quantified using a Qubit fluorometer. For sequencing, 100 fmol of the library was loaded onto an R9.4 flow cell and sequenced on a MinION device.
Nanopore data processing
Data processing followed the FLEP-seq pipeline (https://github.com/ZhaiLab-SUSTech/FLEPSeq)68,69. The Guppy basecaller (v4.0.11) was used to convert raw nanopore signals into sequences with the parameters --c dna_r9.4.1_450bps_hac.cfg and --qscore_filtering. Reads with a quality score > 7 were mapped to the Arabidopsis TAIR10 genome using Minimap2 (v2.10-r761)92 and the Araport11 reference annotation (https://www.arabidopsis.org/download/file?path=Genes/Araport11_genome_release/archived/Araport11_GFF3_genes_transposons.Jun2016.gff.gz). Alignment parameters included “-ax splice, --secondary=no, and -G 12000”. Samtools93 was used to filter out unmapped reads, non-primary alignment reads, supplementary alignment reads, and reads with a MAPQ less than 1, using -F 2308 and -q 1. We extracted the soft-clipped sequences at the 5’ or 3’ ends of reads, along with the flanking 20 nt of mapped sequences, to search for the Strand Switching Primers sequence (5’-AAGCAGTGGTATCAACGCAGAGTACATGGG-3’) and the custom 3’ adapter sequence (5’-ATTGATGGTGCCTACAG-3’) using our custom script adapterFinder.py. Read integrity was verified by confirming the presence of both sequences and their strand information. Only reads containing both sequences were used for subsequent analysis. The length of poly(A) tails in nanopore reads was estimated using PolyAcaller to distinguish polyadenylated transcripts from non-poly(A) reads based on the length of the poly(A) tails68,69.
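As an illustration of the Samtools filter described above, the following minimal Python sketch decomposes the -F 2308 exclusion mask into its SAM flag bits and applies the same test to a single record. The function name passes_filter is hypothetical and is not part of the FLEP-seq pipeline; it only mirrors the stated -F 2308 and -q 1 settings.

```python
# SAM flag bits as defined in the SAM format specification.
UNMAPPED = 0x4          # 4: read is unmapped
SECONDARY = 0x100       # 256: non-primary (secondary) alignment
SUPPLEMENTARY = 0x800   # 2048: supplementary alignment

# Sum of the three bits; this is the value passed to -F.
EXCLUDE_MASK = UNMAPPED | SECONDARY | SUPPLEMENTARY
MIN_MAPQ = 1            # value passed to -q; records with lower MAPQ are skipped


def passes_filter(flag: int, mapq: int) -> bool:
    """Hypothetical helper mimicking `-F 2308 -q 1` for one alignment record.

    A record is kept only if none of the excluded flag bits is set and its
    mapping quality is at least MIN_MAPQ.
    """
    return (flag & EXCLUDE_MASK) == 0 and mapq >= MIN_MAPQ
```

For instance, a primary reverse-strand alignment (flag 16) with MAPQ 60 would be kept, whereas a supplementary alignment (flag 2064) or a record with MAPQ 0 would be discarded.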
Classification of RNA intermediates produced during transcription termination
As previously described68, reads were classified as poly(A) transcripts if the poly(A) tail length was 15 nt or longer; those with shorter tails were considered non-polyadenylated. To eliminate sequencing errors, reads with a distance of less than 5 nt between the mapped region and the 3’ adapter were also classified as non-polyadenylated transcripts. We used the poly(A) sites identified in our previous study68. For non-polyadenylated reads, we extracted the 5’ and 3’ end coordinates from BAM files using Pysam (v0.22.1)93 for more precise classification: i) full-length readthrough transcripts were defined as reads whose 5’ end was located more than 50 nt upstream of the poly(A) site and whose 3’ end extended more than 50 nt downstream of the poly(A) site; ii) 5’ cleavage products were defined as reads whose 5’ end was located more than 50 nt upstream of the poly(A) site and whose 3’ end was located in the region from −50 to +50 nt relative to the poly(A) site; iii) 3’ cleavage products were defined as reads whose 5’ end was positioned within 200 nt downstream of the poly(A) site. If two neighboring genes were closely spaced on the same strand (less than 200 nt apart), 3’ cleavage products were only used if their 5’ end was located upstream of the transcription start site of the downstream gene.
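The classification rules above can be sketched in Python as follows. This is a simplified illustration rather than the code used in the study: the function name, the category labels, and the convention of passing strand-aware offsets relative to the poly(A) site (negative upstream, positive downstream) are assumptions, as is treating "within 200 nt downstream" as the inclusive range 0 to 200. The 5-nt adapter-distance rule and the neighboring-gene check are omitted.

```python
from typing import Optional

POLYA_MIN_TAIL = 15     # nt; tails of this length or longer mark poly(A) transcripts
FLANK = 50              # nt window on either side of the poly(A) site
THREE_PRIME_MAX = 200   # nt downstream limit for the 5' end of 3' cleavage products


def classify_read(rel_5p: int, rel_3p: int, tail_len: int) -> Optional[str]:
    """Classify one read (hypothetical helper).

    rel_5p and rel_3p are the read's 5' and 3' end positions relative to the
    poly(A) site in transcript orientation: negative values lie upstream,
    positive values downstream. Returns None for reads fitting no category,
    for example elongating reads that end well upstream of the poly(A) site.
    """
    if tail_len >= POLYA_MIN_TAIL:
        return "polyA"
    if rel_5p < -FLANK:
        if rel_3p > FLANK:
            return "full_length_readthrough"
        if -FLANK <= rel_3p <= FLANK:
            return "5p_cleavage_product"
        return None
    if 0 <= rel_5p <= THREE_PRIME_MAX:
        return "3p_cleavage_product"
    return None
```

Under these assumptions, a read starting 500 nt upstream and ending 300 nt downstream of the poly(A) site with no tail would be labeled a full-length readthrough transcript, while a read starting 20 nt downstream of the site would be labeled a 3’ cleavage product.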
Calculation of termination window length and cleavage index
To calculate the readthrough distance, we analyzed all readthrough transcripts, defining this distance as the span from the 3’ end of readthrough transcripts to the poly(A) site. Only protein-coding genes with a single poly(A) site and at least 15 readthrough transcripts were included in subsequent analyses. The median readthrough distance was used to represent the TW length for each gene. To quantify the cleavage activity at poly(A) sites, we calculated the CI for representative poly(A) sites in both WT and mutants. We counted the number of reads corresponding to full-length readthrough transcripts, 5’ cleavage products, 3’ cleavage products, and poly(A) transcripts for each poly(A) site. The CI was calculated by dividing the number of 5’ cleavage products by the sum of full-length readthrough transcripts and 5’ cleavage products. Only genes with a combined total of more than 15 full-length readthrough transcripts and 3’ cleavage products were included for CI analyses.
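A minimal Python sketch of the two per-gene quantities defined above is given below. The function names are hypothetical, and the read-count filter for the CI analysis is left to the caller; only the median-based TW length and the CI ratio stated in the text are implemented.

```python
from statistics import median
from typing import List, Optional

MIN_READTHROUGH = 15  # minimum number of readthrough transcripts per gene for TW


def tw_length(readthrough_distances: List[int]) -> Optional[float]:
    """Median readthrough distance (3' end minus poly(A) site) for one gene.

    Returns None when the gene has fewer than MIN_READTHROUGH readthrough
    transcripts and would therefore be excluded from the analysis.
    """
    if len(readthrough_distances) < MIN_READTHROUGH:
        return None
    return median(readthrough_distances)


def cleavage_index(n_full_length: int, n_5p_cleavage: int) -> Optional[float]:
    """CI = 5' cleavage products / (full-length readthrough + 5' cleavage products).

    Poly(A) transcripts and 3' cleavage products do not enter the ratio.
    Returns None when both counts are zero.
    """
    total = n_full_length + n_5p_cleavage
    if total == 0:
        return None
    return n_5p_cleavage / total
```

As a worked example, a poly(A) site with 30 full-length readthrough reads and 70 5’ cleavage products would have a CI of 70 / (30 + 70) = 0.7.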
Statistical analysis
To compare the overall TW distributions and 3’ end distributions between mutant and WT, K-S tests were performed. The effect size for TW distribution comparisons was determined using Cliff’s Delta (d). Pearson’s correlation coefficient was used to measure statistical relationships. For gene-specific TW analysis, differences between mutant and WT were calculated using the average TW from two biological replicates. The significance of these gene-specific TW differences was assessed by independently performing the Mann-Whitney U test on each replicate. The resulting p-values from replicate tests were combined using Fisher’s method, with statistical significance defined as p < 0.001. Differences in CI between mutant and WT were quantified using Cliff’s Delta (d), calculated from the average CI of the replicates. Furthermore, the Spearman correlation was used to analyze the relationship between differences in TW length and changes in CI in protein-coding genes.
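For readers unfamiliar with the two less common statistics named above, the following standard-library Python sketch shows how Cliff’s Delta and Fisher’s method can be computed. It is an illustration under stated assumptions, not the analysis code of this study: the function names are hypothetical, the p-values are assumed to be independent and strictly greater than zero, and the closed-form chi-square tail is used only because the degrees of freedom (2k for k p-values) are always even.

```python
import math
from typing import Sequence


def cliffs_delta(x: Sequence[float], y: Sequence[float]) -> float:
    """Cliff's Delta: (pairs with x > y minus pairs with x < y) / (len(x) * len(y)).

    Ranges from -1 (all x below all y) to +1 (all x above all y).
    """
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine k independent p-values with Fisher's method.

    The statistic s = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom; for even degrees of freedom its upper tail is
    exp(-s/2) * sum over i < k of (s/2)**i / i!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals s / 2
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail
```

With two replicates, the combined value reduces to p1·p2·(1 − ln(p1·p2)), so two replicate p-values of 0.01 each give a combined p of about 0.00102, which would not pass the p < 0.001 threshold used here.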
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The FLEP-seq data generated in this study have been deposited in the Genome Sequence Archive at the National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, under accession number CRA019307. The atxrn3 FLEP-seq data were obtained from a previously published study (GSA: CRA005016). Source data are provided with this paper.
Code availability
The code for analysis and visualization of this study has been uploaded to GitHub (https://github.com/ZhaiLab-SUSTech/Termination_project)94.
References
Porrua, O. & Libri, D. Transcription termination and the control of the transcriptome: why, where and how to stop. Nat. Rev. Mol. Cell Biol. 16, 190–202 (2015).
Rodríguez-Molina, J. B., West, S. & Passmore, L. A. Knowing when to stop: transcription termination on protein-coding genes by eukaryotic RNAPII. Mol. Cell 83, 404–415 (2023).
Proudfoot, N. J. Transcriptional termination in mammals: stopping the RNA polymerase II juggernaut. Science 352, aad9926 (2016).
Boreikaitė, V. & Passmore, L. A. 3′-End Processing of eukaryotic mRNA: machinery, regulation, and impact on gene expression. Annu. Rev. Biochem. 92, 199–225 (2023).
Carminati, M., Rodríguez-Molina, J. B., Manav, M. C., Bellini, D. & Passmore, L. A. A direct interaction between CPF and RNA Pol II links RNA 3′ end processing to transcription. Mol. Cell 83, 4461–4478.e13 (2023).
Tian, B. & Manley, J. L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 18, 18–30 (2017).
Lin, J. & Li, Q. Q. Coupling epigenetics and RNA polyadenylation: missing links. Trends Plant Sci. 28, 223–234 (2023).
Mitschka, S. & Mayr, C. Context-specific regulation and function of mRNA alternative polyadenylation. Nat. Rev. Mol. Cell Biol. 23, 779–796 (2022).
Zhang, Y., Sun, Y., Shi, Y., Walz, T. & Tong, L. Structural insights into the human pre-mRNA 3′-end processing machinery. Mol. Cell 77, 800–809.e6 (2020).
Yu, Z., Lin, J. & Li, Q. Q. Transcriptome analyses of FY mutants reveal its role in mRNA alternative polyadenylation. Plant Cell 31, 2332–2352 (2019).
Wu, Z., Fang, X., Zhu, D. & Dean, C. Autonomous pathway: FLOWERING LOCUS C repression through an antisense-mediated chromatin-silencing mechanism. Plant Physiol. 182, 27–37 (2020).
Li, Y. et al. The Arabidopsis pre-mRNA 3′ end processing-related protein FIP1 promotes seed dormancy via the DOG1 and ABA pathways. Plant J. 115, 494–509 (2023).
Sonmez, C. et al. RNA 3′ processing functions of Arabidopsis FCA and FPA limit intergenic transcription. Proc. Natl. Acad. Sci. 108, 8508–8513 (2011).
Zhao, H., Xing, D. & Li, Q. Q. Unique features of plant cleavage and polyadenylation specificity factor revealed by proteomic studies. Plant Physiol. 151, 1546–1556 (2009).
Lin, J., Xu, R., Wu, X., Shen, Y. & Li, Q. Q. Role of cleavage and polyadenylation specificity factor 100: anchoring poly(A) sites and modulating transcription termination. Plant J. 91, 829–839 (2017).
Liu, H. & Moore, C. L. On the cutting edge: regulation and therapeutic potential of the mRNA 3′ end nuclease. Trends Biochem. Sci. 46, 772–784 (2021).
Eaton, J. D. et al. Xrn2 accelerates termination by RNA polymerase II, which is underpinned by CPSF73 activity. Genes Dev. 32, 127–139 (2018).
Vi, S. L. et al. Target specificity among canonical nuclear poly(A) polymerases in plants modulates organ growth and pathogen response. Proc. Natl. Acad. Sci. 110, 13994–13999 (2013).
Czesnick, H. & Lenhard, M. Antagonistic control of flowering time by functionally specialized poly(A) polymerases in Arabidopsis thaliana. Plant J. 88, 570–583 (2016).
Trost, G. et al. Arabidopsis poly(A) polymerase PAPS1 limits founder-cell recruitment to organ primordia and suppresses the salicylic acid-independent immune response downstream of EDS1/PAD4. Plant J. 77, 688–699 (2014).
Hsin, J.-P. & Manley, J. L. The RNA polymerase II CTD coordinates transcription and RNA processing. Genes Dev. 26, 2119–2137 (2012).
Osman, S. & Cramer, P. Structural biology of RNA polymerase II transcription: 20 years on. Annu. Rev. Cell Dev. Biol. 36, 1–34 (2020).
Hajheidari, M., Koncz, C. & Eick, D. Emerging roles for RNA polymerase II CTD in Arabidopsis. Trends Plant Sci. 18, 633–643 (2013).
Liu, F., Marquardt, S., Lister, C., Swiezewski, S. & Dean, C. Targeted 3′ processing of antisense transcripts triggers Arabidopsis FLC chromatin silencing. Science 327, 94–97 (2010).
Glover-Cutter, K., Kim, S., Espinosa, J. & Bentley, D. L. RNA polymerase II pauses and associates with pre-mRNA processing factors at both ends of genes. Nat. Struct. Mol. Biol. 15, 71–78 (2008).
Xing, D., Zhao, H., Xu, R. & Li, Q. Q. Arabidopsis PCFS4, a homologue of yeast polyadenylation factor Pcf11p, regulates FCA alternative processing and promotes flowering time. Plant J. 54, 899–910 (2008).
Kamieniarz-Gdula, K. et al. Selective roles of vertebrate PCF11 in premature and full-length transcript termination. Mol. Cell 74, 158–172.e9 (2019).
Logan, J., Falck-Pedersen, E., Darnell, J. E. & Shenk, T. A poly(A) addition site and a downstream termination region are required for efficient cessation of transcription by RNA polymerase II in the mouse beta maj-globin gene. Proc. Natl. Acad. Sci. 84, 8306–8310 (1987).
Kim, M., Ahn, S., Krogan, N. J., Greenblatt, J. F. & Buratowski, S. Transitions in RNA polymerase II elongation complexes at the 3′ ends of genes. EMBO J. 23, 354–364 (2004).
Zhang, H., Rigo, F. & Martinson, H. G. Poly(A) Signal-dependent transcription termination occurs through a conformational change mechanism that does not require cleavage at the poly(A) site. Mol. Cell 59, 437–448 (2015).
Casañal, A. et al. Architecture of eukaryotic mRNA 3′-end processing machinery. Science 358, 1056–1059 (2017).
Kumar, A., Clerici, M., Muckenfuss, L. M., Passmore, L. A. & Jinek, M. Mechanistic insights into mRNA 3′-end processing. Curr. Opin. Struct. Biol. 59, 143–150 (2019).
Tian, Y. et al. PRC2 recruitment and H3K27me3 deposition at FLC require FCA binding of COOLAIR. Sci. Adv. 5, eaau7246 (2019).
Xiang, K. et al. Crystal structure of the human symplekin–Ssu72–CTD phosphopeptide complex. Nature 467, 729–733 (2010).
Boreikaite, V., Elliott, T. S., Chin, J. W. & Passmore, L. A. RBBP6 activates the pre-mRNA 3′ end processing machinery in humans. Genes Dev. 36, 210–224 (2022).
Schmidt, M. et al. Reconstitution of 3′ end processing of mammalian pre-mRNA reveals a central role of RBBP6. Genes Dev. 36, 195–209 (2022).
Sun, Y. et al. Structure of an active human histone pre-mRNA 3′-end processing machinery. Science 367, 700–703 (2020).
Yu, X., Martin, P. G. P. & Michaels, S. D. BORDER proteins protect expression of neighboring genes by promoting 3′ Pol II pausing in plants. Nat. Commun. 10, 4359 (2019).
Yu, X. et al. The BORDER family of negative transcription elongation factors regulates flowering time in Arabidopsis. Curr. Biol. 31, 5377–5384.e5 (2021).
Parker, M. T. et al. Widespread premature transcription termination of Arabidopsis thaliana NLR genes by the spen protein FPA. eLife 10, e65537 (2021).
Fang, X. et al. Arabidopsis FLL2 promotes liquid–liquid phase separation of polyadenylation complexes. Nature 569, 265–269 (2019).
Kim, M. et al. The yeast Rat1 exonuclease promotes transcription termination by RNA polymerase II. Nature 432, 517–522 (2004).
West, S., Gromak, N. & Proudfoot, N. J. Human 5′ → 3′ exonuclease Xrn2 promotes transcription termination at co-transcriptional cleavage sites. Nature 432, 522–525 (2004).
Krzyszton, M. et al. Defective XRN3-mediated transcription termination in Arabidopsis affects the expression of protein-coding genes. Plant J. 93, 1017–1031 (2018).
Cortazar, M. A. et al. Xrn2 substrate mapping identifies torpedo loading sites and extensive premature termination of RNA pol II transcription. Genes Dev. 36, 1062–1078 (2022).
Zeng, Y., Zhang, H.-W., Wu, X.-X. & Zhang, Y. Structural basis of exoribonuclease-mediated mRNA transcription termination. Nature 628, 887–893 (2024).
Connelly, S. & Manley, J. L. A functional mRNA polyadenylation signal is required for transcription termination by RNA polymerase II. Genes Dev. 2, 440–452 (1988).
Eaton, J. D. & West, S. Termination of transcription by RNA Polymerase II: BOOM!. Trends Genet. 36, 664–675 (2020).
Luo, W., Johnson, A. W. & Bentley, D. L. The role of Rat1 in coupling mRNA 3′-end processing to transcription termination: implications for a unified allosteric–torpedo model. Genes Dev. 20, 954–965 (2006).
Cortazar, M. A. et al. Control of RNA Pol II speed by PNUTS-PP1 and Spt5 dephosphorylation facilitates termination by a “Sitting Duck Torpedo” mechanism. Mol. Cell 76, 896–908.e4 (2019).
Wissink, E. M., Vihervaara, A., Tippens, N. D. & Lis, J. T. Nascent RNA analyses: tracking transcription and its regulation. Nat. Rev. Genet. 20, 705–723 (2019).
Shine, M. et al. Co-transcriptional gene regulation in eukaryotes and prokaryotes. Nat. Rev. Mol. Cell Biol. 25, 534–554 (2024).
Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).
Kwak, H., Fuda, N. J., Core, L. J. & Lis, J. T. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950–953 (2013).
Paulsen, M. T. et al. Coordinated regulation of synthesis and stability of RNA during the acute TNF-induced proinflammatory response. Proc. Natl. Acad. Sci. 110, 2240–2245 (2013).
Churchman, L. S. & Weissman, J. S. Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature 469, 368–373 (2011).
Nojima, T. et al. Mammalian NET-seq reveals genome-wide nascent transcription coupled to RNA processing. Cell 161, 526–540 (2015).
Schwalb, B. et al. TT-seq maps the human transient transcriptome. Science 352, 1225–1228 (2016).
Merens, H. E., Choquet, K., Baxter-Koenigs, A. R. & Churchman, L. S. Timing is everything: advances in quantifying splicing kinetics. Trends Cell Biol. 34, 968–981 (2024).
Nojima, T. & Proudfoot, N. J. Mechanisms of lncRNA biogenesis as revealed by nascent transcriptomics. Nat. Rev. Mol. Cell Biol. 23, 389–406 (2022).
Drexler, H. L. et al. Revealing nascent RNA processing dynamics with nano-COP. Nat. Protoc. 16, 1343–1375 (2021).
Drexler, H. L., Choquet, K. & Churchman, L. S. Splicing kinetics and coordination revealed by direct nascent RNA sequencing through nanopores. Mol. Cell 77, 985–998.e8 (2020).
Reimer, K. A., Mimoso, C. A., Adelman, K. & Neugebauer, K. M. Co-transcriptional splicing regulates 3′ end cleavage during mammalian erythropoiesis. Mol. Cell 81, 998–1012.e7 (2021).
Herzel, L., Straube, K. & Neugebauer, K. M. Long-read sequencing of nascent RNA reveals coupling among RNA processing events. Genome Res. 28, 1008–1019 (2018).
Sousa-Luís, R. et al. POINT technology illuminates the processing of polymerase-associated intact nascent transcripts. Mol. Cell 81, 1935–1950.e6 (2021).
Arnold, M., Bressin, A., Jasnovidova, O., Meierhofer, D. & Mayer, A. A BRD4-mediated elongation control point primes transcribing RNA polymerase II for 3′-processing and termination. Mol. Cell 81, 3589–3603.e13 (2021).
Jia, J. et al. Post-transcriptional splicing of nascent RNA contributes to widespread intron retention in plants. Nat. Plants 6, 780–788 (2020).
Mo, W. et al. Landscape of transcription termination in Arabidopsis revealed by single-molecule nascent RNA sequencing. Genome Biol. 22, 322 (2021).
Jia, J. et al. An atlas of plant full-length RNA reveals tissue-specific and monocots–dicots conserved regulation of poly(A) tail length. Nat. Plants 8, 1118–1126 (2022).
Zeng, W. et al. Modulation of auxin signaling and development by polyadenylation machinery. Plant Physiol. 179, 686–699 (2019).
Tzafrir, I. et al. Identification of genes required for embryo development in Arabidopsis. Plant Physiol. 135, 1206–1220 (2004).
Rosa-Mercado, N. A. & Steitz, J. A. Who let the DoGs out? – Biogenesis of stress-induced readthrough transcripts. Trends Biochem. Sci. 47, 206–217 (2022).
Kim, M., Swenson, J., McLoughlin, F. & Vierling, E. Mutation of the polyadenylation complex subunit CstF77 reveals that mRNA 3′ end formation and HSP101 levels are critical for a robust heat stress response. Plant Cell 35, 924–941 (2023).
Xu, R. et al. The 73 kD Subunit of the cleavage and polyadenylation specificity factor (CPSF) complex affects reproductive development in Arabidopsis. Plant Mol. Biol. 61, 799–815 (2006).
Hill, C. H. et al. Activation of the endonuclease that defines mRNA 3′ ends requires incorporation into an 8-subunit core cleavage and polyadenylation factor complex. Mol. Cell 73, 1217–1231.e11 (2019).
He, X. et al. Functional interactions between the transcription and mRNA 3′ end processing machineries mediated by Ssu72 and Sub1. Genes Dev. 17, 1030–1042 (2003).
Eaton, J. D., Francis, L., Davidson, L. & West, S. A unified allosteric/torpedo mechanism for transcriptional termination on human protein-coding genes. Genes Dev. 34, 132–145 (2020).
Grzechnik, P. & Mischo, H. Fateful decisions of where to cut the line: pathology associated with aberrant 3′ end processing and transcription termination. J. Mol. Biol. 437, 168802 (2025).
Menon, G. et al. Proximal termination generates a transcriptional state that determines the rate of establishment of Polycomb silencing. Mol. Cell 84, 2255–2271.e9 (2024).
Mateo-Bonmatí, E. et al. A CPF-like phosphatase module links transcription termination to chromatin silencing. Mol. Cell 84, 2272–2286.e7 (2024).
Zhou, S. et al. Coupling of co-transcriptional splicing and 3′ end Pol II pausing during termination in Arabidopsis. Genome Biol. 24, 206 (2023).
Wu, X. et al. Genome-wide landscape of polyadenylation in Arabidopsis provides evidence for extensive alternative polyadenylation. Proc. Natl. Acad. Sci. 108, 12533–12538 (2011).
Liu, M. et al. Integration of developmental and environmental signals via a polyadenylation factor in Arabidopsis. PLOS One 9, e115779 (2014).
Téllez-Robledo, B. et al. The polyadenylation factor FIP1 is important for plant development and root responses to abiotic stresses. Plant J. 99, 1203–1219 (2019).
Simpson, G. G., Dijkwel, P. P., Quesada, V., Henderson, I. & Dean, C. FY is an RNA 3′ end-processing factor that interacts with FCA to control the Arabidopsis floral transition. Cell 113, 777–787 (2003).
Whittaker, C. & Dean, C. The FLC locus: a platform for discoveries in epigenetics and adaptation. Annu. Rev. Cell Dev. Biol. 33, 555–575 (2017).
Nishimura, K., Fukagawa, T., Takisawa, H., Kakimoto, T. & Kanemaki, M. An auxin-based degron system for the rapid depletion of proteins in nonplant cells. Nat. Methods 6, 917–922 (2009).
Faden, F. et al. Phenotypes on demand via switchable target protein degradation in multicellular organisms. Nat. Commun. 7, 12202 (2016).
Huang, L. & Rojas-Pierce, M. Rapid depletion of target proteins in plants by an inducible protein degradation system. Plant Cell 36, 3145–3161 (2024).
Wei, Y. et al. Characterizing the impact of CPSF30 gene disruption on TuMV infection in Arabidopsis thaliana. GM Crops Food 15, 1–17 (2024).
Long, Y., Jia, J., Mo, W., Jin, X. & Zhai, J. FLEP-seq: simultaneous detection of RNA polymerase II position, splicing status, polyadenylation site and poly(A) tail length at genome-wide scale by single-molecule nascent RNA sequencing. Nat. Protoc. 16, 4355–4381 (2021).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Li, J. Pre-mRNA processing factors differentially impact coordination between co-transcriptional cleavage and transcription termination. Zenodo https://doi.org/10.5281/ZENODO.15828290 (2025).
Acknowledgements
The group of J.Z. is supported by the Biological Breeding-National Science and Technology Major Project (2023ZD04073), National Natural Science Foundation of China (32325031), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06S172); Y.L. is supported by the National Natural Science Foundation of China (32300479) and Natural Science Foundation of Guangdong Province of China (2023A1515011997); Q.Q.L. is supported in part by a US National Science Foundation grant (2347540); Z.L. is supported by the Fundamental Research Fund for Central Universities (2412023YQ005). This work was supported by the Shenzhen Science and Technology Program (Grant No. ZDSYS20230626091659010).
Author information
Contributions
J.Z., X.J., and J.L. conceived and designed the experiments. X.J., J.L., W.L., Y.W., Y.S., B.L., Y.L., and X.Z. performed the experiments. X.J., J.L., and Z.L. analyzed the data. X.D., Q.F., Y.X., Q.Q.L., M.L., S.D.M., and X.C. provided materials and conceptual insights. J.Z. oversaw the study. X.J., J.L. and J.Z. wrote the manuscript, and all authors revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Eduardo Mateo-Bonmatí, Sayeh Gorjifard and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Jin, X., Li, J., Lu, W. et al. Pre-mRNA processing factors differentially impact coordination between co-transcriptional cleavage and transcription termination. Nat Commun 16, 7086 (2025). https://doi.org/10.1038/s41467-025-62555-7
DOI: https://doi.org/10.1038/s41467-025-62555-7