Abstract
Promoter-proximal pausing of RNA polymerase (Pol) II is a key regulatory step during transcription. Despite the central role of pausing in gene regulation, we do not understand the evolutionary processes that led to the emergence of Pol II pausing or its transition to a rate-limiting step actively controlled by transcription factors. Here, we analyzed transcription in species across the tree of life. Unicellular eukaryotes display an accumulation of Pol II near transcription start sites, which we propose transitioned to the longer-lived, focused pause observed in metazoans. This transition coincided with the evolution of new subunits in the negative elongation factor (NELF) and 7SK complexes. Depletion of NELF in mammals shifted the promoter-proximal buildup of Pol II from the pause site into the early gene body and compromised transcriptional activation for a set of heat-shock genes. Our work details the evolutionary history of Pol II pausing and sheds light on how new transcriptional regulatory mechanisms evolve.
Data availability
Tables in CSV format can be downloaded from GitHub (https://github.com/alexachivu/PauseEvolution_prj). Data generated in this study were deposited to the Gene Expression Omnibus under accession number GSE223913. Source data are provided with this paper.
Code availability
Custom code for analyzing sequencing data can be found on GitHub (https://github.com/alexachivu/PauseEvolution_prj/).
References
Scholes, C., DePace, A. H. & Sánchez, Á Combinatorial gene regulation through kinetic control of the transcription cycle. Cell Syst. 4, 97–108.e9 (2017).
Mikhailov, K. V. et al. The origin of Metazoa: a transition from temporal to spatial cell differentiation. Bioessays 31, 758–768 (2009).
Arenas-Mena, C. Indirect development, transdifferentiation and the macroregulatory evolution of metazoans. Philos. Trans. R. Soc. Lond. B Biol. Sci. 365, 653–669 (2010).
Arenas-Mena, C. The origins of developmental gene regulation. Evol. Dev. 19, 96–107 (2017).
Core, L. & Adelman, K. Promoter-proximal pausing of RNA polymerase II: a nexus of gene regulation. Genes Dev. 33, 960–982 (2019).
Adelman, K. & Lis, J. T. Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731 (2012).
Shen, X.-X. et al. Tempo and mode of genome evolution in the budding yeast subphylum. Cell 175, 1533–1545 (2018).
Nagy, L. G. et al. Latent homology and convergent regulatory evolution underlies the repeated emergence of yeasts. Nat. Commun. 5, 1–8 (2014).
Vill, A. C., Rice, E. J., De Vlaminck, I., Danko, C. G. & Brito, I. L. Precision run-on sequencing (PRO-seq) for microbiome transcriptomics. Nat. Microbiol. 9, 241–250 (2024).
Booth, G. T., Wang, I. X., Cheung, V. G. & Lis, J. T. Divergence of a conserved elongation factor and transcription regulation in budding and fission yeast. Genome Res. 26, 799–811 (2016).
Rougvie, A. E. & Lis, J. T. The RNA polymerase II molecule at the 5′ end of the uninduced hsp70 gene of D. melanogaster is transcriptionally engaged. Cell 54, 795–804 (1988).
Muse, G. W. et al. RNA polymerase is poised for activation across the genome. Nat. Genet. 39, 1507–1511 (2007).
Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).
Jonkers, I., Kwak, H. & Lis, J. T. Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. eLife 3, e02407 (2014).
Danko, C. G. et al. Signaling pathways differentially affect RNA polymerase II initiation, pausing, and elongation rate in cells. Mol. Cell 50, 212–222 (2013).
Zeitlinger, J. et al. RNA polymerase stalling at developmental control genes in the Drosophila melanogaster embryo. Nat. Genet. 39, 1512–1516 (2007).
Abuhashem, A. et al. RNA Pol II pausing facilitates phased pluripotency transitions by buffering transcription. Genes Dev. 36, 770–789 (2022).
Williams, L. H. et al. Pausing of RNA polymerase II regulates mammalian developmental potential through control of signaling networks. Mol. Cell 58, 311–322 (2015).
Buckley, M. S., Kwak, H., Zipfel, W. R. & Lis, J. T. Kinetics of promoter Pol II on Hsp70 reveal stable pausing and key insights into its regulation. Genes Dev. 28, 14–19 (2014).
Aoi, Y. et al. SPT6 functions in transcriptional pause/release via PAF1C recruitment. Mol. Cell 82, 3412–3423 (2022).
Marshall, N. F. & Price, D. H. Purification of P-TEFb, a transcription factor required for the transition into productive elongation. J. Biol. Chem. 270, 12335–12338 (1995).
Li, Q. et al. Analysis of the large inactive P-TEFb complex indicates that it contains one 7SK molecule, a dimer of HEXIM1 or HEXIM2, and two P-TEFb molecules containing Cdk9 phosphorylated at threonine 186. J. Biol. Chem. 280, 28819–28826 (2005).
Booth, G. T., Parua, P. K., Sansó, M., Fisher, R. P. & Lis, J. T. Cdk9 regulates a promoter-proximal checkpoint to modulate RNA polymerase II elongation rate in fission yeast. Nat. Commun. 9, 543 (2018).
Kwak, H., Fuda, N. J., Core, L. J. & Lis, J. T. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950–953 (2013).
Kruesi, W. S., Core, L. J., Waters, C. T., Lis, J. T. & Meyer, B. J. Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation. eLife 2, e00808 (2013).
Hetzel, J., Duttke, S. H., Benner, C. & Chory, J. Nascent RNA sequencing reveals distinct features in plant transcription. Proc. Natl Acad. Sci. USA 113, 12316–12321 (2016).
Lozano, R. et al. RNA polymerase mapping in plants identifies intergenic regulatory elements enriched in causal variants. G3 (Bethesda) 11, jkab273 (2021).
Goliasse, M. et al. Uncovering the multi-layer cis-regulatory landscape of rice via integrative nascent RNA analysis. Genome Biol. 26, 250 (2025).
Wang, Z. et al. Prediction of histone post-translational modification patterns based on nascent transcription data. Nat. Genet. 54, 295–305 (2022).
Zhao, Y., Liu, L., Hassett, R. & Siepel, A. Model-based characterization of the equilibrium dynamics of transcription initiation and promoter-proximal pausing in human cells. Nucleic Acids Res. 51, gkad843 (2023).
Chu, T. et al. Chromatin run-on and sequencing maps the transcriptional regulatory landscape of glioblastoma multiforme. Nat. Genet. 50, 1553–1564 (2018).
Lee, C. et al. NELF and GAGA factor are linked to promoter-proximal pausing at many genes in Drosophila. Mol. Cell. Biol. 28, 3290–3300 (2008).
Ni, Z. et al. P-TEFb is critical for the maturation of RNA polymerase II into productive elongation in vivo. Mol. Cell. Biol. 28, 1161–1170 (2008).
Wu, C.-H. et al. NELF and DSIF cause promoter proximal pausing on the hsp70 promoter in Drosophila. Genes Dev. 17, 1402–1414 (2003).
Vos, S. M. et al. Architecture and RNA binding of the human negative elongation factor. eLife 5, e14981 (2016).
Narita, T. et al. Human transcription elongation factor NELF: identification of novel subunits and reconstitution of the functionally active complex. Mol. Cell. Biol. 23, 1863–1873 (2003).
Gilchrist, D. A. et al. NELF-mediated stalling of Pol II can enhance gene expression by blocking promoter-proximal nucleosome assembly. Genes Dev. 22, 1921–1933 (2008).
Vos, S. M., Farnung, L., Urlaub, H. & Cramer, P. Structure of paused transcription complex Pol II–DSIF–NELF. Nature 560, 601–606 (2018).
Su, B. G. & Vos, S. M. Distinct negative elongation factor conformations regulate RNA polymerase II promoter-proximal pausing. Mol. Cell 84, 1243–1256 (2024).
Werner, F. A nexus for gene expression—molecular mechanisms of Spt5 and NusG in the three domains of life. J. Mol. Biol. 417, 13–27 (2012).
Ponting, C. P. Novel domains and orthologues of eukaryotic transcription elongation factors. Nucleic Acids Res. 30, 3643–3652 (2002).
Nguyen, V. T., Kiss, T., Michels, A. A. & Bensaude, O. 7SK small nuclear RNA binds to and inhibits the activity of CDK9/cyclin T complexes. Nature 414, 322–325 (2001).
Marz, M. et al. Evolution of 7SK RNA and its protein partners in metazoa. Mol. Biol. Evol. 26, 2821–2830 (2009).
S, G. B., Gohil, D. S. & Roy Choudhury, S. Genome-wide identification, evolutionary and expression analysis of the cyclin-dependent kinase gene family in peanut. BMC Plant Biol. 23, 43 (2023).
Uehara, T. N. et al. Phosphorylation of RNA polymerase II by CDKC;2 maintains the Arabidopsis circadian clock period. Plant Cell Physiol. 63, 450–462 (2022).
Cao, L. et al. Phylogenetic analysis of CDK and cyclin proteins in premetazoan lineages. BMC Evol. Biol. 14, 10 (2014).
Gressel, S., Schwalb, B. & Cramer, P. The pause–initiation limit restricts transcription activation in human cells. Nat. Commun. 10, 3603 (2019).
Chou, S.-P., Alexander, A. K., Rice, E. J., Choate, L. A. & Danko, C. G. Genetic dissection of the RNA polymerase II transcription cycle. eLife 11, e78458 (2022).
Watts, J. A. et al. cis Elements that mediate RNA polymerase II pausing regulate human gene expression. Am. J. Hum. Genet. 105, 677–688 (2019).
Tome, J. M., Tippens, N. D. & Lis, J. T. Single-molecule nascent RNA sequencing identifies regulatory domain architecture at promoters and enhancers. Nat. Genet. 50, 1533–1541 (2018).
Hendrix, D. A., Hong, J.-W., Zeitlinger, J., Rokhsar, D. S. & Levine, M. S. Promoter elements associated with RNA Pol II stalling in the Drosophila embryo. Proc. Natl Acad. Sci. USA 105, 7762–7767 (2008).
Liston, D. R. & Johnson, P. J. Analysis of a ubiquitous promoter element in a primitive eukaryote: early evolution of the initiator element. Mol. Cell. Biol. 19, 2380 (1999).
Ngoc, L. V., Cassidy, C. J., Huang, C. Y., Duttke, S. H. C. & Kadonaga, J. T. The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters. Genes Dev. 31, 6 (2017).
Dang, W. et al. Inactivation of yeast Isw2 chromatin remodeling enzyme mimics longevity effect of calorie restriction via induction of genotoxic stress response. Cell Metab. 19, 952–966 (2014).
Persson, J. et al. Regulating retrotransposon activity through the use of alternative transcription start sites. EMBO Rep. 17, 753 (2016).
Gilchrist, D. A. et al. Pausing of RNA polymerase II disrupts DNA-specified nucleosome organization to enable precise gene regulation. Cell 143, 540–551 (2010).
Kundaje, A. et al. Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements. Genome Res. 22, 1735 (2012).
Lantermann, A. B. et al. Schizosaccharomyces pombe genome-wide nucleosome mapping reveals positioning mechanisms distinct from those of Saccharomyces cerevisiae. Nat. Struct. Mol. Biol. 17, 251–257 (2010).
Bintu, L. et al. Nucleosomal elements that control the topography of the barrier to transcription. Cell 151, 738–749 (2012).
DeBerardine, M., Booth, G. T., Versluis, P. P. & Lis, J. T. The NELF pausing checkpoint mediates the functional divergence of Cdk9. Nat. Commun. 14, 2762 (2023).
Fant, C. B. et al. TFIID enables RNA polymerase II promoter-proximal pausing. Mol. Cell 78, 785–793.e8 (2020).
Nabet, B. et al. The dTAG system for immediate and target-specific protein degradation. Nat. Chem. Biol. 14, 431–441 (2018).
Henriques, T. et al. Widespread transcriptional pausing and elongation control at enhancers. Genes Dev. 32, 26–41 (2018).
Aoi, Y. et al. NELF regulates a promoter-proximal step distinct from RNA Pol II pause-release. Mol. Cell 78, 261–274.e5 (2020).
Blümli, S. et al. Acute depletion of the ARID1A subunit of SWI/SNF complexes reveals distinct pathways for activation and repression of transcription. Cell Rep. 37, 109943 (2021).
Sun, F. et al. The Pol II preinitiation complex (PIC) influences Mediator binding but not promoter–enhancer looping. Genes Dev. 35, 1175–1189 (2021).
Nojima, T. et al. Mammalian NET-Seq reveals genome-wide nascent transcription coupled to RNA processing. Cell 161, 526–540 (2015).
Mahat, D. B., Salamanca, H. H., Duarte, F. M., Danko, C. G. & Lis, J. T. Mammalian heat shock response and mechanisms underlying its genome-wide transcriptional regulation. Mol. Cell 62, 63–78 (2016).
Adelman, K. et al. Immediate mediators of the inflammatory response are poised for gene activation through RNA polymerase II stalling. Proc. Natl Acad. Sci. USA 106, 18207–18212 (2009).
Rahl, P. B. et al. c-Myc regulates transcriptional pause release. Cell 141, 432–445 (2010).
Vihervaara, A. et al. Transcriptional response to stress is pre-wired by promoter and enhancer architecture. Nat. Commun. 8, 255 (2017).
Duarte, F. M. et al. Transcription factors GAF and HSF act at distinct regulatory steps to modulate stress-induced gene activation. Genes Dev. 30, 1731–1746 (2016).
Ghosh, S. K. B., Missra, A. & Gilmour, D. S. Negative elongation factor accelerates the rate at which heat shock genes are shut off by facilitating dissociation of heat shock factor. Mol. Cell. Biol. 31, 4232–4243 (2011).
Lis, J. T., Mason, P., Peng, J., Price, D. H. & Werner, J. P-TEFb kinase recruitment and function at heat shock loci. Genes Dev. 14, 792 (2000).
Peterlin, B. M. & Price, D. H. Controlling the elongation phase of transcription with P-TEFb. Mol. Cell 23, 297–305 (2006).
Chen, F., Gao, X. & Shilatifard, A. Stably paused genes revealed through inhibition of transcription initiation by the TFIIH inhibitor triptolide. Genes Dev. 29, 39–47 (2015).
Henriques, T. et al. Stable pausing by RNA polymerase II provides an opportunity to target and integrate regulatory signals. Mol. Cell 52, 517–528 (2013).
Versluis, P. et al. Live-cell imaging of RNA Pol II and elongation factors distinguishes competing mechanisms of transcription regulation. Mol. Cell 84, 2856–2869 (2024).
Diribarne, G. & Bensaude, O. 7SK RNA, a non-coding RNA regulating P-TEFb, a general transcription factor. RNA Biol. 6, 122–128 (2009).
Fujinaga, K., Huang, F. & Peterlin, B. M. P-TEFb: the master regulator of transcription elongation. Mol. Cell 83, 393–403 (2023).
C. Quaresma, A. J., Bugai, A. & Barboric, M. Cracking the control of RNA polymerase II elongation by 7SK snRNP and P-TEFb. Nucleic Acids Res. 44, 7527–7539 (2016).
Parfrey, L. W., Lahr, D. J. G., Knoll, A. H. & Katz, L. A. Estimating the timing of early eukaryotic diversification with multigene molecular clocks. Proc. Natl Acad. Sci. USA 108, 13624–13629 (2011).
Roger, A. J. & Hug, L. A. The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation. Philos. Trans. R. Soc. Lond. B Biol. Sci. 361, 1039–1054 (2006).
Mahat, D. B. et al. Base-pair-resolution genome-wide mapping of active RNA polymerases using precision nuclear run-on (PRO-seq). Nat. Protoc. 11, 1455–1476 (2016).
Arenas-Mena, C. et al. Identification and prediction of developmental enhancers in sea urchin embryos. BMC Genomics 22, 751 (2021).
Lewis, J. J. et al. The Dryas iulia genome supports multiple gains of a W chromosome from a B chromosome in butterflies. Genome Biol. Evol. 13, evab128 (2021).
Stefanik, D. J., Friedman, L. E. & Finnerty, J. R. Collecting, rearing, spawning and inducing regeneration of the starlet sea anemone, Nematostella vectensis. Nat. Protoc. 8, 916–923 (2013).
Ran, F. et al. Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).
Smith, J. P., Dutta, A. B., Sathyan, K. M., Guertin, M. J. & Sheffield, N. C. PEPPRO: quality control and processing of nascent RNA profiling data. Genome Biol. 22, 155 (2021).
R Core Team. R: a language and environment for statistical computing. Available at https://www.r-project.org/ (2025).
Zeileis, A. & Grothendieck, G. zoo: S3 infrastructure for regular and irregular time series. J. Stat. Softw. 14, 1–27 (2005).
DeBerardine, M. BRGenomics: tools for the efficient analysis of high-resolution genomics data. Available at https://rdrr.io/bioc/BRGenomics/ (2022).
Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).
Lee, S., Cook, D. & Lawrence, M. plyranges: a grammar of genomic data transformation. Genome Biol. 20, 4 (2019).
Pagès, H. BSgenome: software infrastructure for efficient representation of full genomes and their SNPs. Available at https://bioc.r-universe.dev/BSgenome (2022).
Wickham, H. et al. Welcome to the tidyverse. J. Open Source Softw. 4, 1686 (2019).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis 2nd edn (Springer, 2016).
Core, L. J. et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet. 46, 1311–1320 (2014).
Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
Richter, D. J. et al. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Commun. J. 2, e56 (2022).
Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018).
Parks, D. H. et al. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat. Biotechnol. 38, 1079–1086 (2020).
Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
Li, W. & Godzik, A. CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Katoh, K., Misawa, K., Kuma, K.-I. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).
Katoh, K., Kuma, K.-I., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Finn, R. D., Clements, J. & Eddy, S. R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 39, W29–W37 (2011).
Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Pagès, H., Aboyoun, P., Gentleman, R. & DebRoy S. Biostrings: efficient manipulation of biological strings. Available at https://rdrr.io/bioc/Biostrings/ (2022).
Ivanek, O. B. A. seqLogo: sequence logos for DNA sequence alignments. Available at https://ivanek.github.io/seqLogo/articles/seqLogo.html (2023).
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007).
Adl, S. M. et al. Revisions to the classification, nomenclature, and diversity of eukaryotes. J. Eukaryot. Microbiol. 66, 4–119 (2019).
Ahlmann-Eltze, C. & Patil, I. ggsignif: R package for displaying significance brackets for ‘ggplot2’. Preprint at PsyArXiv https://doi.org/10.31234/osf.io/7awm6 (2021).
Neuwirth, E. RColorBrewer: ColorBrewer palettes. Available at https://cran.r-project.org/web/packages/RColorBrewer/index.html (2022).
Wilke, C. O. cowplot: streamlined plot theme and plot annotations for ‘ggplot2’. Available at https://rdrr.io/cran/cowplot/ (2020).
Acknowledgements
We thank members of the C.G.D. and J.T.L. labs for valuable discussions and suggestions throughout the life of this project and M. A. Subirats for preparing samples from C. owczarzaki, C. fragrantissima and S. arctica. We acknowledge the Fundación Pública Galega Centro Tecnolóxico de Supercomputación de Galicia for access to the FinisTerraeIII supercomputer and V. Shabardina for facilitating access. Work in this publication was primarily supported by a grant from the National Aeronautics and Space Administration exobiology program (17-EXO-17-2-0112). Additional funding was also available from the National Human Genome Research Institute (R01-HG010346 and R01-HG009309) to C.G.D., the National Institute of General Medical Sciences (R01 GM147731) to I.L.B. and C.G.D., and the National Institutes of Health (NIH; RM1-GM139738) to J.T.L. A.A. was supported by the NIH (T32GM007739 and F30HD103398). M.M.L. was supported by an Ayuda Juan de la Cierva Incorporación postdoctoral fellowship (IJC2018-036657-I) from the Spanish Ministry of Science and Innovation. Work in A.K.H.’s lab was supported by the NIH (R01HD094868, R01DK127821, R01HD086478 and P30CA008748). Work in I.R.-T.’s lab was supported by a European Research Council Consolidator Grant (ERC-2012-Co-616960). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Some of the figures in this manuscript were created using BioRender.com.
Author information
Authors and Affiliations
Contributions
J.J.L., C.G.D. and A.G.C. designed the study. E.J.R., A.A. and G.C. performed the experimental research. M.M.L., A.G.C., W.W., J.J.L. and C.G.D. interpreted the protein sequence comparisons across the tree of life. B.A.B., A.G.C., G.B., W.W., J.J.L., S.P.C., J.T.L. and C.G.D. analyzed and interpreted the sequencing data. A.G.C., J.J.L., J.T.L. and C.G.D. wrote the manuscript. A.A., A.C.V., J.J.S., A.H.W., C.A.M., I.L.B., I.R.T., A.K.H. and R.B. collected the cells or provided the samples for experimental research. All authors were involved in revisions and approved the final manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Structural & Molecular Biology thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Sara Osman, in collaboration with the Nature Structural & Molecular Biology team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 TSS reannotation.
PRO-seq (blue) and PRO-cap (ref. 24; red) metaprofiles for S. pombe (ref. 10) (a), S. cerevisiae (ref. 10) (b), D. melanogaster (ref. 24) (c) and H. sapiens (ref. 98) (d). Upper panels use published gene annotations; lower panels use reannotated genes. Note the relative depletion of PRO-cap reads upstream and downstream of the TSS and the more focused pause in the PRO-seq signal of D. melanogaster and H. sapiens in the reannotated panels.
Extended Data Fig. 2 Clustering species by their pausing index values.
Density maps of pausing indexes for each species. The plots are split into four quantiles and colored accordingly.
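As a rough illustration of the quantity plotted here: a pausing index is commonly computed as the read density in a promoter-proximal window divided by the read density in the gene body. The sketch below assumes that common definition. The function names, window lengths and toy values are hypothetical and are not taken from this study; the parameters actually used are in the authors' GitHub repository.

```python
from statistics import quantiles


def pausing_index(pause_reads, pause_len, body_reads, body_len):
    """Read density in the pause window divided by read density in the gene body.

    Assumed definition for illustration only. Returns None when the ratio
    is undefined (no gene-body reads or a non-positive window length).
    """
    if body_reads == 0 or pause_len <= 0 or body_len <= 0:
        return None
    return (pause_reads / pause_len) / (body_reads / body_len)


def quartile_labels(values):
    """Assign each value to quartile 1-4 using cut points from the data itself."""
    q1, q2, q3 = quantiles(values, n=4)
    labels = []
    for v in values:
        if v <= q1:
            labels.append(1)
        elif v <= q2:
            labels.append(2)
        elif v <= q3:
            labels.append(3)
        else:
            labels.append(4)
    return labels


# Toy example: 100 reads in a 100-bp pause window, 1,000 reads in a 10-kb gene body.
example_pi = pausing_index(100, 100, 1000, 10000)
example_quartiles = quartile_labels([1, 2, 3, 4, 5, 6, 7, 8])
```

Splitting by quartiles of the observed values, as in `quartile_labels`, is one simple way to color a density plot by quantile; other binning schemes would work equally well.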
Extended Data Fig. 3 Association of NELF and HEXIM subunits with pausing.
Box and whiskers plots show pausing index values in each species. Samples are clustered by the presence (n = 12) or absence (n = 8) of any NELF subunits (a), and presence () or absence () of HEXIM subunits (b). A two-sided Mann-Whitney test was used to compute p-values. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median. (c) Box and whiskers plot of pausing indexes in each species. Boxes are clustered by which NELF subunits are present in each species for which PRO-seq (n = 11), GRO-seq (n = 2) and ChRO-seq (n = 6) were used in this project. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median.
Extended Data Fig. 4 Pause motif search.
(a) Box and whiskers plots depict enrichment of motif scores for the pause button in each species. Samples are clustered by the presence or absence of NELF subunits (None: n = 8, NELF-B or C/D: n = 3, All, or lacking NELF-E: n = 9). A two-sided Mann-Whitney test was used to compute p-values. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median. (b) Metaprofile plot of reads mapping to maternal and paternal alleles for genes with stronger pause motif on the maternal (left) or paternal (right) alleles. (c) Scatter plot of enrichment motif score of Initiator sequence plotted against the mean pausing index per species (R = 0.105, p = 0.661). Each dot is colored by the number of NELF subunits found in each sample. We fit a linear regression to derive the R² and the p-value. A 95% confidence interval around this regression line is shown. (d) Box and whiskers plots depict enrichment of motif scores for the initiator motif in each species. Samples are clustered by the presence or absence of NELF subunits (None: n = 8, NELF-B or C/D: n = 3, All, or lacking NELF-E: n = 9). A two-sided Mann-Whitney test was used to compute p-values. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median.
Extended Data Fig. 5 nelfe-FKBP12 homozygous cell line generation.
a) Schematic of CRISPR design to add the FKBP12 tag at the nelfe locus. b) PCR validation of CRISPR insertion of the FKBP12 tag. c) Microscopy images evaluating the degradation efficiency before (top) and after a 30 min treatment with 500 nM dTAG-13 (bottom) in the edited and unedited cell lines. Hoechst was used as a nuclear control, while anti-HA antibodies measure the added tag, and anti-NELFE measures the NELF-E protein level. Arrows point out the presence of feeder cells. d) Degradation efficiency of NELF-E as measured by western blotting. β-Actin was used as a loading control, while anti-HA measures the level of NELFE-HA protein. Input denotes the relative amount of total protein loaded.
Extended Data Fig. 6 nelfb and nelfe-FKBP12 homozygous cell line validation.
a) Western blot of whole cells following NELF-E degradation with 500 nM dTAG-13 for 0 to 24 h of treatment. b) Western blot validation of chromatin fraction vs nuclear soluble fractionation. c) Western blot of NELF-B and -E proteins after degradation of either protein for 1 h. Both nuclear-soluble and chromatin-bound proteins were analyzed. d) Quantification of western blot signal in (c) for NELF-E after degradation of NELF-B, and vice-versa (n = 3 biological replicates).
Extended Data Fig. 7 Effect of NELF-B and NELF-E degradation on Pol II distribution.
a) WashU browser shots at the Nanog gene locus before and after NELF-B and -E degradation. b) Heat maps of spike-in normalized PRO-seq signal. c) Heatmaps of log2 fold changes of normalized PRO-seq signal relative to untreated controls. All heat maps are centered on active TSSs in mESCs. d) Metaprofiles of cluster 3 genes at each dTAG time point, the recovery of pause-like behavior between 30 and 60 min, and the proto-pause observed in S. pombe for reference. The proto-pause region is shaded in the two rightmost panels.
Extended Data Fig. 8 Characterization of transcription recovery clusters after NELF-B degradation.
a) Bar plots depict the percentage of transcribed enhancers and gene promoters in each cluster defined in Fig. 5e (cluster 1: n = 8,030, cluster 2: n = 16,078, cluster 3: n = 3,792). b) Enrichment profiles of the pause motif published previously (ref. 49) plotted in a 1 kb window centered on TSSs found in each of the clusters defined in Fig. 5e. c) Violin plots depict log10-transformed initiation (right) or pause release (left) rates in each cluster. A two-sided Mann-Whitney test was used to compute p-values, where n.s. defines non-significant p-values, and (***) p-values < 2.2e-16. d) Plots depict the enrichment of the TATA box, Initiator, MTE, and DPE sequence motifs in each cluster in Fig. 3e. e) Meta profiles depict the enrichment of TBP, TAF-12, TFIIA, TFIIB, H3K9ac, and Med1 per cluster. ChIP-seq data from ref. 66.
Extended Data Fig. 9 Pol II trickles into gene bodies effect after NELF-B degradation.
Heatmaps of log2 fold changes in PRO-seq signal in NELF-B-tagged cell lines at all TSSs and at clusters 1, 2 and 3 (rows, from top to bottom). Columns show, from left to right: untreated PRO-seq signal, and log2 fold changes for 30 min/0 min, 60 min/0 min and 60 min/30 min of dTAG-13 treatment.
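The three time-point comparisons named in this legend can be sketched as follows. This is a minimal illustration, assuming the signal values have already been normalized. The pseudocount, the function names and the toy numbers are hypothetical and are not taken from the authors' pipeline.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two normalized signals.

    The pseudocount is an assumption added here to avoid division by zero;
    it is not a parameter reported in the source.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def timepoint_comparisons(signal):
    """Compute the 30/0, 60/0 and 60/30 min comparisons for one region.

    `signal` maps minutes of dTAG-13 treatment (0, 30, 60) to normalized signal.
    """
    return {
        "30min/0min": log2_fold_change(signal[30], signal[0]),
        "60min/0min": log2_fold_change(signal[60], signal[0]),
        "60min/30min": log2_fold_change(signal[60], signal[30]),
    }


# Toy values for a single region (hypothetical).
example = timepoint_comparisons({0: 1.0, 30: 3.0, 60: 7.0})
```

With these toy values, the signal plus pseudocount doubles at each step, so each consecutive comparison gives a log2 fold change of 1 and the 60 min/0 min comparison gives 2.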
Extended Data Fig. 10 Correlations between heat shock PRO-seq data.
a) Principal component analysis (PCA) of non-heat shock (NHS), dTAG-13 treatment (dTAG), heat shock (HS), and a pre-treatment of dTAG-13 followed by heat shock (HS+dTAG). b) WashU browser shots at the heat-triggered genes (Hist1h3b, Hsp1h1, Hist1h3b) in the NELF-B (a) and NELF-E (b) edited cell lines. c) Heatmaps of log2 fold changes in PRO-seq signal in NELF-B (left) and NELF-E tagged (right) cell lines. The heatmap rows depict fold changes relative to NHS for the following treatments: dTAG-13 treatment alone, HS alone, and dual treatment of dTAG-13 and HS.
Supplementary information
Supplementary Information
Supplementary Figs. 1–15.
Supplementary Data 1
Two subdirectories. ‘Westerns’ contains raw western blots in .scn format (visualize with Fiji/ImageJ). ‘Phylogenies’ contains alignments and phylogenies for relevant proteins.
Source data
Source Data Fig. 1
Sequence, domain and sequence alignment data (.fasta, .phy, .hmm and .pfam formats), phylogenetic trees (.iqtree, .contree and .treefile) and an extended description of the methods used to generate the presence/absence table in Fig. 1d.
Source Data Fig. 2
Sequence data and an extended description of methods used to assess NELF presence for Fig. 2d.
Source Data Fig. 4
Raw western blot scan files (.scn format), which can be opened and quantified in ImageJ.
Source Data Extended Data Fig. 1
Metadata including calculated pausing indices, protocol and number of NELF subunits present for species included in main figures.
Source Data Extended Data Fig. 2
Quantification of NELF-B upon degradation of NELF-E and vice versa.
Source Data Extended Data Fig. 3
Conservation of key proteins.
Source Data Extended Data Fig. 4
Metadata, sequencing quality control, methods notes and accession numbers.
Source Data Extended Data Fig. 5
Raw western blot scan files (.scn format), which can be opened and quantified in ImageJ.
Source Data Extended Data Fig. 6
Raw western blot scan files (.scn format), which can be opened and quantified in ImageJ.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chivu, A.G., Basso, B.A., Abuhashem, A. et al. Evolution of promoter-proximal pausing enabled a new layer of transcription control. Nat Struct Mol Biol (2025). https://doi.org/10.1038/s41594-025-01718-y
DOI: https://doi.org/10.1038/s41594-025-01718-y


