  • Article
  • Published:

Evolution of promoter-proximal pausing enabled a new layer of transcription control

Abstract

Promoter-proximal pausing of RNA polymerase (Pol) II is a key regulatory step during transcription. Despite the central role of pausing in gene regulation, we do not understand the evolutionary processes that led to the emergence of Pol II pausing or its transition to a rate-limiting step actively controlled by transcription factors. Here, we analyzed transcription in species across the tree of life. Unicellular eukaryotes display an accumulation of Pol II near transcription start sites, which we propose transitioned to the longer-lived, focused pause observed in metazoans. This transition coincided with the evolution of new subunits in the negative elongation factor (NELF) and 7SK complexes. Depletion of NELF in mammals shifted the promoter-proximal buildup of Pol II from the pause site into the early gene body and compromised transcriptional activation for a set of heat-shock genes. Our work details the evolutionary history of Pol II pausing and sheds light on how new transcriptional regulatory mechanisms evolve.

Fig. 1: The evolution of NELF subunits is associated with pausing.
Fig. 2: Genomic features are associated with pausing.
Fig. 3: Nucleosome positioning is associated with pausing.
Fig. 4: NELF degradation destabilized RNA Pol II pausing.
Fig. 5: NELF degradation leads to different pause recovery profiles.
Fig. 6: Removing paused Pol II prevents activation of genes by HSF1 after heat-shock stimulation.

Data availability

Tables in CSV format can be downloaded from GitHub (https://github.com/alexachivu/PauseEvolution_prj). Data generated in this study were deposited to the Gene Expression Omnibus under accession number GSE223913. Source data are provided with this paper.

Code availability

Custom code for analyzing sequencing data can be found on GitHub (https://github.com/alexachivu/PauseEvolution_prj/).

References

  1. Scholes, C., DePace, A. H. & Sánchez, Á. Combinatorial gene regulation through kinetic control of the transcription cycle. Cell Syst. 4, 97–108.e9 (2017).

  2. Mikhailov, K. V. et al. The origin of Metazoa: a transition from temporal to spatial cell differentiation. Bioessays 31, 758–768 (2009).

  3. Arenas-Mena, C. Indirect development, transdifferentiation and the macroregulatory evolution of metazoans. Philos. Trans. R. Soc. Lond. B Biol. Sci. 365, 653–669 (2010).

  4. Arenas-Mena, C. The origins of developmental gene regulation. Evol. Dev. 19, 96–107 (2017).

  5. Core, L. & Adelman, K. Promoter-proximal pausing of RNA polymerase II: a nexus of gene regulation. Genes Dev. 33, 960–982 (2019).

  6. Adelman, K. & Lis, J. T. Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731 (2012).

  7. Shen, X.-X. et al. Tempo and mode of genome evolution in the budding yeast subphylum. Cell 175, 1533–1545 (2018).

  8. Nagy, L. G. et al. Latent homology and convergent regulatory evolution underlies the repeated emergence of yeasts. Nat. Commun. 5, 1–8 (2014).

  9. Vill, A. C., Rice, E. J., De Vlaminck, I., Danko, C. G. & Brito, I. L. Precision run-on sequencing (PRO-seq) for microbiome transcriptomics. Nat. Microbiol. 9, 241–250 (2024).

  10. Booth, G. T., Wang, I. X., Cheung, V. G. & Lis, J. T. Divergence of a conserved elongation factor and transcription regulation in budding and fission yeast. Genome Res. 26, 799–811 (2016).

  11. Rougvie, A. E. & Lis, J. T. The RNA polymerase II molecule at the 5′ end of the uninduced hsp70 gene of D. melanogaster is transcriptionally engaged. Cell 54, 795–804 (1988).

  12. Muse, G. W. et al. RNA polymerase is poised for activation across the genome. Nat. Genet. 39, 1507–1511 (2007).

  13. Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).

  14. Jonkers, I., Kwak, H. & Lis, J. T. Genome-wide dynamics of Pol II elongation and its interplay with promoter proximal pausing, chromatin, and exons. eLife 3, e02407 (2014).

  15. Danko, C. G. et al. Signaling pathways differentially affect RNA polymerase II initiation, pausing, and elongation rate in cells. Mol. Cell 50, 212–222 (2013).

  16. Zeitlinger, J. et al. RNA polymerase stalling at developmental control genes in the Drosophila melanogaster embryo. Nat. Genet. 39, 1512–1516 (2007).

  17. Abuhashem, A. et al. RNA Pol II pausing facilitates phased pluripotency transitions by buffering transcription. Genes Dev. 36, 770–789 (2022).

  18. Williams, L. H. et al. Pausing of RNA polymerase II regulates mammalian developmental potential through control of signaling networks. Mol. Cell 58, 311–322 (2015).

  19. Buckley, M. S., Kwak, H., Zipfel, W. R. & Lis, J. T. Kinetics of promoter Pol II on Hsp70 reveal stable pausing and key insights into its regulation. Genes Dev. 28, 14–19 (2014).

  20. Aoi, Y. et al. SPT6 functions in transcriptional pause/release via PAF1C recruitment. Mol. Cell 82, 3412–3423 (2022).

  21. Marshall, N. F. & Price, D. H. Purification of P-TEFb, a transcription factor required for the transition into productive elongation. J. Biol. Chem. 270, 12335–12338 (1995).

  22. Li, Q. et al. Analysis of the large inactive P-TEFb complex indicates that it contains one 7SK molecule, a dimer of HEXIM1 or HEXIM2, and two P-TEFb molecules containing Cdk9 phosphorylated at threonine 186. J. Biol. Chem. 280, 28819–28826 (2005).

  23. Booth, G. T., Parua, P. K., Sansó, M., Fisher, R. P. & Lis, J. T. Cdk9 regulates a promoter-proximal checkpoint to modulate RNA polymerase II elongation rate in fission yeast. Nat. Commun. 9, 543 (2018).

  24. Kwak, H., Fuda, N. J., Core, L. J. & Lis, J. T. Precise maps of RNA polymerase reveal how promoters direct initiation and pausing. Science 339, 950–953 (2013).

  25. Kruesi, W. S., Core, L. J., Waters, C. T., Lis, J. T. & Meyer, B. J. Condensin controls recruitment of RNA polymerase II to achieve nematode X-chromosome dosage compensation. eLife 2, e00808 (2013).

  26. Hetzel, J., Duttke, S. H., Benner, C. & Chory, J. Nascent RNA sequencing reveals distinct features in plant transcription. Proc. Natl Acad. Sci. USA 113, 12316–12321 (2016).

  27. Lozano, R. et al. RNA polymerase mapping in plants identifies intergenic regulatory elements enriched in causal variants. G3 (Bethesda) 11, jkab273 (2021).

  28. Goliasse, M. et al. Uncovering the multi-layer cis-regulatory landscape of rice via integrative nascent RNA analysis. Genome Biol. 26, 250 (2025).

  29. Wang, Z. et al. Prediction of histone post-translational modification patterns based on nascent transcription data. Nat. Genet. 54, 295–305 (2022).

  30. Zhao, Y., Liu, L., Hassett, R. & Siepel, A. Model-based characterization of the equilibrium dynamics of transcription initiation and promoter-proximal pausing in human cells. Nucleic Acids Res. 51, gkad843 (2023).

  31. Chu, T. et al. Chromatin run-on and sequencing maps the transcriptional regulatory landscape of glioblastoma multiforme. Nat. Genet. 50, 1553–1564 (2018).

  32. Lee, C. et al. NELF and GAGA factor are linked to promoter-proximal pausing at many genes in Drosophila. Mol. Cell. Biol. 28, 3290–3300 (2008).

  33. Ni, Z. et al. P-TEFb is critical for the maturation of RNA polymerase II into productive elongation in vivo. Mol. Cell. Biol. 28, 1161–1170 (2008).

  34. Wu, C.-H. et al. NELF and DSIF cause promoter proximal pausing on the hsp70 promoter in Drosophila. Genes Dev. 17, 1402–1414 (2003).

  35. Vos, S. M. et al. Architecture and RNA binding of the human negative elongation factor. eLife 5, e14981 (2016).

  36. Narita, T. et al. Human transcription elongation factor NELF: identification of novel subunits and reconstitution of the functionally active complex. Mol. Cell. Biol. 23, 1863–1873 (2003).

  37. Gilchrist, D. A. et al. NELF-mediated stalling of Pol II can enhance gene expression by blocking promoter-proximal nucleosome assembly. Genes Dev. 22, 1921–1933 (2008).

  38. Vos, S. M., Farnung, L., Urlaub, H. & Cramer, P. Structure of paused transcription complex Pol II–DSIF–NELF. Nature 560, 601–606 (2018).

  39. Su, B. G. & Vos, S. M. Distinct negative elongation factor conformations regulate RNA polymerase II promoter-proximal pausing. Mol. Cell 84, 1243–1256 (2024).

  40. Werner, F. A nexus for gene expression—molecular mechanisms of Spt5 and NusG in the three domains of life. J. Mol. Biol. 417, 13–27 (2012).

  41. Ponting, C. P. Novel domains and orthologues of eukaryotic transcription elongation factors. Nucleic Acids Res. 30, 3643–3652 (2002).

  42. Nguyen, V. T., Kiss, T., Michels, A. A. & Bensaude, O. 7SK small nuclear RNA binds to and inhibits the activity of CDK9/cyclin T complexes. Nature 414, 322–325 (2001).

  43. Marz, M. et al. Evolution of 7SK RNA and its protein partners in metazoa. Mol. Biol. Evol. 26, 2821–2830 (2009).

  44. S, G. B., Gohil, D. S. & Roy Choudhury, S. Genome-wide identification, evolutionary and expression analysis of the cyclin-dependent kinase gene family in peanut. BMC Plant Biol. 23, 43 (2023).

  45. Uehara, T. N. et al. Phosphorylation of RNA polymerase II by CDKC;2 maintains the Arabidopsis circadian clock period. Plant Cell Physiol. 63, 450–462 (2022).

  46. Cao, L. et al. Phylogenetic analysis of CDK and cyclin proteins in premetazoan lineages. BMC Evol. Biol. 14, 10 (2014).

  47. Gressel, S., Schwalb, B. & Cramer, P. The pause–initiation limit restricts transcription activation in human cells. Nat. Commun. 10, 3603 (2019).

  48. Chou, S.-P., Alexander, A. K., Rice, E. J., Choate, L. A. & Danko, C. G. Genetic dissection of the RNA polymerase II transcription cycle. eLife 11, e78458 (2022).

  49. Watts, J. A. et al. cis Elements that mediate RNA polymerase II pausing regulate human gene expression. Am. J. Hum. Genet. 105, 677–688 (2019).

  50. Tome, J. M., Tippens, N. D. & Lis, J. T. Single-molecule nascent RNA sequencing identifies regulatory domain architecture at promoters and enhancers. Nat. Genet. 50, 1533–1541 (2018).

  51. Hendrix, D. A., Hong, J.-W., Zeitlinger, J., Rokhsar, D. S. & Levine, M. S. Promoter elements associated with RNA Pol II stalling in the Drosophila embryo. Proc. Natl Acad. Sci. USA 105, 7762–7767 (2008).

  52. Liston, D. R. & Johnson, P. J. Analysis of a ubiquitous promoter element in a primitive eukaryote: early evolution of the initiator element. Mol. Cell. Biol. 19, 2380 (1999).

  53. Ngoc, L. V., Cassidy, C. J., Huang, C. Y., Duttke, S. H. C. & Kadonaga, J. T. The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters. Genes Dev. 31, 6 (2017).

  54. Dang, W. et al. Inactivation of yeast Isw2 chromatin remodeling enzyme mimics longevity effect of calorie restriction via induction of genotoxic stress response. Cell Metab. 19, 952–966 (2014).

  55. Persson, J. et al. Regulating retrotransposon activity through the use of alternative transcription start sites. EMBO Rep. 17, 753 (2016).

  56. Gilchrist, D. A. et al. Pausing of RNA polymerase II disrupts DNA-specified nucleosome organization to enable precise gene regulation. Cell 143, 540–551 (2010).

  57. Kundaje, A. et al. Ubiquitous heterogeneity and asymmetry of the chromatin environment at regulatory elements. Genome Res. 22, 1735 (2012).

  58. Lantermann, A. B. et al. Schizosaccharomyces pombe genome-wide nucleosome mapping reveals positioning mechanisms distinct from those of Saccharomyces cerevisiae. Nat. Struct. Mol. Biol. 17, 251–257 (2010).

  59. Bintu, L. et al. Nucleosomal elements that control the topography of the barrier to transcription. Cell 151, 738–749 (2012).

  60. DeBerardine, M., Booth, G. T., Versluis, P. P. & Lis, J. T. The NELF pausing checkpoint mediates the functional divergence of Cdk9. Nat. Commun. 14, 2762 (2023).

  61. Fant, C. B. et al. TFIID enables RNA polymerase II promoter-proximal pausing. Mol. Cell 78, 785–793.e8 (2020).

  62. Nabet, B. et al. The dTAG system for immediate and target-specific protein degradation. Nat. Chem. Biol. 14, 431–441 (2018).

  63. Henriques, T. et al. Widespread transcriptional pausing and elongation control at enhancers. Genes Dev. 32, 26–41 (2018).

  64. Aoi, Y. et al. NELF regulates a promoter-proximal step distinct from RNA Pol II pause-release. Mol. Cell 78, 261–274.e5 (2020).

  65. Blümli, S. et al. Acute depletion of the ARID1A subunit of SWI/SNF complexes reveals distinct pathways for activation and repression of transcription. Cell Rep. 37, 109943 (2021).

  66. Sun, F. et al. The Pol II preinitiation complex (PIC) influences Mediator binding but not promoter–enhancer looping. Genes Dev. 35, 1175–1189 (2021).

  67. Nojima, T. et al. Mammalian NET-Seq reveals genome-wide nascent transcription coupled to RNA processing. Cell 161, 526–540 (2015).

  68. Mahat, D. B., Salamanca, H. H., Duarte, F. M., Danko, C. G. & Lis, J. T. Mammalian heat shock response and mechanisms underlying its genome-wide transcriptional regulation. Mol. Cell 62, 63–78 (2016).

  69. Adelman, K. et al. Immediate mediators of the inflammatory response are poised for gene activation through RNA polymerase II stalling. Proc. Natl Acad. Sci. USA 106, 18207–18212 (2009).

  70. Rahl, P. B. et al. c-Myc regulates transcriptional pause release. Cell 141, 432–445 (2010).

  71. Vihervaara, A. et al. Transcriptional response to stress is pre-wired by promoter and enhancer architecture. Nat. Commun. 8, 255 (2017).

  72. Duarte, F. M. et al. Transcription factors GAF and HSF act at distinct regulatory steps to modulate stress-induced gene activation. Genes Dev. 30, 1731–1746 (2016).

  73. Ghosh, S. K. B., Missra, A. & Gilmour, D. S. Negative elongation factor accelerates the rate at which heat shock genes are shut off by facilitating dissociation of heat shock factor. Mol. Cell. Biol. 31, 4232–4243 (2011).

  74. Lis, J. T., Mason, P., Peng, J., Price, D. H. & Werner, J. P-TEFb kinase recruitment and function at heat shock loci. Genes Dev. 14, 792 (2000).

  75. Peterlin, B. M. & Price, D. H. Controlling the elongation phase of transcription with P-TEFb. Mol. Cell 23, 297–305 (2006).

  76. Chen, F., Gao, X. & Shilatifard, A. Stably paused genes revealed through inhibition of transcription initiation by the TFIIH inhibitor triptolide. Genes Dev. 29, 39–47 (2015).

  77. Henriques, T. et al. Stable pausing by RNA polymerase II provides an opportunity to target and integrate regulatory signals. Mol. Cell 52, 517–528 (2013).

  78. Versluis, P. et al. Live-cell imaging of RNA Pol II and elongation factors distinguishes competing mechanisms of transcription regulation. Mol. Cell 84, 2856–2869 (2024).

  79. Diribarne, G. & Bensaude, O. 7SK RNA, a non-coding RNA regulating P-TEFb, a general transcription factor. RNA Biol. 6, 122–128 (2009).

  80. Fujinaga, K., Huang, F. & Peterlin, B. M. P-TEFb: the master regulator of transcription elongation. Mol. Cell 83, 393–403 (2023).

  81. Quaresma, A. J. C., Bugai, A. & Barboric, M. Cracking the control of RNA polymerase II elongation by 7SK snRNP and P-TEFb. Nucleic Acids Res. 44, 7527–7539 (2016).

  82. Parfrey, L. W., Lahr, D. J. G., Knoll, A. H. & Katz, L. A. Estimating the timing of early eukaryotic diversification with multigene molecular clocks. Proc. Natl Acad. Sci. USA 108, 13624–13629 (2011).

  83. Roger, A. J. & Hug, L. A. The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation. Philos. Trans. R. Soc. Lond. B Biol. Sci. 361, 1039–1054 (2006).

  84. Mahat, D. B. et al. Base-pair-resolution genome-wide mapping of active RNA polymerases using precision nuclear run-on (PRO-seq). Nat. Protoc. 11, 1455–1476 (2016).

  85. Arenas-Mena, C. et al. Identification and prediction of developmental enhancers in sea urchin embryos. BMC Genomics 22, 751 (2021).

  86. Lewis, J. J. et al. The Dryas iulia genome supports multiple gains of a W chromosome from a B chromosome in butterflies. Genome Biol. Evol. 13, evab128 (2021).

  87. Stefanik, D. J., Friedman, L. E. & Finnerty, J. R. Collecting, rearing, spawning and inducing regeneration of the starlet sea anemone, Nematostella vectensis. Nat. Protoc. 8, 916–923 (2013).

  88. Ran, F. et al. Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).

  89. Smith, J. P., Dutta, A. B., Sathyan, K. M., Guertin, M. J. & Sheffield, N. C. PEPPRO: quality control and processing of nascent RNA profiling data. Genome Biol. 22, 155 (2021).

  90. R Core Team. R: a language and environment for statistical computing. Available at https://www.r-project.org/ (2025).

  91. Zeileis, A. & Grothendieck, G. zoo: S3 infrastructure for regular and irregular time series. J. Stat. Softw. 14, 1–27 (2005).

  92. DeBerardine, M. BRGenomics: tools for the efficient analysis of high-resolution genomics data. Available at https://rdrr.io/bioc/BRGenomics/ (2022).

  93. Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).

  94. Lee, S., Cook, D. & Lawrence, M. plyranges: a grammar of genomic data transformation. Genome Biol. 20, 4 (2019).

  95. Pagès, H. BSgenome: software infrastructure for efficient representation of full genomes and their SNPs. Available at https://bioc.r-universe.dev/BSgenome (2022).

  96. Wickham, H. et al. Welcome to the tidyverse. J. Open Source Softw. 4, 1686 (2019).

  97. Wickham, H. ggplot2: Elegant Graphics for Data Analysis 2nd edn (Springer, 2016).

  98. Core, L. J. et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet. 46, 1311–1320 (2014).

  99. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).

  100. Richter, D. J. et al. EukProt: a database of genome-scale predicted proteins across the diversity of eukaryotes. Peer Commun. J. 2, e56 (2022).

  101. Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018).

  102. Parks, D. H. et al. A complete domain-to-species taxonomy for Bacteria and Archaea. Nat. Biotechnol. 38, 1079–1086 (2020).

  103. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).

  104. Li, W. & Godzik, A. CD-HIT: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).

  105. Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).

  106. Katoh, K., Misawa, K., Kuma, K.-I. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).

  107. Katoh, K., Kuma, K.-I., Toh, H. & Miyata, T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005).

  108. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  109. Finn, R. D., Clements, J. & Eddy, S. R. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 39, W29–W37 (2011).

  110. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  111. Pagès, H., Aboyoun, P., Gentleman, R. & DebRoy S. Biostrings: efficient manipulation of biological strings. Available at https://rdrr.io/bioc/Biostrings/ (2022).

  112. Ivanek, O. B. A. seqLogo: sequence logos for DNA sequence alignments. Available at https://ivanek.github.io/seqLogo/articles/seqLogo.html (2023).

  113. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007).

  114. Adl, S. M. et al. Revisions to the classification, nomenclature, and diversity of eukaryotes. J. Eukaryot. Microbiol. 66, 4–119 (2019).

  115. Ahlmann-Eltze, C. & Patil, I. ggsignif: R package for displaying significance brackets for ‘ggplot2’. Preprint at PsyArXiv https://doi.org/10.31234/osf.io/7awm6 (2021).

  116. Neuwirth, E. RColorBrewer: ColorBrewer palettes. Available at https://cran.r-project.org/web/packages/RColorBrewer/index.html (2022).

  117. Wilke, C. O. cowplot: streamlined plot theme and plot annotations for ‘ggplot2’. Available at https://rdrr.io/cran/cowplot/ (2020).

Acknowledgements

We thank members of the C.G.D. and J.T.L. labs for valuable discussions and suggestions throughout the life of this project and M. A. Subirats for preparing samples from C. owczarzaki, C. fragrantissima and S. arctica. We acknowledge the Fundación Pública Galega Centro Tecnolóxico de Supercomputación de Galicia for access to the FinisTerraeIII supercomputer and V. Shabardina for facilitating access. Work in this publication was primarily supported by a grant from the National Aeronautics and Space Administration exobiology program (17-EXO-17-2-0112). Additional funding was also available from the National Human Genome Research Institute (R01-HG010346 and R01-HG009309) to C.G.D., the National Institute of General Medical Sciences (R01 GM147731) to I.L.B. and C.G.D., and the National Institutes of Health (NIH; RM1-GM139738) to J.T.L. A.A. was supported by the NIH (T32GM007739 and F30HD103398). M.M.L. was supported by an Ayuda Juan de la Cierva Incorporación postdoctoral fellowship (IJC2018-036657-I) from the Spanish Ministry of Science and Innovation. Work in A.K.H.’s lab was supported by the NIH (R01HD094868, R01DK127821, R01HD086478 and P30CA008748). Work in I.R.-T.’s lab was supported by a European Research Council Consolidator Grant (ERC-2012-Co-616960). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Some of the figures in this manuscript were created using BioRender.com.

Author information

Contributions

J.J.L., C.G.D. and A.G.C. designed the study. E.J.R., A.A. and G.C. performed the experimental research. M.M.L., A.G.C., W.W., J.J.L. and C.G.D. interpreted the protein sequence comparisons across the tree of life. B.A.B., A.G.C., G.B., W.W., J.J.L., S.P.C., J.T.L. and C.G.D. analyzed and interpreted the sequencing data. A.G.C., J.J.L., J.T.L. and C.G.D. wrote the manuscript. A.A., A.C.V., J.J.S., A.H.W., C.A.M., I.L.B., I.R.T., A.K.H. and R.B. collected the cells or provided the samples for experimental research. All authors were involved in revisions and approved the final manuscript.

Corresponding authors

Correspondence to John T. Lis, James J. Lewis or Charles G. Danko.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Structural & Molecular Biology thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Sara Osman, in collaboration with the Nature Structural & Molecular Biology team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 TSS reannotation.

PRO-seq (blue) and PRO-cap (red; ref. 24) metaprofiles for S. pombe (a; ref. 10), S. cerevisiae (b; ref. 10), D. melanogaster (c; ref. 24) and H. sapiens (d; ref. 98). Upper panels use published gene annotations; lower panels use reannotated genes. Note the relative depletion of PRO-cap reads upstream and downstream of the TSS and the more focused pause in the PRO-seq signal of D. melanogaster and H. sapiens in the reannotated panels.

Extended Data Fig. 2 Clustering species by their pausing index values.

Density maps of pausing indexes for each species. The plots are split into four quantiles and colored accordingly.
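The legend does not define the pausing index or how the quantile boundaries were drawn. As a minimal sketch, assuming the common convention that a pausing index is the ratio of read density in a promoter-proximal window to read density in the gene body, the calculation and a four-way quantile split might look like the following. The function names, the pseudocount and the example read counts are hypothetical and are not taken from the authors' code.

```python
from statistics import quantiles


def pausing_index(pause_reads, pause_len, body_reads, body_len, pseudocount=1.0):
    """Ratio of promoter-proximal read density to gene-body read density.

    The pseudocount is an assumption added here to avoid division by zero
    for genes with no gene-body reads.
    """
    pause_density = (pause_reads + pseudocount) / pause_len
    body_density = (body_reads + pseudocount) / body_len
    return pause_density / body_density


def quartile_bins(values):
    """Label each value with its quartile (1 to 4) within the given sample."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    bins = []
    for v in values:
        if v <= q1:
            bins.append(1)
        elif v <= q2:
            bins.append(2)
        elif v <= q3:
            bins.append(3)
        else:
            bins.append(4)
    return bins


# Hypothetical gene: 99 reads in a 100-bp pause window, 999 reads in a 10-kb body.
example_pi = pausing_index(99, 100, 999, 10_000)
example_bins = quartile_bins([8, 1, 4, 5, 2, 7, 3, 6])
```

Pausing indexes are often log-transformed before plotting densities; whether that was done here is not stated in the legend.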

Extended Data Fig. 3 Association of NELF and HEXIM subunits with pausing.

Box-and-whisker plots show pausing index values in each species. Samples are clustered by the presence (n = 12) or absence (n = 8) of any NELF subunits (a), and by the presence () or absence () of HEXIM subunits (b). A two-sided Mann-Whitney test was used to compute p-values. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median. (c) Box-and-whisker plot of pausing indexes in each species. Boxes are clustered by which NELF subunits are present in each species for which PRO-seq (n = 11), GRO-seq (n = 2) and ChRO-seq (n = 6) were used in this project. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median.
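For readers unfamiliar with the two-sided Mann-Whitney test named in this legend, the following is a minimal standard-library sketch. It assumes the normal approximation with tie and continuity corrections; the legend does not say which implementation or which variant (exact or approximate) the authors used, and the function name is hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # tied values share the average 1-based rank
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a shift
    # Continuity-corrected z score, floored at zero.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_two_sided = min(math.erfc(z / math.sqrt(2)), 1.0)
    return u_x, p_two_sided
```

Calling `mann_whitney_u(with_nelf, without_nelf)` on two lists of per-species pausing indexes would return the U statistic and an approximate p-value. With group sizes like the 12 and 8 reported above, an exact test from a statistics package may be preferable to this approximation.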

Extended Data Fig. 4 Pause motif search.

(a) Box-and-whisker plots depict enrichment of motif scores for the pause button in each species. Samples are clustered by the presence or absence of NELF subunits (None: n = 8, NELF-B or C/D: n = 3, All, or lacking NELF-E: n = 9). A two-sided Mann-Whitney test was used to compute p-values. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median. (b) Metaprofile plot of reads mapping to maternal and paternal alleles for genes with a stronger pause motif on the maternal (left) or paternal (right) allele. (c) Scatter plot of the motif enrichment score of the Initiator sequence plotted against the mean pausing index per species (R = 0.105, p = 0.661). Each dot is colored by the number of NELF subunits found in each sample. We fit a linear regression to derive the R² and the p-value. A 95% confidence interval around this regression line is shown. (d) Box-and-whisker plots depict enrichment of motif scores for the initiator motif in each species. Samples are clustered by the presence or absence of NELF subunits (None: n = 8, NELF-B or C/D: n = 3, All, or lacking NELF-E: n = 9). A two-sided Mann-Whitney test was used to compute p-values. Boxplot whiskers extend from the first/third quartile to the minima/maxima, ignoring outliers, while horizontal lines denote the median.
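Panel (c) reports a correlation coefficient alongside a regression-derived R². For a simple linear regression with one predictor, R² is the square of the Pearson correlation R, so R = 0.105 corresponds to R² of roughly 0.011. The sketch below shows that relationship using only the Python standard library. It is an illustration under that assumption, not the authors' analysis code; the function name is hypothetical, and the p-value is omitted because it needs the Student's t distribution, which the standard library does not provide.

```python
import math


def simple_linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, Pearson R, R squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r


# Squaring the R value quoted in the legend gives the implied R squared.
implied_r_squared = 0.105 ** 2
```

Here `x` would hold the per-species Initiator motif enrichment scores and `y` the mean pausing indexes.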

Source data

Extended Data Fig. 5 nelfe-FKBP12 homozygous cell line generation.

a) Schematic of CRISPR design to add the FKBP12 tag at the nelfe locus. b) PCR validation of CRISPR insertion of the FKBP12 tag. c) Microscopy images evaluating the degradation efficiency before (top) and after a 30 min treatment with 500 nM dTAG-13 (bottom) in the edited and unedited cell lines. Hoechst was used as a nuclear control, while anti-HA antibodies measure the added tag, and anti-NELFE measures the NELF-E protein level. Arrows point out the presence of feeder cells. d) Degradation efficiency of NELF-E as measured by western blotting. β-Actin was used as a loading control, while anti-HA measures the level of NELFE-HA protein. Input denotes the relative amount of total protein loaded.

Source data

Extended Data Fig. 6 nelfb and nelfe-FKBP12 homozygous cell line validation.

a) Western blot of whole cells following NELF-E degradation with 500 nM dTAG-13 for 0 to 24 h of treatment. b) Western blot validation of the fractionation into chromatin-bound and nuclear-soluble fractions. c) Western blot of NELF-B and -E proteins after degradation of either protein for 1 h. Both nuclear-soluble and chromatin-bound proteins were analyzed. d) Quantification of western blot signal in (c) for NELF-E after degradation of NELF-B, and vice versa (n = 3 biological replicates).

Source data

Extended Data Fig. 7 Effect of NELF-B and NELF-E degradation on Pol II distribution.

a) WashU browser shots at the Nanog gene locus before and after NELF-B and -E degradation. b) Heat maps of spike-in normalized PRO-seq signal. c) Heatmaps of log2 fold changes of normalized PRO-seq signal relative to untreated controls. All heat maps are centered on active TSSs in mESCs. d) Metaprofiles of cluster 3 genes at each dTAG time point, the recovery of pause-like behavior between 30 and 60 min, and the proto-pause observed in S. pombe for reference. The proto-pause region is shaded in the two rightmost panels.

Extended Data Fig. 8 Characterization of transcription recovery clusters after NELF-B degradation.

a) Bar plots depict the percentage of transcribed enhancers and gene promoters in each cluster defined in Fig. 5e (cluster 1: n = 8,030, cluster 2: n = 16,078, cluster 3: n = 3,792). b) Enrichment profiles of the pause motif published previously (ref. 49) plotted in a 1 kb window centered on TSSs found in each of the clusters defined in Fig. 5e. c) Violin plots depict log10 transformed initiation (right) or pause release (left) rates in each cluster. A two-sided Mann-Whitney test was used to compute p-values, where n.s. defines non-significant p-values, and (***) p-values < 2.2e-16. d) Plots depict the enrichment of the TATA box, Initiator, MTE, and DPE sequence motifs in each cluster in Fig. 3e. e) Meta profiles depict the enrichment of TBP, TAF-12, TFIIA, TFIIB, H3K9ac, and Med1 per cluster. ChIP-seq data from ref. 66.

Extended Data Fig. 9 Pol II trickles into gene bodies after NELF-B degradation.

Heatmaps of PRO-seq signal and its log2 fold changes in NELF-B tagged cell lines at all TSSs and at clusters 1, 2 and 3 (rows, in this order from top to bottom). The columns show, from left to right: untreated PRO-seq signal, followed by log2 fold changes for 30 min/0 min, 60 min/0 min, and 60 min/30 min of dTAG-13 treatment.

Extended Data Fig. 10 Correlations between heat shock PRO-seq data.

a) Principal component analysis (PCA) of non-heat shock (NHS), dTAG-13 treatment (dTAG), heat shock (HS), and a pre-treatment of dTAG-13 followed by heat shock (HS+dTAG). b) WashU browser shots at the heat-triggered genes (Hist1h3b, Hsp1h1, Hist1h3b) in the NELF-B (a) and NELF-E (b) edited cell lines. c) Heatmaps of log2 fold changes in PRO-seq signal in NELF-B (left) and NELF-E tagged (right) cell lines. The heatmap rows depict fold changes relative to NHS for the following treatments: dTAG-13 treatment alone, HS alone, and dual treatment of dTAG-13 and HS.

Supplementary information

Supplementary Information

Supplementary Figs. 1–15.

Reporting Summary

Supplementary Data 1

Two subdirectories. ‘Westerns’ contains raw western blots in .scn format (visualize with Fiji/ImageJ). ‘Phylogenies’ contains alignments and phylogenies for relevant proteins.

Source data

Source Data Fig. 1

Sequence, domain and sequence alignment data (.fasta, .phy, .hmm and .pfam formats), phylogenetic trees (.iqtree, .contree and .treefile) and an extended description of the methods used to generate the presence/absence table in Fig. 1d.

Source Data Fig. 2

Sequence data and an extended description of methods used to assess NELF presence for Fig. 2d.

Source Data Fig. 4

Raw western blot scan files (.scn format), which can be opened and quantified in ImageJ.

Source Data Extended Data Fig. 1

Metadata including calculated pausing indices, protocol and number of NELF subunits present for species included in main figures.

Source Data Extended Data Fig. 2

Quantification of NELF-B upon degradation of NELF-E and vice versa.

Source Data Extended Data Fig. 3

Conservation of key proteins.

Source Data Extended Data Fig. 4

Metadata, sequencing quality control, methods notes and accession numbers.

Source Data Extended Data Fig. 5

Raw western blot scan files (.scn format), which can be opened and quantified in ImageJ.

Source Data Extended Data Fig. 6

Raw western blot scan files (.scn format), which can be opened and quantified in ImageJ.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chivu, A.G., Basso, B.A., Abuhashem, A. et al. Evolution of promoter-proximal pausing enabled a new layer of transcription control. Nat Struct Mol Biol (2025). https://doi.org/10.1038/s41594-025-01718-y

  • DOI: https://doi.org/10.1038/s41594-025-01718-y
