Introduction

Replication stress (RS) is a major hallmark of cancer cells and a source of genomic instability1. During tumor progression, mutations in oncogenes and tumor suppressor genes can induce RS by imposing a state of high and deregulated DNA replication, which increases the chances for fork collapse and generation of DNA breaks2. In this scenario, fork collapse has been linked to multiple causes, including the depletion of nucleotides, mis-incorporation of ribonucleotides into nascent DNA, inefficient activation of protective fork processing pathways, and conflicts arising from the encounter of replication and transcription complexes3,4,5.

RS is considered to have a dual role in cancer evolution: on one hand, it can act as a driving factor by providing a mutation-prone background that favors further selection of driver mutations; on the other, it can trigger tumor suppressive responses such as apoptosis or cellular senescence6. In line with the latter, the exacerbation of RS in cancer cells can offer therapeutic opportunities, either in combinatorial treatments, to enhance the effectiveness of current chemotherapeutic drugs, or in targeted therapies exploiting synthetic-lethal dependencies in genetically defined tumors7,8,9. Thus, identifying factors and processes that regulate RS in cancer cells is a subject of active studies and has high translational potential.

The c-MYC proto-oncogene is frequently overexpressed in cancer cells, due to genomic alterations (e.g., translocation, amplification) or oncogenic mutations in upstream regulatory pathways10,11. Its product, MYC, is a transcription factor belonging to the bHLH-LZ family. It dimerizes with another bHLH-LZ protein, MAX, to bind DNA in a sequence-specific manner and to control the transcription of genes linked to cell proliferation and metabolism11. Oncogenic levels of MYC lead to increased binding to proximal and distal regulatory elements within the genome, a phenomenon dubbed “enhancer-promoter invasion”, that pairs with broadened transcriptional regulation12,13,14,15. The essentiality of MYC in cancer cells and its temporal dispensability in normal tissues16 make it an attractive and potentially universal therapeutic target17, whether through direct inhibition of MYC activity18 or through exploitation of MYC-induced dependencies18.

While still under investigation, evidence shows that MYC can trigger RS by promoting the processivity of DNA synthesis, increasing origin firing, anticipating S-phase entry and possibly by inducing transcription-replication conflicts (TRCs)19,20,21. Several lines of evidence indicate that MYC-overexpressing cells are addicted to genetic dependencies that curb RS to avoid rampant genomic instability and cell death. These include the ATR/CHK1 kinases22,23,24,25, fork remodelers and regulators26,27,28, and factors preventing transcriptional interference25,29,30,31,32,33. Altogether, these pathways may cooperate to prevent the collapse and/or excessive processing of stalled replication forks.

To identify factors regulating genomic and genetic stability in MYC overexpressing cells we conducted a loss of function genetic screen. Among other genes, we identified CDK12 as being selectively required to prevent RS and cytotoxic DDR. CDK12 is a member of the cyclin dependent kinase family that associates with and is activated by Cyclin K (CCNK)34. Like other transcriptional CDKs, CDK12 - and its paralogue CDK13 - can phosphorylate and activate RNA polymerase II (RNAPII)35. In addition, CDK12 regulates the expression and processing of genes linked to cell cycle progression and DNA damage signaling and repair36,37,38. In line with this, inhibition of CDK12 sensitizes cancer cells to DNA-damaging agents34. CDK12 is also recurrently mutated in several cancers, and an actionable therapeutic target34,39.

Here, we report that CDK12 is recruited onto transcribed and damaged genes to repress transcription. This depends on PARP, engaged by DDR signaling, and the elongation factor SPT5. Inhibition or silencing of CDK12 unleashes transcription at damaged genes and exacerbates TRCs leading to DSBs between early replicating regions and the promoters of genes transcribed co-directionally to the replication fork. Our work unveils a new role for CDK12 in controlling transcription and genomic stability and reveals how management of TRCs is a major liability in MYC-driven tumors.

Results

Identification of genes controlling genome stability and viability in MYC-overexpressing cells

To identify modulators of genomic stability and cell viability in MYC-overexpressing cells, we devised a high-content siRNA screen in Rosa26-MycER mouse embryo fibroblasts, where conditional activation of MycER mimics the oncogenic properties of c-MYC27. These cells were generated from homozygous knock-in embryos carrying a gene encoding a MYC-estrogen receptor chimera (MycER) that can be activated by 4-hydroxytamoxifen (OHT)40. We screened this cell line with a custom siRNA library (DDR-library) targeting 1196 genes previously implicated in genome integrity41, and a commercial library (druggable-library) targeting 1400 druggable genes. For the screen, cells were retro-transfected in 384-well plates, with each well containing a siRNA mix targeting one gene. Forty-eight hours post-transfection, cells were analyzed by immunofluorescence for a parallel quantification of DDR signaling (based on the integration of single-cell γH2AX intensities) and cell growth (by cell counting), both in mock (no OHT) and MycER-activated cells (plus OHT).

For each gene, we calculated the average normalized differential values (Z-score) for both DDR and cell growth (Fig. 1a, Supplementary Fig. 1a, b, d, e and Supplementary dataset 1) and identified as hits those genes that, when silenced, would preferentially affect DDR (DDR hits) and/or cell growth (viability hits). We scored a total of 629 genes: 386 DDR-hits and 335 viability-hits; of these, 92 were both DDR and viability hits (Fig. 1b and Supplementary Fig. 1c, f). A subset of the primary hits was cross-validated in a secondary screen (Supplementary Fig. 1b, c, f and Supplementary dataset 1). Gene ontology and protein-protein interaction network analyses with Metascape42 revealed a prevalence of genes involved in RNA processing, translation, protein degradation, cell cycle control and signaling (Fig. 1c, Supplementary Fig. 2 and Supplementary dataset 2). This confirmed previously identified MYC-dependencies, such as its reliance on transcriptional regulators43, the splicing machinery44,45, regulators of mitosis46,47, nucleotide biosynthesis48,49, and proteasome activity50.
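The hit-calling logic described above can be illustrated with a minimal sketch. The exact normalization used in the screen is not given in this section, so the code assumes a simple scheme: the per-gene difference between MycER-activated (+OHT) and mock wells is standardized across the library. The function names and the cutoffs of 2 and -2 are hypothetical and only stand in for the thresholds drawn as dashed lines in Fig. 1a. The last lines check that the reported hit counts are mutually consistent.

```python
from statistics import mean, stdev


def differential_z_scores(mock, oht):
    """Standardize the per-gene (+OHT minus mock) difference across the library.

    `mock` and `oht` map gene -> readout (e.g. integrated gammaH2AX signal or
    cell count). This normalization is an assumption, not the paper's method.
    """
    genes = sorted(mock)
    diffs = [oht[g] - mock[g] for g in genes]
    mu, sigma = mean(diffs), stdev(diffs)
    return {g: (d - mu) / sigma for g, d in zip(genes, diffs)}


def call_hits(ddr_z, viability_z, ddr_cutoff=2.0, viability_cutoff=-2.0):
    """Return (DDR hits, viability hits) using hypothetical Z-score cutoffs."""
    ddr_hits = {g for g, z in ddr_z.items() if z >= ddr_cutoff}
    viability_hits = {g for g, z in viability_z.items() if z <= viability_cutoff}
    return ddr_hits, viability_hits


# Consistency check on the counts reported in the text (inclusion-exclusion):
# hits in either class = DDR hits + viability hits - hits in both classes.
N_DDR_HITS, N_VIABILITY_HITS, N_SHARED = 386, 335, 92
total_hits = N_DDR_HITS + N_VIABILITY_HITS - N_SHARED  # 629, as reported
```

A positive DDR Z-score then marks genes whose silencing raises DDR more in MycER-activated than in mock cells, and a negative viability Z-score marks genes whose silencing preferentially reduces cell number upon MycER activation.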

Fig. 1: CDK12 is synthetic lethal with Myc activation.

a Dot plot of the normalized differential DDR score (y axis, Z-score for DDR) and the normalized differential viability (x axis, Z-score for viability) for each siRNA of the library targeting druggable genes (primary screen). A positive Z-score for DDR indicates higher DDR upon MycER activation compared to mock activated U2OS-MycER cells, while a negative viability Z-score indicates a reduction of viability compared to mock activated cells. Dashed lines indicate thresholds used to call the DDR and viability hits. b Venn diagram of all the hits identified in the two primary screens as DDR- and viability-hits. c Metascape enrichment network of the hits of the two primary screens (DDR and druggable genes). d Cell death in U2OS-MycER cells upon MycER activation (+OHT, 400 nM) and CDK12 knockdown by the indicated shRNAs. e, f WB analysis of DDR markers in U2OS-MycER cells upon MycER activation (+OHT, 400 nM) and (e) CDK12 knockdown or (f) CDK12 inhibition by THZ531. g WB analysis of DDR markers following treatment of multiple myeloma (KMS-11, OPM1) and Burkitt’s lymphoma (Raji, BL28) cell lines with THZ531 for 48 hours. Vinculin (VCL) was used as loading control. Lanes 1, 2, and 3 show independent replicates.

Among the hits confirmed in the secondary screen (Supplementary Fig. 1b), we identified CDK12, a kinase implicated in transcriptional control and genome stability, which has recently drawn attention as a potential therapeutic target in cancer34,39. Most noteworthy, CDK12 also scored in a previous list for MYC-synthetic lethal candidates, but without further characterization47. In addition, re-analysis of a recently published screen51 revealed that CDK12, its paralogue CDK13, and its regulatory partner CCNK are synthetic lethal with MYC amplification in medulloblastoma (Supplementary Fig. 3).

Genetic or chemical targeting of CDK12 impairs cell viability and triggers a DNA damage response in MYC-overexpressing cells

For further validation and mechanistic analysis, we engineered U2OS cells expressing the MycER chimera27 (U2OS-MycER cells) with doxycycline-inducible shRNAs targeting CDK12 (shCDK12 #1 and #2) or a non-targeting shRNA (shNT) (Supplementary Fig. 4a). CDK12 knock-down (CDK12-KD) impaired cell growth, with a stronger inhibition when MycER was activated (Supplementary Fig. 4b, c). On the other hand, cell viability was reduced only when cells experienced both CDK12-silencing and MycER activation (Fig. 1d), indicating a strong synthetic-lethal phenotype. This was paralleled by enhanced DNA damage response (DDR), engaging both ATM and ATR signaling, as indicated by WB analysis (Fig. 1e). Synthetic lethality and synthetic DDR activation were also confirmed by siRNA-mediated silencing (Supplementary Fig. 5) and upon CDK12 inhibition by THZ531 or dinaciclib (Fig. 1f and Supplementary Fig. 6). Similar to CDK12, silencing of CDK13 or CCNK led to synthetic lethality and DDR upon MycER activation (Supplementary Fig. 7). Next, we evaluated whether CDK12 inhibition would induce DDR in Burkitt’s lymphoma and multiple myeloma cell lines in which expression of MYC is deregulated by chromosomal translocations. Most of the cell lines tested were sensitive to THZ531, with IC50 values in the nanomolar range (Supplementary Fig. 8a). Similar to what was observed in U2OS-MycER cells, THZ531 induced the activation of a DDR characterized by a recurrent increase of phospho-KAP1 and γH2AX signals (Fig. 1g). The only exception was the RAMOS cell line, which had the highest IC50 for THZ531 and did not display an increase in phospho-KAP1 or γH2AX following CDK12 inhibition, but instead showed increased phosphorylation of CHK1 (Supplementary Fig. 9). Other DDR markers had inconsistent fluctuations when these cell lines were treated with THZ531 (Supplementary Fig. 9), possibly suggesting that the engagement of select DDR pathways may depend on the genetic background or other context-specific properties of the cancer cells. In all the cell lines tested CDK12 inhibition also led to an altered cell cycle distribution characterized by a decrease in BrdU-labeled cells, indicating reduced DNA synthesis and altered cell cycle progression (Supplementary Fig. 8b).

Genome-wide transcriptional alterations upon CDK12 silencing

CDK12 has been implicated in transcriptional regulation as it can phosphorylate RNAPII on serine 2 of the carboxyl-terminal repeat domain (CTD), thus stimulating elongation35. It was also reported to prevent early transcriptional termination at cryptic poly-A sites and to regulate alternative splicing37,38,52. Thus, we assessed how CDK12 silencing would affect the transcriptome in our cells.

RNA-seq analysis showed that CDK12-KD led to both up- and downregulation of distinct sets of genes (DEG-up and -down, Supplementary Fig. 10a). These alterations in mRNA expression were coherent with phospho-Ser-2 RNAPII ChIP-seq data, with consistent increases and decreases in the elongating form of RNAPII on DEG-up and DEG-down genes, respectively (Supplementary Fig. 10b), while the global level of Ser2-Pi RNAPII was not affected by CDK12 silencing (Supplementary Fig. 10c), confirming previous observations35,37,38. Most noticeably, DEG-down genes tended to be more expressed in unchallenged cells and had a shorter size, while the opposite was true for DEG-up genes (Supplementary Fig. 10d, e and Supplementary dataset 3). We also evaluated whether CDK12 silencing would broadly alter the expression of DDR genes. The top 20 GO enrichment terms of DEG did not include terms associated with genome stability or DNA repair (Supplementary Fig. 10f, g and Supplementary dataset 3). The analysis of a custom collection of DDR-associated genes did not reveal global repression of these genes (Supplementary Fig. 11a-c and Supplementary dataset 3). Exon level analysis of the mRNA processing of long-DDR genes failed to show any consistent or overt alterations in splicing or premature termination (Supplementary Fig. 11d and 12). Overall, the lack of pervasive regulation of DDR genes following CDK12 silencing suggests that the increased DDR observed upon MYC activation in CDK12-KD cells may not be solely ascribed to altered expression or processing of some DDR genes. In addition, CDK12-KD did not alter MYC-dependent transcription, since MYC targets were still efficiently activated when CDK12 was silenced (Supplementary Fig. 13 and Supplementary dataset 3).

CDK12 represses RNA synthesis at damaged genes

Given the reported role of CDK12 in regulating RNAPII activity, we asked whether silencing of CDK12 might induce a global change in RNA synthesis rate, not necessarily detected by steady state expression analyses (RNA-seq). To this end, we pulse-labeled cells and evaluated RNA synthesis by quantitative immunofluorescence (IF) analysis of 5-ethynyluridine (EU) incorporation. We assessed RNA synthesis in unchallenged cells or following DNA-damaging radiation (UV or IR), which is known to repress transcription. Silencing of CDK12 led to an increase in EU incorporation, suggesting a rise in global RNA synthesis or, alternatively, an imbalance between genes repressed and genes activated by CDK12 (i.e. DEG-up genes would contribute more to global RNA synthesis than DEG-down). Surprisingly, while IR or UV irradiation caused a global decrease in RNA synthesis, the increase caused by CDK12-KD persisted in irradiated cells, as shown in multiple cell lines (Fig. 2a, b and Supplementary Fig. 14a,c). Inhibition of CDK12 by THZ531 yielded similar results (Supplementary Fig. 14b). As increased RNA synthesis caused by CDK12 depletion persisted in IR- or UV-irradiated cells, we hypothesized that damaged genes may not be repressed in these conditions. A role for CDK12 in the repression of transcription at damaged loci would have important implications, as this DNA-damage-induced transcriptional shutdown is critical for preserving genomic integrity53. To further address this, we took advantage of the U2OS-TRE-I-SceI-19 reporter cell line that allows the introduction of DNA double-strand breaks (DSBs) upstream of the promoter of a Tet-regulated MS2-based transcriptional reporter54. This system enables simultaneous locus-specific detection of nascent RNA and DSBs, at the single-cell level. 
Upon activation of the mCherry-tTA-ER transcription factor, the transcribed locus can be visualized by the colocalization of mCherry-tTA-ER and the nascent RNA, which is bound by the MS2-YFP RNA binding protein. At the same time, DSBs generated by the expression of the I-SceI restriction enzyme can be visualized by γH2AX IF-staining. As expected, upon expression of I-SceI, DSBs were generated at the locus (visualized by the colocalization of γH2AX foci with mCherry-tTA-ER) (Fig. 2d), thus leading to loss of nascent RNA (i.e. loss of the nuclear MS2-YFP dot, Fig. 2d). Indeed, less than 10% of mock silenced cells (siLuc) showed an MS2-YFP signal colocalized with γH2AX (Fig. 2e), thus confirming that upon DNA damage the activity of the transcriptional reporter was strongly suppressed. Instead, upon CDK12 silencing, transcription was rescued in 50% of the cells showing a damaged site (Fig. 2d, e). A similar rescue was observed upon inhibition of CDK12 by THZ531 (Fig. 2f, g). In both cases, loss of CDK12 activity did not affect the efficiency of DSB induction by I-SceI (Supplementary Fig. 15a, b). In the absence of I-SceI, mCherry-tTA-ER and MS2-YFP colocalized and were not affected by CDK12 silencing or inhibition (Fig. 2d, e, f, g). Altogether, these data imply that CDK12 activity is required for locus-specific transcriptional repression following DNA damage. Silencing of either CCNK or CDK13 also led to a partial rescue of transcription at the damaged TRE-I-SceI-19 reporter, while simultaneous silencing of CDK12, CDK13 and CCNK further increased transcriptional rescue of the locus, suggesting partial redundancy (Supplementary Fig. 16a,b).
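The reporter readout reduces to a per-experiment fraction of damaged cells that retain an MS2-YFP signal, which the figure legend summarizes as the average of two independent experiments with the standard deviation as error bar. A minimal sketch of that summary follows. The function name is hypothetical, and the counts used in the example are invented for illustration; they are not the paper's source data.

```python
from statistics import mean, stdev


def rescue_summary(experiments):
    """Mean and sample SD of the per-experiment rescue fraction.

    `experiments` is a list of (rescued_cells, damaged_cells_scored) pairs,
    one pair per independent experiment. A cell counts as "rescued" when the
    MS2-YFP dot colocalizes with a gammaH2AX-marked reporter locus.
    """
    fractions = [rescued / scored for rescued, scored in experiments]
    return mean(fractions), stdev(fractions)


# Hypothetical counts for one condition across two experiments.
example_counts = [(30, 60), (55, 100)]
average_fraction, sd_fraction = rescue_summary(example_counts)
```

Averaging per-experiment fractions, rather than pooling all cells, weights each independent experiment equally regardless of how many cells were scored in it.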

Fig. 2: CDK12 silencing or inhibition rescues transcription at DNA-damaged genes.

a, b Bee swarm plots of single-cell nascent-RNA synthesis (by EU incorporation) in mock or UV-irradiated cells. Cells were pulsed with 0.5 M EU for 20 min before collection. a U2OS cells: siLuc n = 1965 cells; siCDK12#1: n = 2320 cells; siCDK12#2: n = 3446 cells; siLuc+UV: n = 2363 cells; siCDK12#1 + UV: n = 5003 cells; siCDK12#2 + UV: n = 2806 cells. b RPE1 cells: siLuc n = 4100 cells; siCDK12#1 n = 4090 cells; siCDK12#2: n = 4482 cells; siLuc+UV: n = 5503 cells; siCDK12#1 + UV: n = 2897 cells; siCDK12#2 + UV: n = 4057 cells. ****: P value < 0.0001 (Kruskal Wallis test with multiple comparisons). c WB analysis. VCL is a loading control. d, e Representative IF-images (d) and bar plot (e) of the colocalization of mCherry-tTA-ER and YFP-MS2 signals upon silencing of CDK12 in U2OS-TRE-I-SceI-19 cells. Where indicated, cells were transfected with I-SceI (+SceI) to induce DSBs on the TRE-MS2 reporter. Average of two independent experiments. siRLuc: n = 58 cells, n = 109 cells; siCDK12#1: n = 97 cells, n = 117 cells; siCDK12#2: n = 48 cells, n = 94 cells; siRLuc+SceI: n = 64 cells, n = 109 cells; siCDK12#1+SceI: n = 70 cells, n = 102 cells; siCDK12#2+SceI: n = 67 cells, n = 76 cells. The error bar is the standard deviation. f, g As in d, e, but upon inhibition of CDK12. Average of two independent experiments. DMSO: n = 159 cells, n = 124 cells; THZ531: n = 123 cells, n = 93 cells; DMSO+SceI: n = 113 cells, n = 87 cells; THZ531+SceI: n = 111 cells, n = 96 cells. The error bar is the standard deviation. Source data are provided as a Source Data file.

CDK12 is recruited to damaged and transcribed genes

Next, we asked whether CDK12 could localize to sites of DNA damage. Cells were irradiated and proximity ligation assay (PLA) was used to assess the proximity of CDK12 to γH2AX foci. While signals were negligible in unirradiated cells, there was a marked increase in PLA-signals (foci) 30 minutes post-irradiation (Fig. 3a, b); by 4 hours the signal had returned to baseline level, possibly coinciding with ongoing DNA repair (Fig. 3a, b). The same result was confirmed with a different CDK12 antibody (Supplementary Fig. 17). Silencing of CDK12 lowered the PLA signal in irradiated cells, confirming the selectivity of the assay (Fig. 3a, b and Supplementary Fig. 17). Hence, CDK12 can be detected in close proximity to DDR-foci. To assess whether DNA damage and/or the resulting DDR would trigger the recruitment of CDK12, we transiently expressed an mCherry-tagged CDK12 and performed live imaging analyses on laser micro-irradiated cells. In unchallenged cells, CDK12 showed a diffused/dotted nuclear signal, while upon irradiation CDK12 promptly localized at damaged sites (Fig. 3c, Supplementary Fig. 18a, b and Supplementary Video 1). Similarly, mCherry-CDK12 was recruited onto the reporter locus only when this was cut by the I-SceI nuclease (Fig. 3d and Supplementary Fig. 19a).
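Recruitment kinetics of this kind are commonly plotted as fold enrichment at the irradiated stripe over time. The quantification procedure is not described in this section, so the following is only a sketch under stated assumptions: the stripe intensity is divided by the intensity of an undamaged nuclear region in the same frame (to correct for bleaching and expression level) and then normalized to the pre-irradiation frames. All function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def relative_recruitment(stripe, nucleus, n_pre=2):
    """Fold enrichment at the damaged stripe relative to pre-irradiation frames.

    `stripe` and `nucleus` are per-frame mean intensities of the irradiated
    area and of an undamaged nuclear area; the first `n_pre` frames are
    assumed to precede micro-irradiation.
    """
    if n_pre < 1 or len(stripe) != len(nucleus) or len(stripe) <= n_pre:
        raise ValueError("need equal-length traces longer than n_pre frames")
    ratios = [s / n for s, n in zip(stripe, nucleus)]
    baseline = mean(ratios[:n_pre])
    return [r / baseline for r in ratios]


def summarize_cells(traces):
    """Per-frame (mean, SD) across cells; needs at least two equal-length traces."""
    return [(mean(frame), stdev(frame)) for frame in zip(*traces)]
```

With this normalization a value of 1 means no enrichment over the pre-damage state, so conditions such as DMSO, DRB or triptolide treatment can be compared on the same axis.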

Fig. 3: CDK12 is recruited on DNA damaged sites proximal to transcribed genes.

a, b Analysis of proximity of CDK12 and γH2AX foci by PLA in U2OS cells upon CDK12 silencing (shCDK12) or mock (shNT, non-targeting). a Representative pictures of mock silenced cells (shNT). shNT: mock n = 284 cells; 30’ n = 317 cells; 4 h n = 301 cells. shCDK12: mock n = 283 cells; 30’ n = 303 cells; 4 h n = 310 cells. b Box plot of PLA signals (nuclear dots) at different times post-irradiation (10 Gy). The box extends from the 25th to 75th percentiles, the error bar indicates minimum and maximum. P values calculated by Mann-Whitney test. c Kinetics of recruitment of mCherry-CDK12 to laser-damaged sites in U2OS cells treated or not with 100 µM DRB or 1 µM triptolide (TPL). Left, representative pictures; the red boxes highlight the irradiated areas. Right, the time-series plot shows the average and the standard deviation. DMSO: n = 5 cells; DRB: n = 13 cells; TPL: n = 12 cells. ****: P-value < 0.0001 (Two way anova with sidak’s multiple comparisons test). d Colocalization of mCherry-CDK12 and γH2AX foci at SceI-induced DSBs in U2OS-TRE-I-SceI-19 cells treated or not with 100 µM DRB or 1 µM triptolide (TPL). Left, pictures of mCherry-CDK12 and γH2AX signals. Right, box plots of two independent experiments. MOCK n = 43 cells, n = 60 cells; DRB n = 52 cells, n = 61 cells; TPL n = 50 cells, n = 57 cells. e, f IF analysis of locus specific double strand breaks induced by CRISPR/Cas9 editing at select loci. The error bar is the standard deviation. e Representative images. f Bar plot reporting the average colocalization of CDK12 and γH2AX foci in DRB treated cells (average of two experiments). 
DMSO POLR1B −1kb: n = 28 cells, n = 31 cells; DRB POLR1B −1kb: n = 37 cells, n = 26 cells; DMSO POLR1B −2kb: n = 23 cells, n = 32 cells; DRB POLR1B −2kb: n = 29 cells, n = 30 cells; DMSO IFRD2 −1kb: n = 27 cells, n = 29 cells; DRB IFRD2 −1kb: n = 28 cells, n = 25 cells; DMSO IFRD2 −2kb: n = 23 cells, n = 25 cells; DRB IFRD2 −2kb: n = 12 cells, n = 27 cells; DMSO MCM2 −1kb: n = 24 cells, n = 35 cells; DRB MCM2 −1kb: n = 21 cells, n = 28 cells; DMSO MCM2 −2kb: n = 23 cells, n = 27 cells; DRB MCM2 −2kb: n = 18 cells, n = 30 cells; DMSO DISTAL#1: n = 19 cells, n = 30 cells; DRB DISTAL#1: n = 23 cells, n = 24 cells. The error bar is the standard deviation. Source data are provided as a Source Data file.

Further analysis with chemical inhibitors revealed two important features of the CDK12 relocalization. First, CDK12 inhibition by THZ531 did not prevent its recruitment to laser-damaged sites, thus indicating that this was independent of CDK12 catalytic activity (Supplementary Fig. 18c), yet this required CCNK (Supplementary Fig. 16c). Second, the recruitment of CDK12 onto damaged sites in both microirradiated and I-SceI transfected cells was suppressed by the transcriptional inhibitors 5,6-Dichloro-1-beta-D-ribofuranosylbenzimidazole (DRB) or triptolide, indicating that it depended on RNAPII transcriptional activity (Fig. 3c, d). Similarly, lack of activation of rtTA and the consequent recruitment of the transcriptional machinery prevented the localization of CDK12 on the damaged reporter in I-SceI transfected cells (Supplementary Fig. 19b). To verify CDK12 recruitment on endogenous DNA damaged loci, we took advantage of the CRISPR/Cas9 genome editing system and designed sgRNAs to generate DSBs either at the promoter of expressed genes (MCM2, IFRD2 and POLR1B) or at non-transcribed distal regions. While sgRNA- and Cas9-mediated induction of DSBs proximal to promoters induced two γH2AX foci that co-localized with CDK12, the latter was absent at DSBs induced at the non-transcribed distal sites (Fig. 3e, f). At transcribed loci, CDK12 localization depended on the activity of RNAPII, since it was reduced to background level by DRB (Fig. 3f). Also, CDK12 recruitment was low and near background levels when DSBs were induced by sgRNAs targeting the transcriptional termination sites of transcribed genes (Supplementary Fig. 19c). Overall, these results suggest that CDK12 is recruited to DSBs that occur proximal to the promoter of transcribed loci or at gene bodies.

PARP, but not ATM- or ATR-dependent DDR, controls the recruitment of CDK12 to damaged genes

Next, we asked which components of the DDR would be required for the recruitment of CDK12 to DNA-damaged sites. Chemical inhibitors of either ATM or ATR did not prevent mCherry-CDK12 localization to laser-irradiated DNA (Fig. 4a, b). In addition, recruitment of CDK12 was preserved in H2AX-KO cells (Fig. 4c and Supplementary Fig. 20a,b). On the other hand, inhibition of PARP, which has also been implicated in the control of DDR and transcription at damaged sites55, prevented CDK12 recruitment to DDR-sites (Fig. 4d, e). PARP1 recruitment to DDR sites was independent of either ATR or ATM activity, suggesting that PARP1 and ATM/ATR may control DDR signaling independently (Supplementary Fig. 20c). We next evaluated whether ATM, PARP or CDK12 inhibition would affect RNA synthesis at the damaged MS2-reporter in I-SceI cells. Inhibition of either PARP or CDK12 rescued transcription, while their combined inhibition led to a marginal increase of the percentage of cells showing transcriptional rescue, consistent with the evidence that PARP and CDK12 are on the same signaling axis (Fig. 4f). As previously reported56, ATM inhibition rescued transcription (Fig. 4f). Combination of ATM and CDK12 inhibition did not significantly increase the rescue of transcription, compared to the single inhibitors (Fig. 4f). This suggests that while CDK12 recruitment to damaged sites is ATM-independent, the ATM and PARP/CDK12 signaling may converge downstream to control transcription at damaged sites.

Fig. 4: PARP-dependent DDR-signaling is necessary for CDK12 recruitment to DNA-damaged sites.

a, b Recruitment of mCherry-CDK12 to laser-damaged sites in U2OS cells treated with ATMi (10 µM KU-5593) or ATRi (10 µM VE-821). a Irradiated areas are highlighted by a white frame. DMSO: n = 14 cells; ATMi: n = 17 cells; ATRi: n = 13 cells. The graph shows the average and the standard deviation. c Recruitment of mCherry-CDK12 to laser-damaged sites in H2AX knock-out cells. U2OS wt: n = 18 cells; U2OS KO: n = 26 cells. The graph shows the average and the standard deviation. d, e Recruitment of mCherry-CDK12 to laser-damaged sites in U2OS cells treated with 1 µM olaparib or 10 µM veliparib. d Irradiated areas are highlighted by a red frame. DMSO: n = 6 cells; veliparib: n = 27 cells; olaparib: n = 25 cells. The graph shows the average and the standard deviation. ****: P value < 0.0001 (Two way anova with sidak’s multiple comparisons test). f Nascent RNA analysis at the DNA-damaged TRE-MS2 reporter locus of U2OS-TRE-I-SceI-19 cells treated with the indicated compounds. DMSO: n = 124 cells, n = 130 cells; THZ531: 4 h n = 144 cells, n = 130 cells; THZ531: 24 h n = 150 cells, n = 146 cells; ATMi: 24 n = 132 cells, n = 115 cells; ATMi 4 h + THZ531: 24 h n = 129 cells, n = 101 cells; PARPi: n = 146 cells, n = 167 cells; PARPi 4 h + THZ531: 24 h n = 126 cells, n = 154 cells. The graph shows the average and the standard deviation. Source data are provided as a Source Data file.

CDK12 prevents the recruitment of CDK9 to damaged genes

Our experiments showed that CDK12 recruitment to damaged promoter-proximal regions depended upon the recruitment of active transcriptional complexes (tTA-ER) and the presence of a transcribing RNA polymerase. This possibly implies that on these loci CDK12 may repress the activity of the elongating RNAPII. Given the prominent role of CDK9 in elongation, we tested whether CDK12 might modulate CDK9 recruitment to damaged DNA. Consistent with transcriptional inhibition, CDK9 did not localize at laser-irradiated DNA in mock-silenced cells. Instead, silencing of CDK12 allowed the recruitment of CDK9 on damaged DNA (Fig. 5a, Supplementary Fig. 21a and Supplementary Video 4, 5). Similarly, I-SceI-induced DSBs reduced the co-localization of CDK9 and mCherry-tTA-ER; conversely, silencing of CDK12 engendered CDK9 positive spots proximal to DDR foci (γH2AX) and transcriptional foci (mCherry-tTA-ER) (Fig. 5b, c and Supplementary Fig. 21b). This suggested that CDK12 blocks the transcription of damaged genes by preventing CDK9 recruitment.

Fig. 5: CDK12 is recruited by SPT5 to prevent CDK9 loading to DNA-damaged sites.

a Silencing of CDK12 allows the recruitment of EGFP-CDK9 to laser-damaged DNA. siRLuc: n = 23 cells; siCDK12#1: n = 21 cells. The time-series graph shows the average and the standard deviation. ****: P value < 0.0001 (Two way anova with sidak’s multiple comparisons test). b, c Colocalization of mCherry-tTA-ER and EGFP-CDK9 at the promoter of the MS2 reporter in U2OS-TRE-I-SceI-19 cells, at either un-damaged promoters (mock) or I-SceI cut promoters (+SceI). Loss of CDK12 expression enhances localization of CDK9 at the DNA damaged reporter. b Quantification of two independent experiments. siRLuc: n = 70 cells, n = 150 cells; siCDK12#1: n = 83 cells, n = 122 cells; siCDK12#2: n = 67 cells, n = 122 cells; siRLuc+SceI: n = 136 cells, n = 166 cells; siCDK12#1+SceI: n = 90 cells, n = 107 cells; siCDK12#2+SceI: n = 103 cells, n = 148 cells. The graph shows the average and the standard deviation. d Recruitment of mCherry-CDK12 at laser-damaged DNA in NELFc-silenced cells. Silencing the NELF complex (NELFc) does not affect CDK12 recruitment to laser-damaged DNA. siRLuc: n = 13 cells; siNELF complex: n = 18 cells. The graph shows the average and the standard deviation. e NELF-E recruitment at laser-damaged DNA in CDK12 silenced cells. siRLuc: n = 7 cells; siCDK12#1: n = 10 cells; siCDK12#2: n = 18 cells. The graph shows the average and the standard deviation. f Recruitment of mCherry-CDK12 at laser-damaged DNA in SPT5 silenced cells. siSPT5 prevents the recruitment of CDK12 to laser-damaged loci. siRLuc: n = 18 cells; siSPT5#1: n = 21 cells; siSPT5#2: n = 19 cells. The graph shows the average and the standard deviation. ****: P value < 0.0001 (Two way anova with sidak’s multiple comparisons test). g, h Loss of mCherry-CDK12 localization at the I-SceI cut reporter locus of U2OS-TRE-I-SceI-19 cells, following SPT5 silencing. g Representative images. h Bar plot of the fraction of cells showing colocalization of mCherry-CDK12 and γH2AX. 
siRLuc: n = 123 cells, n = 135 cells; siSPT5#1: n = 126 cells, n = 136 cells; siSPT5#2: n = 118 cells, n = 119 cells. The graph shows the average and the standard deviation. Source data are provided as a Source Data file.

Two major complexes, NELFc and DSIF, regulate pause-release and elongation by controlling CDK9 recruitment to RNAPII. The NELF-E subunit of NELFc can repress transcription at damaged promoters57, thus raising the question of whether NELFc may mediate CDK12 association to damaged loci. However, silencing of NELF-E or other NELFc subunits did not affect CDK12 recruitment on micro-irradiated nuclear regions (Fig. 5d and Supplementary Fig. 22a-c). Moreover, silencing of CDK12 did not prevent NELF-E recruitment, thus indicating that both proteins are recruited independently to DSBs (Fig. 5e). We next addressed the DSIF complex, a heterodimer composed of SPT4/5. While silencing SPT4 did not alter CDK12 recruitment to DSBs (Supplementary Fig. 22d,e), knock-down of SPT5 prevented it (Fig. 5f and Supplementary Video 6). This was also confirmed in the MS2-reporter line, where silencing of SPT5 decreased the colocalization of CDK12 with DDR foci (Fig. 5g, h). Overall, these results indicate that upon DNA breaks proximal to promoter regions, CDK12 is recruited by the DSIF complex, thus preventing CDK9 loading and processive elongation by RNAPII.

Loss of CDK12 exacerbates replicative stress and transcription-replication conflicts in MYC-overexpressing cells

Replicative stress induced by MYC has been associated with premature initiation of DNA synthesis and the collision of the replicative apparatus with loci transcribed during early S-phase19,21. For this reason, we evaluated whether silencing of CDK12 would exacerbate MYC-induced RS. Cell cycle analysis of asynchronous cultures showed that silencing of CDK12 in MYC-overexpressing cells led to an increase in the S-phase fraction (Supplementary Fig. 4c), which, associated with the increased DDR (Fig. 1e), was suggestive of RS. This was confirmed in experiments where cells were released from a mitotic block to allow their synchronous initiation of DNA replication. While MYC stimulated S-phase entry, silencing of CDK12 alone had little effect on cell cycle progression (Fig. 6a). On the other hand, the combination of MYC activation and CDK12 silencing delayed progression through S-phase, as evidenced by EdU incorporation profiles that were still predominantly composed of early S-phase cells (i.e., EdU positive cells with DNA content closer to 2N), both at 12 and 18 hours post-release (Fig. 6a and Supplementary Fig. 23a). Overall, this suggested that loss of CDK12 increased MYC-induced RS. This was supported by quantitative analysis of DNA replication foci by PLA: combined MYC activation and CDK12 loss increased the number of PCNA/EdU foci, indicating an increased compensatory firing of replication origins, which is typically observed upon RS (Supplementary Fig. 23b, c). Consistent with the onset of RS, the increased γH2AX observed upon loss of CDK12 and MYC activation was associated with newly replicated DNA (Fig. 6b, c). Given the predominant role of transcription-replication conflicts (TRCs) as a potential source of DSBs positioned near transcribed genes, and considering that our data implicate CDK12 in the control of transcription at damaged genes, we asked whether loss of CDK12 would promote the formation of RNA-DNA hybrids. 
In particular, we reasoned that the loss of CDK12 might increase the chances of forming RNA-DNA hybrids (and R-loops), due to the lack of transcription inhibition at TRC loci and increased annealing of the nascent RNA onto the template DNA. Indeed, in cells depleted of CDK12, we observed robust formation of RNA-DNA hybrids following laser-induced DNA damage, as assessed by enhanced recruitment of RNaseH1-GFP (Fig. 6d, e). For a quantitative assessment of TRCs induced by MYC activation, we employed PLA for the detection of RNA-DNA hybrids (detected by the S9.6 antibody) and newly synthesized DNA (EdU-labeled). Knockdown of CDK12 marginally increased the number of S9.6/EdU foci, while MYC activation increased the formation of S9.6/EdU foci (Fig. 6f, g), as expected given its ability to induce TRCs19. Importantly, combined activation of MYC and silencing of CDK12 led to a stronger increase of S9.6/EdU foci, thus indicating that loss of CDK12 may indeed exacerbate TRCs (Fig. 6f, g). While these data suggest increased TRCs, whether these RNA-DNA hybrids are causal and/or linked to increased R-loop formation will need further assessment. Enhanced MYC-induced TRCs were also confirmed by PLA analysis of PCNA and RNAPII, which showed increased signals upon CDK12 silencing (Supplementary Fig. 23e, f). In line with previous reports33, activation of MYC led to decreased PCNA/RNAPII signals, whose significance will need further investigation. The evidence that silencing of CDK12 enhanced MYC-induced TRCs suggested a protective role of CDK12, which might be recruited to TRC loci once DNA is damaged. In line with this hypothesis, PLA experiments showed that MycER activation enhanced colocalization of CDK12 with DDR foci (Fig. 6h and Supplementary Fig. 23d).

Fig. 6: Loss of CDK12 enhances MYC-induced replicative stress.
a Cell cycle entry (12 hours post-mitotic release) of U2OS-MycER cells analyzed by FACS. b, c PLA to evaluate proximity of γH2AX and nascent DNA (EdU-labeled) upon MycER activation (+OHT) and CDK12 silencing. b Representative images. c Bee swarm plot. shNT: n = 437 cells (mock), n = 597 cells (+OHT), shCdk12#1: n = 617 cells (mock), n = 567 cells (+OHT), shCDK12#2: n = 438 cells (mock), n = 438 cells (+OHT). **: P value < 0.01, ***: P value < 0.001 (Kruskal-Wallis test with multiple comparisons). d, e Live microscopy analysis of RNaseH1 recruitment to micro-irradiated DNA in U2OS cells silenced for CDK12. d Time-series graph showing the average and the standard deviation. ****: P value < 0.0001 (two-way ANOVA with Sidak's multiple comparisons test). siRLuc n = 10 cells; siCDK12#1 n = 12 cells; siCDK12#2 n = 12 cells. e Representative images. f, g PLA to evaluate the proximity of RNA-DNA hybrids, stained with the S9.6 antibody, and nascent DNA (EdU-labeled). f Representative images. g Bee swarm plot. shNT: n = 211 cells (mock), n = 640 cells (+OHT), shCdk12#1: n = 404 cells (mock), n = 445 cells (+OHT), shCDK12#2: n = 538 cells (mock), n = 429 cells (+OHT). ****: P value < 0.0001 (Kruskal-Wallis test with multiple comparisons). h Bee swarm plot of PLA of CDK12 and γH2AX in U2OS-MycER cells. Where indicated (Cdk12-OE), cells were transfected with a plasmid encoding CDK12. EV: n = 534 cells (mock), n = 326 cells (+OHT), Cdk12-OE: n = 410 cells (mock), n = 333 cells (+OHT). ****: P value < 0.0001 (Kruskal-Wallis test with multiple comparisons). Source data are provided as a Source Data file.

Loss of CDK12 exacerbates DSBs due to co-directional TRCs at early-replicating regions

The data above suggested TRCs as a cause of DNA damage proximal to newly synthesized DNA. To seek direct evidence, we mapped early replicating regions by EdU-HU-seq19 and DSBs by BLISS58,59. Since BLISS signals are inherently sparse and “digital”, due to their single-base resolution, we devised an algorithm to identify genomic regions more prone to undergo DSBs (i.e., enriched in BLISS signals), henceforth dubbed BLISS+ regions.
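
The algorithm itself is not detailed in this section. Purely as an illustration of how sparse, single-base break counts could be grouped into enriched regions, the sketch below bins break positions into fixed windows and merges adjacent windows that reach a count threshold. The function name `call_enriched_windows`, the window size and the threshold are hypothetical choices and are not taken from the authors' pipeline.

```python
from collections import Counter

def call_enriched_windows(break_positions, window=1000, min_breaks=5):
    """Bin single-base break positions into fixed-size windows and merge
    adjacent windows that each contain at least min_breaks breaks.
    Window size and threshold are illustrative assumptions."""
    counts = Counter(pos // window for pos in break_positions)
    hot_bins = sorted(b for b, n in counts.items() if n >= min_breaks)
    regions = []
    for b in hot_bins:
        start, end = b * window, (b + 1) * window
        if regions and regions[-1][1] == start:
            # Contiguous with the previous enriched window: extend it.
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))
    return regions

# Toy coordinates on a single chromosome (not real data).
breaks = [100, 150, 200, 250, 300, 1100, 1200, 1300, 1400, 1500, 9000]
print(call_enriched_windows(breaks))  # [(0, 2000)]
```

An isolated break (position 9000 in the toy input) does not produce a region, which mirrors the distinction drawn above between BLISS signals and BLISS+ regions.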

Of note, while BLISS signals could be detected in all of the conditions considered, BLISS+ regions could only be identified in cells in which MycER was activated, possibly reflecting the stochastic nature of DSBs arising in either wild-type or shCDK12 cells. Activation of MycER in shCDK12 cells triggered a more than three-fold increase in the number of BLISS+ regions (5473 vs 1637, Fig. 7a). Of these, only a minority (12%) was also detected in shNT-MycER cells (Fig. 7a). In addition, BLISS+ regions detected in shCDK12 cells were more conserved among replicates than those detected in wild-type cells (Supplementary Fig. 24a), suggesting that these were recurrent hotspots for DSBs. Also, while BLISS+ regions detected upon MycER activation in shNT cells were predominantly distributed in intergenic regions, in shCDK12 cells BLISS+ regions preferentially mapped to either promoters or intragenic regions (74%, Fig. 7b). This suggested that transcription was a causal factor in the generation of DSBs when MYC was activated and CDK12 silenced. Given the role of early DNA replication as a potential source of DNA damage in MYC-overexpressing cells19, we used EdU-HU-seq to identify early replicating regions (ERRs, genomic regions enriched in early replicating origins). In shCDK12-MycER cells, about half of the BLISS+ regions overlapped with ERRs (47%) (Fig. 7c and Supplementary Fig. 24b), either at boundaries or within ERRs (Fig. 7d and Supplementary Fig. 25a), while the remaining ones were proximal to sparse EdU signals (Supplementary Fig. 26). ERR-associated BLISS+ regions had two peculiar topological features: (i) they were often positioned at the boundary of the EdU-HU-seq signal (either on the left or on the right side) and (ii) they were adjacent to the promoter of an expressed gene (Fig. 7e; Supplementary Fig. 25b and 26). 
Directionality analysis of both DNA and RNA synthesis indicated that these events were predominantly co-directional, so that, almost invariably, BLISS+ regions were positioned between the front of an incoming early DNA replicating locus and the promoter of an expressed gene (Fig. 7e, f and Supplementary Fig. 26). In line with the above, the majority of the BLISS+ regions outside ERRs (about 56%) were also associated with sparse EdU-HU-seq signals (due to DNA synthesis from interspersed early replicating origins) and transcribed genes, mostly in a co-directional, head-to-tail orientation of the upcoming DNA replication front and the transcribed gene (Supplementary Fig. 27). On the other hand, in shNT-MycER cells, only 26% of the BLISS+ regions were near ERRs (Fig. 7c), and, when proximal to early replicated DNA, they were less frequently associated with transcribed genes (56% of the BLISS+ regions did not overlap with expressed genes) (Fig. 7b). In addition, these BLISS+ regions were neither strongly associated with promoters nor with the other topological features found in shCDK12-MycER cells, since only a minority were positioned between co-directional ERRs and transcribed genes (Supplementary Fig. 28, 29). The observation that the DSB-prone regions in shCDK12-MycER cells were associated with expressed genes strongly suggested that loss of CDK12 exacerbated MYC-driven TRCs, leading to accumulation of unresolved DSBs. We next asked whether, at these loci, DNA synthesis and/or RNA expression were modulated by either MYC activation or silencing of CDK12. EdU-HU-seq indicated that DSB-prone regions in shCDK12-MycER cells (BLISS+ regions) were the replicating regions most strongly stimulated by MycER activation (Fig. 7g, Supplementary Fig. 30). We also noticed that silencing of CDK12 led to a general reduction of MYC-induced DNA replication, since the EdU incorporation levels were lower than those assessed upon MycER activation in mock-silenced cells (Fig. 7g, h, Supplementary Fig. 30). This potentially reflected the activity, in trans, of the replication checkpoint. Silencing of CDK12 alone did not affect early replication (Fig. 7g, Supplementary Fig. 30). In addition, while transcripts near BLISS+ regions were not preferentially regulated by MycER activation or by CDK12 silencing, they were synthesized more rapidly than all the other expressed genes (Fig. 7i). Thus, these DSBs were generally associated with regions of strong MYC-induced DNA synthesis and high rates of RNA synthesis, suggesting that (i) loss of CDK12 exacerbated TRCs in regions of intense RNA and DNA synthesis and (ii) these conflicts were triggered by MYC due to its ability to anticipate and boost DNA synthesis (Supplementary Fig. 31).
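
The overlap and orientation calls described above reduce to interval arithmetic. The following minimal sketch, with made-up coordinates, shows how the fraction of DSB-prone regions overlapping early replicating regions could be computed and how co-directionality could be scored. The function names and the convention that a rightward-moving fork is labeled "+" are assumptions made for illustration only.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]

def fraction_overlapping(regions, errs):
    """Fraction of regions that intersect at least one early replicating region."""
    if not regions:
        return 0.0
    hits = sum(any(overlaps(r, e) for e in errs) for r in regions)
    return hits / len(regions)

def is_codirectional(fork_direction, gene_strand):
    """Fork and RNA polymerase travel the same way along the chromosome
    when the fork direction ('+' = rightward, by assumption) matches
    the strand of the transcribed gene."""
    return fork_direction == gene_strand

# Toy intervals on one chromosome (not real data).
bliss_regions = [(100, 200), (500, 600), (900, 950), (2000, 2100)]
early_regions = [(150, 400), (940, 1200)]
print(fraction_overlapping(bliss_regions, early_regions))  # 0.5
print(is_codirectional("+", "+"))  # True
```

For genome-scale data, a sorted sweep or an interval tree would replace the quadratic comparison used here for clarity.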

Fig. 7: Genome-wide mapping of transcription replication conflicts.
a BLISS+ regions identified upon MycER activation and CDK12 (shCDK12) or mock silencing (shNT). In dark gray is the fraction of BLISS+ regions identified in both the shNT and shCDK12 datasets. b Pie charts of the genomic distribution of BLISS+ regions. c Bar plot of the BLISS+ regions identified upon MycER activation and the fraction of BLISS+ regions overlapping with early replicating regions (ERR, indicated in brown). d Genome browser snapshot of the indicated chromosome 1 region (ROI) showing ERR, EdU-HU-seq and EU-seq signals (for MycER-activated, shCDK12 cells), and BLISS+ regions (MycER-activated cells). e Genome browser snapshots of representative genes identified next to a BLISS+ region and an ERR. Arrows indicate the direction of DNA replication (EdU-HU-seq) and transcription (EU-seq). f Left, clustered heatmaps of BLISS+ regions overlapping with ERR. Heatmaps show promoter position along with EdU-HU-seq (DNA replication), BLISS+ regions (DSBs) and EU-seq signals of 10 kb genomic ranges centered on the BLISS signal. The red bar highlights the clusters showing co-directional transcription and DNA synthesis. Right, signal distribution profiles of the regions of each cluster. Arrows indicate the direction of DNA synthesis (assessed by EdU-HU-seq) and RNA synthesis (by EU-seq). g EdU-HU-seq signal profiles of the left boundary of ERRs overlapping with a BLISS+ region (left) or ERRs not overlapping with a BLISS+ region (right). h Box plot of the EdU-HU-seq signal shown in (g). No BLISS = 3236 regions; BLISS+ = 398 regions. The box plot displays 95% of the distribution with the median; error bars indicate the minimum and maximum values. i RNA synthesis (EU-signal distribution) of genes next to ERR adjacent to BLISS+ regions (red line), all the expressed genes (black line) or the expressed genes adjacent to an ERR not associated with a BLISS+ region (gray line). Vertical bars indicate the median. Source data are provided as a Source Data file.

Discussion

Transcriptional silencing upon DNA damage is a process regulated by signaling pathways activated by the DDR60. This process is triggered by multiple DNA lesions, including DSBs. By inhibiting the activity of transcriptional complexes, cells avoid unnecessary synthesis of potentially faulty mRNAs (a source of transcription-dependent mutagenesis)61, prevent molecular crowding by transcriptional complexes and repair factors, and create a molecular scaffold (e.g., R-loops) to facilitate DNA damage resolution. Current models highlight the prominent roles of apical regulators of the DDR, like ATM and PARP1, which control chromatin remodeling, deposition of repressive histone marks and accessibility at damaged sites56,62, thus leading to repression of gene transcription. Our data show for the first time that CDK12 is recruited to transcribed loci upon DNA damage to repress gene expression (Supplementary Fig. 32). This implies an active mechanism that allows selective localization of CDK12 at damaged sites. Genetic dissection of potential upstream regulators revealed that CDK12 localization at these sites is independent of the DDR signaling controlled by ATM and of γH2AX, but instead relies on PARP activity. This may suggest a branched control of transcription at damaged sites, whereby ATM regulates the epigenetic state and accessibility56, while PARP may have the added function of regulating RNAPII activity by recruiting CDK12. Since several RNAPII-associated factors are PARylated following DNA damage55, it is possible that these may also promote CDK12 localization, possibly contributing to transcriptional inhibition at damaged loci. Selective recruitment of CDK12 at damaged DNA also required an elongation-competent RNAPII, as it was impaired by inhibition of initiation by triptolide or inhibition of elongation by DRB.

RNAPII escape from promoters is dynamically regulated by its association with the NELF complex and the DSIF complex, which, by binding to “opposite” sides of the RNAPII holoenzyme, control its pausing and pause-release. Two subunits of the NELF complex, NELF-A and NELF-E, are recruited to RNAPII upon DNA damage to repress transcription57. Our data suggest that this is independent of CDK12, since (i) silencing of CDK12 or its inhibition did not prevent NELF-E localization to damaged DNA and (ii) silencing of NELF-E or the full complex did not affect CDK12 localization to damaged DNA. Instead, CDK12 recruitment required SPT5, one of the two subunits of the DSIF complex. This suggests an independent control of DNA damage-induced pausing by the NELF complex and by the CDK12/DSIF complex.

DSIF, once phosphorylated by P-TEFb, is converted into a positive elongation factor stably associated with RNAPII. Therefore, our observation that CDK9 inhibition by DRB suppresses localization of CDK12 at damaged and transcribed loci suggests that the “elongating” form of SPT5 favors CDK12 recruitment. This also implies that, at least in principle, CDK12 could support transcriptional inhibition at intragenic damaged sites, and not only at promoter-proximal sites.

CDK12 is recurrently mutated in several cancers: in ovarian and prostate cancers63,64 homozygous (loss of function) mutations are associated with genomic instability characterized by tandem focal duplication at gene-dense regions located at early and late replicated domains63,65,66.

While CDK12 was reported to regulate processing and expression of genes responsible for DDR signaling and homologous recombination, these genes are not altered in CDK12 mutant tumors63. In addition, CDK12 loss of function tumors display genomic instability, mutational profiles and therapeutic responses that are different from homology-repair deficient (HRD) tumors63,64,66,67, thus indicating that CDK12 mutant tumors are distinct from HRD-tumors and that loss of CDK12 affects genome integrity also by mechanisms distinct from homology directed repair.

In light of our findings, we propose that CDK12, besides regulating processing and expression of HR-genes, may protect genome integrity by repressing gene transcription and thus facilitating the resolution of DNA damage due to TRCs68.

CDK12 is also recurrently amplified in tumors of different origin, suggesting its potential duality in cancer as either an oncogene or a tumor suppressor, depending on the context39,69. This is similar to other DDR genes, which, depending on the genetic background, may support or suppress tumor growth. Consistent with a protective function of CDK12 in tumor cells, we and others have found that CDK12 is required to prevent cytotoxic DNA damage and favor cell survival upon oncogenic activation of MYC47,51. In particular, we report here that loss of CDK12 predisposes cells to TRCs and RS. Replicative stress has emerged as a hallmark of oncogene-driven proliferation, which may act as a tumor suppressive barrier or a therapeutic liability6. Several lines of evidence indicate that MYC-induced RS is restrained by safeguarding mechanisms whose loss leads to accumulation of cytotoxic DNA damage24,70. Still to be fully established are the potential causes of RS and the safeguarding pathways engaged70. Recent evidence suggests that anticipated S-phase entry driven by MYC may predispose to TRCs, thus leading to fork stalling and accumulation of DNA breaks19. In line with this, RS triggered by CDK12 loss led to the accumulation of DNA breaks preferentially located between early replicated regions and genes transcribed in a co-directional orientation. Such regions were characterized by strong MYC-driven DNA replication, but were not selective for MYC-regulated genes. This suggests that MYC-induced DNA replication, rather than transcription, is precipitating these TRCs. Based on the evidence presented here, it is likely that loss of CDK12 (which leads to DDR-resistant gene expression) may affect how efficiently DNA damage is resolved at these sites. It thus follows that the increased detection of DSBs might be more likely due to defective repair than to increased TRC frequencies. 
Other possibilities exist: for instance, we cannot exclude that DDR activation at stalled forks (in the absence of DSBs) might activate CDK12 to stop transcription and avoid fork collapse and DSBs.

Several factors have been implicated as causal in RS-induced genomic instability; here, the inactivation of CDK12 allowed us to map the precise location of DSBs. This suggests that the management of DNA damage at these sites of potential TRCs is a major liability in MYC-driven cancers. On a broader scale, this may imply a prominent role for TRCs as a source of genome instability that could be exploited therapeutically to induce cytotoxic DNA damage.

We posit that future advancements in the mechanistic understanding of processes controlling transcription at damaged genes have potential for the identification of valuable therapeutic targets in cancer treatment. While loss-of-function analyses have allowed the identification of factors that regulate CDK12 recruitment to damaged genes and the dissection of how CDK12 modulates RNAPII activity, the lack of biochemical data and structure-function analyses prevents a precise definition of how CDK12 is recruited to damaged genes or a detailed description of how CDK12 may control CDK9 recruitment. Further studies will be needed to fully define this pathway.

Methods

Cell lines

HeLa, U2OS, U2OS-TRE-I-SceI-19 (kind gift from Dr. Yasui Akira, Tohoku University, Japan) and U2OS-H2A.X KO cells (kindly provided by Stephen P. Jackson, Wellcome Trust/CRUK Gurdon Institute and Department of Biochemistry, University of Cambridge, UK) were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin in a 5% CO2 incubator. For CDK12 stable knockdown, U2OS-MycER cells (bleomycin resistant)27 were transduced with doxycycline-inducible human shRNAs (Transomic) and cultured in DMEM containing 10% tetracycline-free FBS. iCut-RPE-1 hTERT cells71 (kindly provided by René H. Medema, Oncode Institute, Division of Cell Biology, The Netherlands Cancer Institute, Amsterdam) were maintained in DMEM/F12 GlutaMAX medium supplemented with 10% tetracycline-free FBS, 1% penicillin/streptomycin and puromycin (20 µg/ml).

siRNAs, Plasmids and Transfection

Human siRNAs against CDK12 and siRLuc were purchased from Ambion (Silencer Select s28621, s28623). For all siRNA transfections, cells were reverse-transfected using Lipofectamine RNAiMAX reagent (Invitrogen by Life Technologies, Monza, Italy).

All plasmids were transfected using Lipofectamine 3000 reagents (Invitrogen by Life Technologies, Monza, Italy) according to the manufacturer’s instruction.

Cloning

The mCherry-CDK12 expression plasmid was generated by PCR using the pRSET-mCherry and PTX002-3XFLAG-CDK12 plasmids as templates. Briefly, a first PCR product (T1) was generated by using a forward primer designed to introduce the sequence of the NheI restriction enzyme at the 5’ end of the mCherry cDNA and a reverse primer designed to add the 5’ end of the CDK12 cDNA to the 3’ end of the mCherry cDNA. A second PCR product (T2) was generated using a forward primer designed to add the 3’ end of the mCherry cDNA to the 5’ end of the CDK12 cDNA and a reverse primer designed to overlap the sequence of the XmaI restriction enzyme present in the 5’ portion of CDK12 cDNA. Next, the two PCR products were used as template of a third PCR aimed to generate the final fragment, then inserted in the PTX002 plasmid by NheI (NEB) and XmaI (NEB) digestion and ligation. The following primers were used: T1 FWD CATCATGCTAGCCACCATGGTGAGCAAGGGCGAGGAGGA; T1 REV CCCATGTCTCTCTGAATTGGGCTTGTACAGCTCGTCCATGC; T2 FWD GCATGGACGAGCTGTACAAGCCCAATTCAGAGAGACATGGG and T2 REV CTTTTTCGGTCTGTTTAGCTTTTAGTAAG.

Western Blotting

Cells were lysed in lysis buffer containing 20 mM Hepes, 0.5 mM NaCl, 5 mM EDTA, 10% Glycerol, 1% Triton-X with protease and phosphatase inhibitors (Roche), kept 15 min on ice and then subjected to 10 cycles of sonication (10 s on, 10 s off) with the Bioruptor Plus (Diagenode UCD-300).

Nascent RNA labeling (EU labeling)

U2OS cells were reverse transfected and grown on glass coverslips for 48 h. Then, cells were irradiated at 10 Gy and fixed 30 minutes after IR. Alternatively, cells were irradiated with UV (15 J/m2). Cells were pulse-labeled with 1 mM EU for 20 minutes before collection and fixation with 4% PFA. EU incorporation was detected by Click-iT technology (Invitrogen) with an Andy Fluor 488 azide, according to the manufacturer’s protocol. Image analysis was performed by measuring nuclear EU signal intensity with a custom script (https://github.com/0774v10/CDK12-MYC-transcription-replication/compare/v1.0.0…main).
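
The custom script is not reproduced here. As a rough sketch of the underlying measurement, assuming that a nuclear mask (for example, from thresholding a DNA counterstain) is already available, the mean EU signal inside the mask could be computed as follows. The function name `mean_nuclear_intensity` and the toy arrays are hypothetical and do not describe the authors' implementation.

```python
def mean_nuclear_intensity(image, mask):
    """Mean pixel intensity over positions where mask is True.
    image and mask are equally sized 2D lists (rows of pixels)."""
    values = [px
              for img_row, mask_row in zip(image, mask)
              for px, inside in zip(img_row, mask_row)
              if inside]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)

# Toy 3x3 EU image and nuclear mask (not real data).
eu_image = [[10, 12, 0],
            [14, 16, 1],
            [0, 2, 0]]
nucleus_mask = [[True, True, False],
                [True, True, False],
                [False, False, False]]
print(mean_nuclear_intensity(eu_image, nucleus_mask))  # 13.0
```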

Laser micro-irradiation

Cells were grown on glass coverslips and treated with 10 µM BrdU for 24 h to sensitize DNA to laser-induced damage. To generate localized DNA damage, selected nuclear regions were micro-irradiated with 10 iterations of a 405 nm laser at 100% power using a confocal spinning disk microscope (Olympus) built on an IX83 inverted microscope and equipped with an IXON 897 Ultra camera (Andor) and a FRAP module with a 405 nm laser. Images were acquired every 388 ms for a total duration of 5 minutes. The signal intensity at DNA damaged sites was measured using the ImageJ software with a custom script from the IFOM imaging facility. For silencing experiments, cells were seeded on glass coverslips and reverse-transfected with siRNAs directed to the indicated targets for 24 h, and subsequently transfected with the mCherry/EGFP plasmid of interest for an additional 24 h prior to BrdU treatment.
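
For orientation, the acquisition settings above translate into a fixed number of time points per cell, and recruitment traces are commonly expressed relative to the pre-damage signal. The sketch below computes both. Normalizing to the first frame is an assumption made for illustration, since the ImageJ script is not described, and the function names are hypothetical.

```python
def n_intervals(total_s, interval_ms):
    """Number of complete acquisition intervals that fit in the total duration."""
    return int(total_s * 1000) // interval_ms

def relative_recruitment(trace, n_baseline=1):
    """Express a stripe-intensity trace as fold change over the mean of
    the first n_baseline frames (assumed to precede the damage)."""
    baseline = sum(trace[:n_baseline]) / n_baseline
    return [value / baseline for value in trace]

print(n_intervals(5 * 60, 388))                     # 773
print(relative_recruitment([100.0, 150.0, 200.0]))  # [1.0, 1.5, 2.0]
```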

Alt-R crRNA design and transfection

Alt-R crRNAs were designed with the Cas9 crRNA Design tool (Integrated DNA Technologies). On-target and off-target scores were determined using the IDT specificity score and the CRISPOR tool72. The tracrRNA:crRNA duplex was generated and transfected according to the manufacturer’s protocol. In brief, iCut-RPE-1 hTERT cells were treated with doxycycline (Sigma, 1 mM) and SHIELD-1 (Aobious, 1 μM) for 17 h in order to activate Cas9. The tracrRNA:crRNA duplex (10 nM final concentration, from a 1 µM stock) was transfected using Lipofectamine RNAiMAX (Invitrogen by Life Technologies, Monza, Italy) for 8 h.
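
The duplex dilution follows the usual relation C1 × V1 = C2 × V2. As a small worked example, the helper below returns the stock volume needed for a given final volume. The 2 ml transfection volume is a hypothetical value chosen for illustration and is not stated in the protocol.

```python
def stock_volume_ul(final_conc_nM, stock_conc_nM, final_volume_ul):
    """Volume of stock (in microliters) needed to reach final_conc_nM
    in final_volume_ul, from C1 * V1 = C2 * V2."""
    if final_conc_nM > stock_conc_nM:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc_nM * final_volume_ul / stock_conc_nM

# 10 nM final from a 1 µM (1000 nM) stock; 2000 µl is an assumed volume.
print(stock_volume_ul(10, 1000, 2000))  # 20.0
```

This corresponds to a 1:100 dilution of the stock regardless of the final volume chosen.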

Cell cycle synchronization

Asynchronous U2OS-MycER cells were seeded and treated with 1 µg/mL doxycycline and 400 nM OHT for 24 h and then treated with Nocodazole (50 ng/ml) for 18 h. Cells were then detached by mitotic shake off, washed with PBS, seeded and collected at the indicated time points. Time 0 represents cells in G2/M, collected at the end of the Nocodazole treatment.

Monitoring MS2 transcription in U2OS-TRE-I-Sce-I reporter cell line

U2OS-TRE-I-Sce-I reporter cells were seeded on glass coverslips and transfected with the mCherry-tTA-ER plasmid, the pYFP-MS2 plasmid and the pCMV-NLS-I-SceI plasmid (to induce DSBs). The mCherry-tTA-ER chimera was activated by 1 µM OHT for two hours.

For silencing experiments, cells were first seeded on glass coverslips and reverse transfected with siRNAs. After 24 h, cells were transfected with the mCherry-tTA-ER, pYFP-MS2 and pCMV-NLS-I-SceI plasmids. 24 h post-plasmid transfection, mCherry-tTA-ER was activated by 1 µM OHT for 2 hours. Cells were then fixed with 4% PFA and processed for γH2AX staining and imaging acquisition.

Cell cycle analysis by FACS

For cell cycle analysis using BrdU incorporation, asynchronously growing cells were pulse-labeled with BrdU (100 μM) for 30 min prior to harvesting. After fixation with cold 70% EtOH, cells were treated with 1 ml of 2 N HCl for 20 min to denature labeled DNA. HCl was then neutralized by addition of 3 ml of sodium borate (0.1 M, pH 8.5) for 2 min. Cells were then incubated with anti-BrdU antibody (BD Biosciences). Following incubation with a fluorescently labeled secondary antibody, cells were resuspended in PBS with propidium iodide (25 μg/ml) and RNase A (250 μg/ml). Cells were stored at 4 °C until FACS analysis.

Immunofluorescence analysis

Cells were grown on coverslips, fixed with 4% paraformaldehyde (PFA) for 15 min and permeabilized with 0.2% Triton-X100 for 10 min before blocking with 2% BSA in PBS for 30 min at RT. Cells were incubated for 1 h at RT with the primary antibody in 2% BSA/PBS, washed three times with PBS and incubated for 1 h at RT with the secondary antibody in 2% BSA in PBS. DAPI staining was performed for 5 min before mounting. Slides were imaged using a wide-field microscope (Leica DM6 B Multifluo) with a 60X oil objective.

Proximity Ligation Assay (PLA) and PLA at replication forks

For PLA, we used the Duolink® PLA Fluorescence Protocol Kit (SIGMA) according to the manufacturer’s instructions.

For PLA at nascent DNA73, cells were pulse-labeled with EdU (10 µM) for 10 min and permeabilized with 0.5% Triton in PBS for 10 min at 4 °C. After washing with PBS, cells were fixed with 3% formaldehyde/2% sucrose in PBS for 10 min and blocked in 3% BSA in PBS for 30 min. Then, the Click-IT reaction with biotin-azide was performed for 30 min using the following reagents: biotin-azide 10 µM (Invitrogen B10184), sodium ascorbate 10 mM (Sigma Aldrich A4034), CuSO4 10 mM (Sigma Aldrich C8027) in PBS. Primary antibody incubation was performed in PBS with 1% BSA and 0.1% saponin overnight at 4 °C. The following day, samples were processed according to the manufacturer’s protocol starting from the “PLUS and MINUS probes” incubation step.

RNA-seq

RNA was extracted and purified using the Quick-RNA™ Miniprep Kit (Zymo Research, cat. #R1055) according to the manufacturer’s instructions. Libraries for RNA-seq were prepared with the TruSeq RNA Sample Preparation Kit v2 (Illumina) following the manufacturer’s instructions.

ChIP-seq

Cells were fixed with 1% formaldehyde for 10 min and quenched with 0.125 M glycine for 5 min at room temperature. Cells were washed twice with PBS and resuspended in LB1 buffer (50 mM HEPES pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% Glycerol, 0.5% NP-40, 0.25% Triton X-100) for 10 min on ice. After centrifugation, nuclei were extracted by resuspending cells at room temperature for 10 min in LB2 buffer (10 mM Tris-HCl pH 8, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA). The extracted nuclei were finally resuspended in LB3 buffer (10 mM Tris-HCl pH 8, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% Na-Deoxycholate, 0.5% N-lauroylsarcosine) and sonicated in order to obtain DNA fragments of 100–300 bp. The lysate from 50 × 10^6 cells was incubated with 10 μg of RNA PolIIS2p antibody (ChromoTek, E310) previously bound to protein G Dynabeads (Invitrogen) in PBS + 0.5% bovine serum albumin. After the incubation with the antibody, beads were collected using the DynaMag magnet, washed six times with 1 ml of RIPA buffer (50 mM HEPES pH 7.5, 500 mM LiCl, 1 mM EDTA, 1% NP-40, 0.7% Na-Deoxycholate) and once with 1 ml of TE 1X with 50 mM NaCl. De-crosslinking was performed overnight at 65 °C with 150 μl of TE 1X and 2% SDS. DNA was purified with PCR Qiaquick columns (Qiagen) and quantified using PicoGreen (Invitrogen) or QUBIT (Invitrogen).

EdU-HU-seq

U2OS-MycER cells were cultured with 1 µg/mL doxycycline, with or without 300 nM OHT, for 40 h. Cells were then treated with 100 ng/ml nocodazole (Sigma, Cat. No. SML-1665) for 8 h to induce mitotic arrest. G2/M arrested cells were isolated by shake-off, washed with PBS and released in warm medium containing 25 μM EdU (Invitrogen, Cat. No. A10044) and 2 mM hydroxyurea (Sigma, Cat. No. H8627-5G). After 12 or 18 h, cells were collected and fixed in 90% methanol. Next, cells were permeabilized with 0.2% Triton-X in PBS and EdU was coupled to a cleavable biotin-azide linker (Azide-PEG(3 + 3)-S-S-biotin) (kind gift from Thanos Halazonetis) using the Click-IT chemistry. DNA was then purified by phenol-chloroform extraction and ethanol precipitation and subjected to EdU-labeled DNA isolation according to the published protocol74. The eluted DNA was directly used for library preparation using the HT-Chip-seq library preparation75. Samples were spiked-in with 0.03% of NIH3T3 cells’ EdU-labeled DNA. 50 base pair paired-end read sequencing reactions were then performed on an Illumina Nova-Seq 6000. An aliquot of cells was saved for flow cytometry analysis in order to assess the percentage of cells that entered S-phase in each condition. In this case, after fixation, cells were permeabilized with 0.2% Triton-X in PBS, EdU was coupled to a fluor biotin-azide (Alexa Fluor 647 azide, Thermo Fisher Scientific, cat. no. A10277) and genomic DNA was stained with propidium iodide (Biotium, Cat. No. 40017) in combination with RNAseA (Roche, Cat. No. 10109169001) according to the published protocol74.

EU-seq

U2OS-MycER cells were cultured with 1 µg/mL doxycycline, with or without 300 nM OHT, for 40 h. Cells were then treated with 100 ng/ml nocodazole (Sigma, Cat. No. SML-1665) for 8 h to induce mitotic arrest. G2/M arrested cells were isolated by shake-off, washed with PBS, released in warm medium for 12 h and then collected and fixed in 90% methanol. 0.5 mM EU (5-ethynyl-uridine, Jena Biosciences, Cat. No. CLK-N002-10) was added to the cells 2 h before collection. RNA was then extracted and purified using the Quick-RNA™ Miniprep Kit (Zymo Research, cat. #R1055) according to the manufacturer’s instructions. Nascent RNA was biotinylated and purified starting from 50 μg of total RNA. EU was coupled to a cleavable biotin-azide linker (Azide-PEG(3 + 3)-S-S-biotin) (kind gift from Thanos Halazonetis) using the Click-IT chemistry as follows: 50 μg of RNA (40 μg of human EU-labeled RNA and 10 μg of murine EU-labeled RNA) were mixed with the same volume of 2X Click-IT reaction cocktail (171 mM Tris-HCl pH 8, 8 mM CuSO4, 200 mM sodium-L-ascorbate, 0.1 mM biotin-azide) and incubated for 30 minutes at RT with gentle shaking. RNA was precipitated and the EU-labeled RNA was then isolated using 50 μl of Dynabeads MyOne streptavidin C1 (Invitrogen, Cat. No. 65001) per sample. Before use, the beads were washed twice with Binding and Washing Buffer 1X (5 mM Tris-HCl pH 7.5, 0.5 mM EDTA, 1 M NaCl, 0.5% Tween-20), followed by two washes with solution A (0.1 M NaOH, 0.05 M NaCl) and two washes with solution B (0.1 M NaCl). After washing, the beads were resuspended to twice the original volume with Binding and Washing Buffer 2X. RNase OUT (1:100, Invitrogen, Cat. No. 10777-019) was added to the RNA, and samples were then heated at 70 °C for 5 minutes and placed back on ice to remove secondary structures. Then, EU-labeled RNA was mixed with an equal volume of washed beads (100 µL) and incubated for 30 minutes on a rotating wheel at room temperature. The RNA-bead complex was then washed three times with Binding and Washing Buffer 1X and once with RNase-free water. Finally, the EU-labeled RNA was eluted by incubating the streptavidin beads with 55 μl of 2% β-mercaptoethanol (Calbiochem, Cat. No. 444203) in 10 mM Tris-HCl pH 8 for 1 h at room temperature. 5 μl of each sample were used for Qubit quantification (RNA HS assay kit; Thermofisher, Cat. No. Q32852). Sequencing libraries were prepared with TruSeq Stranded Total RNA Gold (Illumina, Cat. No. 20020598).
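
Because the RNA is mixed with an equal volume of 2X reaction cocktail, each component ends up at half its listed concentration in the reaction. The short sketch below makes that arithmetic explicit, reading the biotin-azide concentration as 0.1 mM; the dictionary and function names are illustrative only.

```python
# Concentrations of the 2X cocktail as listed in the protocol (mM).
cocktail_2x_mM = {
    "Tris-HCl pH 8": 171,
    "CuSO4": 8,
    "sodium-L-ascorbate": 200,
    "biotin-azide": 0.1,
}

def final_concentrations(concs_mM, sample_volume, cocktail_volume):
    """Concentrations after mixing the cocktail with a sample that
    contains none of the listed components (any consistent volume unit)."""
    factor = cocktail_volume / (sample_volume + cocktail_volume)
    return {name: conc * factor for name, conc in concs_mM.items()}

# Equal volumes of RNA and 2X cocktail give the 1X working concentrations.
final_mM = final_concentrations(cocktail_2x_mM, 1, 1)
for name, conc in final_mM.items():
    print(f"{name}: {conc:g} mM")
```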

BLISS

For BLISS experiments, U2OS-MycER cells were seeded in medium with 1 µg/mL doxycycline, with or without 300 nM OHT, for 24 h. Cells were then treated with 50 ng/ml nocodazole (Sigma, Cat. No. SML-1665) for 18 h to induce mitotic arrest. G2/M arrested cells were isolated by shake-off, washed with PBS, released in warm medium for 18 h and fixed according to the BLISS protocol59. In brief, cells were harvested to obtain a single-cell suspension and then resuspended in pre-warmed PBS/10% FCS at 10^6 cells/ml (8–10 × 10^6 cells for each condition). Fixation was performed by adding 2% PFA (Thermofisher, Cat. No. 28908) for 10 min at room temperature, followed by 125 mM glycine for 5 minutes with gentle rotation and then 5 minutes on ice. Cell pellets were washed twice with ice-cold PBS and processed as described59. Briefly, cell pellets were permeabilized in buffer consisting of 10 mM Tris-HCl/150 mM NaCl/1 mM EDTA/0.3% SDS pH 8, and DSBs were blunted with the Quick Blunting Kit (New England Biolabs). sBLISS adapters were ligated to the blunted DSB ends with 25 U of T4 DNA Ligase (ThermoFisher Scientific) at 16 °C for 20–24 hours in a 100 μL ligation reaction mix containing 3 μL of 50 mg/mL BSA (ThermoFisher Scientific). gDNA was extracted by proteinase K digestion followed by phenol-chloroform purification. DNA was sonicated to obtain fragments with sizes ranging from 300 to 800 bp, with a peak around 400–600 bp. 90–300 ng of sonicated gDNA was in vitro transcribed with the MEGAscript T7 Transcription Kit (Thermo Fisher Scientific). The resulting RNA was first ligated to the Illumina RA3 adapter by T4 RNA Ligase 2 (New England Biolabs) for 2 hours at 25 °C, and then reverse transcribed using the Illumina RTP primer and the SuperScript IV Reverse Transcriptase for 50 min. Libraries were amplified with NEBNext Ultra II Q5 Master Mix (New England Biolabs).

siRNA based screen

Automated high-throughput siRNA transfection

The murine siGENOME SMARTpool druggable library was obtained from Dharmacon (Thermo Fisher Scientific) in 384-well format. Each well in the library contains a pool of four distinct siRNA oligos targeting different sequences of the target transcript. Five library plates were analyzed, targeting a total of 1400 druggable genes, plus a positive control for monitoring transfection efficiency (SMARTpool siGENOME Plk1 siRNA), an assay-specific positive control (SMARTpool siGENOME Rad21 siRNA), and an siRNA targeting the Renilla Luciferase gene (siGENOME Rluc siRNA) as negative control for data normalization.

The esiRNA custom libraries were purchased from Sigma (MISSION® esiRNA, Sigma) in 384-well format. esiRNAs are endoribonuclease-prepared siRNA pools composed of a mixture of siRNAs that all target the same gene. Control esiRNAs targeting mouse Plk1, Rad21 and Rluc were purchased in individual tubes and included in the final 384-well plates.

For validation, Silencer pre-designed siRNAs were obtained from Ambion (Silencer®, Ambion) in a 384-well plate format. Three different individual Silencer siRNAs were chosen for each gene. A Silencer siRNA against murine Rad21 (#150458) was used as positive control.

For screening, 800 Bz1 R26-MycER immortalized MEF cells/well were reverse-transfected in 384-well black, gelatin-coated optical plates (Corning) using the Freedom EVO automated liquid handler system (Tecan) with 0.13 µL/well of DharmaFECT1 transfection reagent (Dharmacon, Thermo Fisher Scientific) and 25 nM Dharmacon SMARTpool or Ambion Silencer siRNA or 50 nM of Sigma esiRNA. After transfection, cells were cultured for 48 hours under standard conditions (37 °C in humidified atmosphere, with 5% CO2), in the SteriStore automated incubator (HighRes Biosolutions). A total of 6 replicate plates (3 ethanol- and 3 OHT-treated) were transfected for each of the library plates.

Fixing and immunofluorescence staining

Forty-eight hours post siRNA transfection, cells were fixed by adding 40 µl of 4% Paraformaldehyde (PFA) directly to the medium to reach a final concentration of 2%. The plates were then incubated at room temperature for 15 minutes. After that, the medium was removed, and the plates were washed twice with PBS. Cells were permeabilized by adding 20 µl of 0.1% Triton X-100 (in PBS) to each well. After 10 minutes, cells were washed with PBS and blocked with 20 µl of blocking buffer (PBS + 2% BSA + 10% goat serum) for 30 minutes. Cells were then rinsed twice with PBS and incubated with 25 µl of anti-phospho-Histone H2AX (Ser139) antibody (Merck Millipore, Cat#05-636) diluted 1:1500 in 2% BSA PBS, for 1 hour at room temperature. Afterwards, cells were washed two times with PBS and then incubated with 25 µl of secondary antibody dilution (1:400 Alexa Fluor® 488 goat anti-mouse antibody, Invitrogen Cat#A11001) and DAPI (1:4500 in PBS) for 1 hour. Following 3 washes with PBS, 70 µl of PBS was left in the plates and the plates were sealed and stored at 4 °C until imaged.

Image acquisition and processing

Cells were imaged on the Olympus ScanR wide-field microscope. γH2AX-Alexa 488 fluorescence was acquired using BP470-495 excitation filter and BP510-550 emission filter (U-MNIBA3; Olympus) while DAPI was acquired by BP360-370 excitation filter and BP420-460 emission filter (U-MNIBA3; Olympus). Image acquisition was done with a 10X objective, and six different fields of view were recorded for each well, thus covering most of the area of each well of the 384-well plate.

The acquired images were analyzed using a custom image analysis algorithm (by Adrian Andronache, available upon request), developed and executed in the Acapella software development/run-time environment (Perkin Elmer). The algorithm used some Perkin Elmer proprietary procedures, such as the nuclei detection on the DAPI channel. The image analysis procedure was run on every single field of view acquired on the two channels, and the results were finally merged for each individual well by summing or averaging single parameters. In brief:

Step 1: Background removal and uneven illumination correction

Background removal and uneven illumination correction were performed using a sliding parabola transform, which divides the input image into two parts: background and foreground signals. The curvature of the parabola was set to 1/500 pixels.

Step 2: Nuclei segmentation and local background definition

First, using Perkin Elmer (PE) proprietary algorithms, we identified all the individual nuclei from the DAPI channel. Then we defined the local background of every nucleus using Voronoi tessellation, which divides the image space by defining region boundaries equidistant from the neighboring given centers (i.e. nuclei). A maximum border distance of 50 pixels away from the nuclei was imposed. Once the nuclei were segmented and their Voronoi regions were estimated, we defined individual foreground and background masks for every nucleus; these masks were further used to extract phenotype descriptors for every cell.
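The Voronoi assignment with a distance cap can be illustrated with a minimal sketch. This is not the Acapella implementation: it assumes nuclei are represented by their centroid coordinates (a simplification, since the original cap is measured from the nuclei themselves), and the function name `voronoi_labels` is hypothetical.

```python
import math

def voronoi_labels(shape, centers, max_dist=50.0):
    """Label every pixel with the index of its nearest nucleus center.

    Assumption: each nucleus is reduced to a (row, col) centroid, and pixels
    farther than max_dist from the nearest centroid get the label -1.
    """
    rows, cols = shape
    if not centers:
        return [[-1] * cols for _ in range(rows)]
    labels = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dists = [math.hypot(r - cr, c - cc) for cr, cc in centers]
            nearest = min(range(len(centers)), key=dists.__getitem__)
            row.append(nearest if dists[nearest] <= max_dist else -1)
        labels.append(row)
    return labels

# Two nuclei on a 1 x 200 pixel strip (illustrative coordinates only).
labels = voronoi_labels((1, 200), [(0, 10), (0, 150)])
```

In such a scheme, the local background mask of a nucleus would be its Voronoi region minus the segmented nuclear (foreground) pixels.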

Step 3: Preliminary features extraction

For all the detected objects (nuclei having both a foreground and an associated local background), a series of basic parameters were directly extracted or derived from the different images to describe the morphology and phenotype of individual cells. First, from the nuclei segmentation we estimated the radius, the area, and the shape of every single nucleus. Second, the nuclear DAPI intensity was defined as the difference between the average intensities of the foreground and local background masks of each individual nucleus. The DNA content was defined as the product of the DAPI intensity and the nuclear area. Third, from the Alexa 488 channel we estimated all the γH2AX-related features as described in the following paragraph.

Step 4: γH2AX foci and pan-nuclei definition

The γH2AX foci were defined using a local adaptive thresholding approach. Using the previously defined nuclear masks, we calculated the average Alexa 488 intensities of every nucleus in the population of negative controls (siRluc). This resulted in the threshold that differentiates the background from the γH2AX signal. This threshold was then used to segment the γH2AX foci from the Alexa 488 channel on every single nucleus (within the foreground masks) in all the screened wells. A local adaptation of the threshold was performed for every single nucleus, by correcting for the Alexa 488 background level extracted from the individual local background mask. All nuclei with γH2AX foci covering more than 70% of their surface were declared (labeled) as “pan-nuclear γH2AX”.
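A minimal sketch of the pan-nuclear call is given here for illustration. It assumes that the local correction is additive (the global threshold is shifted by the local background level of each nucleus), which the text does not specify; the function names are hypothetical.

```python
def gh2ax_coverage(nucleus_pixels, global_threshold, local_background):
    """Fraction of nuclear pixels whose Alexa 488 intensity exceeds the threshold.

    Assumption: the per-nucleus adaptation adds the local background level
    to the global threshold derived from the negative controls.
    """
    threshold = global_threshold + local_background
    above = sum(1 for value in nucleus_pixels if value > threshold)
    return above / len(nucleus_pixels)

def is_pan_nuclear(coverage, cutoff=0.70):
    """A nucleus is pan-nuclear when foci cover more than 70% of its surface."""
    return coverage > cutoff

# Illustrative intensities: 8 of 10 pixels are above the adapted threshold.
coverage = gh2ax_coverage([100] * 8 + [10] * 2, global_threshold=50, local_background=5)
```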

Step 5: Phenotype descriptive parameters

For every single well, the information extracted from all 6 fields of view was finally integrated and the following main phenotype description parameters were reported:

N°_Of_Nuclei: the total number of nuclei from all 6 fields of view remaining after the filtering steps described previously;

N°_Of_PanPosNuclei: the total number of pan-nuclear γH2AX nuclei from all 6 fields of view;

Percent_Of_PanPos: the percent of pan-nuclear γH2AX nuclei with respect to the total number of nuclei from all 6 fields of view;

Dapi_Intensity: the average DAPI intensity of all detected nuclei;

DNAContent: the average DNA content of all detected nuclei;

NuclearArea: the average nuclear area of all detected nuclei;

γH2AX_Intensity: the average Alexa488 intensity of all detected nuclei;

γH2AX_AreaCoverage: the average γH2AX foci area coverage of all detected nuclei.

Each well of an assay plate was associated with a phenotype descriptive vector (an array of the 8 main parameters listed above), which was finally tabulated and associated with the biological and technical annotation according to the plate-layout design and siRNA-database definition.
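The per-well integration can be sketched as follows. This is an illustration only: it assumes averages are taken over the pooled nuclei of all fields (rather than over per-field means), uses ASCII key names in place of the original parameter names, and covers only the parameters that do not require image data.

```python
def summarize_well(fields):
    """Pool the nuclei of all fields of view of one well into summary parameters.

    fields: list (one entry per field of view) of lists of nuclei, each nucleus
    a dict with 'pan' (bool), 'dapi' (background-corrected intensity), 'area'.
    """
    nuclei = [nucleus for field in fields for nucleus in field]
    n = len(nuclei)
    if n == 0:
        return {"N_Of_Nuclei": 0, "N_Of_PanPosNuclei": 0}
    n_pan = sum(1 for nucleus in nuclei if nucleus["pan"])
    return {
        "N_Of_Nuclei": n,
        "N_Of_PanPosNuclei": n_pan,
        "Percent_Of_PanPos": 100.0 * n_pan / n,
        "Dapi_Intensity": sum(x["dapi"] for x in nuclei) / n,
        # DNA content of a nucleus = DAPI intensity x nuclear area
        "DNAContent": sum(x["dapi"] * x["area"] for x in nuclei) / n,
        "NuclearArea": sum(x["area"] for x in nuclei) / n,
    }

# Two fields of view with two nuclei each (made-up values).
well = summarize_well([
    [{"pan": True, "dapi": 2.0, "area": 10.0}, {"pan": False, "dapi": 4.0, "area": 20.0}],
    [{"pan": False, "dapi": 6.0, "area": 30.0}, {"pan": False, "dapi": 4.0, "area": 20.0}],
])
```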

Statistical data processing and analysis

The statistical data processing and analysis were performed independently on two of the main phenotype descriptor parameters extracted by the image analysis: the viability (i.e. the N°_Of_Nuclei) and the DNA damage response (DDR) (the percentage of pan-nuclear γH2AX nuclei: Percent_Of_PanPos). The data for each screened siRNA were available as triplicate sample values for each of the two chemical treatments, ethanol (EtOH) and OHT. Each assay plate had a series of control wells (negative control, positive control for transfection and assay-specific positive control) and multiple experimental samples. From here on, the data of a single well denotes the specific parameter value of any well, independent of the class of transfection reagent (negative or positive controls or experimental samples). The statistical data processing and analysis was organized in the following 5 main steps: intra-plate normalization of raw data (on each individual assay plate); intra-treatment data dimensionality reduction; inter-treatment data dimensionality reduction; hit metric estimation; hit calling.

The intra-plate normalization of raw data ensures the removal of inter-plate variability. This operation was done on every assay plate with respect to the median of all the negative controls present on the analyzed plate (siRluc). The normalization procedure was performed only on the viability parameter, as the percent of pan-nuclear γH2AX nuclei is internally normalized to the number of nuclei of a given well. All the samples presenting a very low viability (i.e. fewer than 100 nuclei within a well) were excluded from the downstream statistical data processing and analysis of both the viability and the percent of pan-nuclear γH2AX parameters.

The intra-treatment data dimensionality reduction merges the replicate (triplicate) values of the wells on the assay plates. This operation takes the median of the replicate transfections in order to assign a single data value to each well (siRNA) within each experimental condition (EtOH and OHT treatments). The inter-treatment data dimensionality reduction is the operation of combining and comparing the parameter values of single wells between the different experimental conditions. As such, the viability-ratio was estimated for all the wells by dividing the normalized viability in OHT-treated samples by that in EtOH-treated samples. At the same time, the DDR-gain was estimated as the difference between the percent of pan-nuclear γH2AX nuclei in OHT-treated and EtOH-treated samples.

Both the viability-ratio and the DDR-gain underwent the classical Z-score hit metric estimation. In the primary screens the entire population of experimental samples was used as reference, under the assumption that most of the siRNAs have a null effect. In the deconvolution/validation screens the population of negative samples present on the analyzed plate was used as reference. Viability-Hits and DDR-Hits were called based on Z-score thresholds defined using negative and positive controls as a reference.
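The chain of operations described above can be sketched in a few lines. This is an illustration under stated assumptions, not the original analysis code: normalization is assumed to be a division by the median of the negative controls, the Z-score is assumed to use the mean and sample standard deviation of the reference population, and all function names are hypothetical.

```python
from statistics import mean, median, stdev

MIN_NUCLEI = 100  # wells with fewer nuclei are excluded

def normalize_viability(counts, neg_control_counts):
    """Intra-plate normalization: divide by the median of the siRluc wells (assumed)."""
    reference = median(neg_control_counts)
    return [count / reference for count in counts]

def collapse_replicates(values):
    """Intra-treatment reduction: one value per siRNA and treatment."""
    return median(values)

def viability_ratio(oht, etoh):
    """Inter-treatment reduction for viability (OHT relative to EtOH)."""
    return oht / etoh

def ddr_gain(oht_percent, etoh_percent):
    """Inter-treatment reduction for DDR (difference in % pan-nuclear nuclei)."""
    return oht_percent - etoh_percent

def z_scores(values, reference):
    """Classical Z-score against a reference population (sample SD assumed)."""
    mu, sd = mean(reference), stdev(reference)
    return [(value - mu) / sd for value in values]

# Made-up numbers: DDR-gain of three siRNAs scored against themselves.
gains = [ddr_gain(12.5, 2.5), ddr_gain(6.0, 1.0), ddr_gain(1.0, 1.0)]
scores = z_scores(gains, reference=gains)
```

In a primary screen the `reference` would be all experimental samples, whereas in a validation screen it would be the negative-control wells of the same plate.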

Bioinformatic analyses

RNA-seq analysis

Paired-end reads from Illumina HiSeq 2000 were aligned using tophat -r 170 -p 8 --no-novel-juncs --no-novel-indels --library-type fr-unstranded --transcriptome-index transcriptome_index reffile fastq_file_R1 fastq_file_R2, with the hg19 genome assembly as reference. Duplicated reads were removed using samtools rmdup -sS bamfile. The number of counts for each gene was computed using featureCounts -T 2 -p -P -a <gtffile>. Differentially expressed genes (DEGs) between experimental conditions (OHT, siCDK12, OHT+siCDK12) and untreated samples were identified using the DESeq2 R Bioconductor package76. All genes with an adjusted p-value < 0.01 and a log2FC > 0.5 or < −0.5 were defined as DEG up or DEG down, respectively.
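The DEG definition amounts to a simple filter on the DESeq2 results table, which can be sketched as follows. The function name and the handling of missing adjusted p-values (treated as not significant) are assumptions made for illustration.

```python
import math

def classify_deg(log2fc, padj, padj_cutoff=0.01, lfc_cutoff=0.5):
    """Return 'DEG up', 'DEG down' or 'not DEG' for one gene.

    Assumption: a missing adjusted p-value (None or NaN) is not significant.
    """
    if padj is None or math.isnan(padj) or padj >= padj_cutoff:
        return "not DEG"
    if log2fc > lfc_cutoff:
        return "DEG up"
    if log2fc < -lfc_cutoff:
        return "DEG down"
    return "not DEG"

# Made-up (log2FC, adjusted p-value) pairs.
calls = [classify_deg(lfc, p) for lfc, p in [(1.2, 0.001), (-0.8, 0.005), (0.3, 0.001)]]
```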

ChIP-seq analysis

Paired-end reads from Illumina HiSeq 2000 were aligned using the BWA v.0.6.2 tool77 with default settings, using hg19 genome assembly. Peaks were called using MACS v2 1.478, using a p-value threshold of 10^−5.

BLISS analysis and identification of BLISS positive regions

Raw sequencing data were demultiplexed on Illumina’s BaseSpace to generate FASTQ files, which were processed using the sBLISS pipeline previously described in detail59. The pipeline first identifies reads containing the expected sBLISS prefix consisting of an 8 nt UMI followed by an 8 nt sample barcode, using SAMtools (v1.10)79 and Scan for Matches80, allowing at most one mismatch in the barcode. Prefixes are clipped off and stored, while the trimmed reads are aligned to the human reference genome (GRCh37/hg19) using BWA-MEM22 (v0.7.17-r1188) with default options77. Reads with a mapping quality score lower than 30 are removed, and PCR duplicates are removed by searching for adjacently mapped reads (at most 10 bp apart along the reference genome) with at most one mismatch in their UMI sequence. The output of the pipeline is a list of unique DSB locations and the corresponding number of unique UMIs identified at each position.
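The duplicate-removal rule can be illustrated with a small greedy sketch. It is not the sBLISS pipeline code: reads are reduced to hypothetical (chromosome, position, UMI) tuples, strand is ignored for brevity, and the first read encountered in sorted order is kept as the representative of each duplicate group.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))

def remove_pcr_duplicates(reads, max_gap=10, max_mismatch=1):
    """Keep one read per group of nearby reads carrying near-identical UMIs.

    reads: iterable of (chrom, pos, umi). A read is a duplicate if an already
    kept read lies at most max_gap bp upstream on the same chromosome and its
    UMI differs by at most max_mismatch bases.
    """
    kept = []
    for chrom, pos, umi in sorted(reads):
        duplicate = False
        for k_chrom, k_pos, k_umi in reversed(kept):
            if k_chrom != chrom or pos - k_pos > max_gap:
                break  # kept reads are sorted, so earlier ones are even farther away
            if hamming(umi, k_umi) <= max_mismatch:
                duplicate = True
                break
        if not duplicate:
            kept.append((chrom, pos, umi))
    return kept

# Made-up reads: the second one is a near-copy of the first.
unique = remove_pcr_duplicates([
    ("chr1", 100, "AAAAAAAA"),
    ("chr1", 105, "AAAAAAAT"),
    ("chr1", 105, "CCCCCCCC"),
    ("chr1", 200, "AAAAAAAA"),
])
```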

We devised a custom R script to identify genomic regions enriched for BLISS signals (BLISS+ regions). Briefly, the hg19 genome was divided into 2 kb bins. For each bin, we calculated the coverage (number of BLISS UMIs in the bin) using the GRbaseCoverage2 and the makeMatrixFrombaseCoverage functions from the ChroKit framework81. Then, based on the distribution of the BLISS signal coverage across bins, we set an arbitrary threshold on the BLISS signal (threshold of the bin coverage = 2^4.6) above which bins were called as BLISS+. BLISS+ bins were then collapsed into BLISS+ regions by merging BLISS+ bins that were separated by a maximum gap of 2 kb. For genomic analyses we used the union of the BLISS+ regions identified in two replicates for each condition (shCDK12+OHT, +OHT alone).
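The binning, thresholding and merging logic can be sketched as follows. This Python sketch is only an illustration of the procedure and does not reproduce the custom R script or the ChroKit functions; it assumes 0-based positions, a strict "greater than" comparison with the threshold, and a hypothetical function name.

```python
from collections import Counter

def call_bliss_regions(dsb_counts, bin_size=2000, threshold=2 ** 4.6, max_gap=2000):
    """Call BLISS+ regions from (chrom, pos, n_umis) records.

    Steps: sum UMIs per fixed-size bin, keep bins above the threshold, then
    merge kept bins on the same chromosome separated by at most max_gap bp.
    """
    coverage = Counter()
    for chrom, pos, n_umis in dsb_counts:
        coverage[(chrom, pos // bin_size)] += n_umis
    positive_bins = sorted(key for key, value in coverage.items() if value > threshold)
    regions = []
    for chrom, index in positive_bins:
        start, end = index * bin_size, (index + 1) * bin_size
        if regions and regions[-1][0] == chrom and start - regions[-1][2] <= max_gap:
            regions[-1] = (chrom, regions[-1][1], end)  # extend the previous region
        else:
            regions.append((chrom, start, end))
    return regions

# Made-up DSB counts; 2^4.6 is roughly 24.3 UMIs per bin.
regions = call_bliss_regions([("chr1", 100, 30), ("chr1", 4500, 25), ("chr1", 12000, 40)])
```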

EdU-seq-HU analysis

Paired-end reads from NovaSeq 6000 were aligned using bwa mem software (hg19 genome assembly and UCSC annotation). Murine spike-in reads (genomic DNA from EdU-labeled NIH3T3 cells) were aligned the same way, but using the mm10 genome assembly and ENSEMBL annotation. Bam files were normalized by their spike-in using the following procedure: (i) For each sample, a coefficient (C1) was calculated: C1_sample = (1,000,000 / number of reads) × ratio, where “ratio” is the ratio of ng of spike-in gDNA to ng of sample gDNA. (ii) C1 coefficients were normalized to the C1 coefficient of highest value, thus obtaining the normalized coefficient C2 (C2_sample = C1_sample / max(C1)). (iii) C2 was used to downsample the reads and obtain the spike-in normalized bam files using the samtools program79: samtools view -b bam -s C2_sample > downsampled_bam.
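Steps (i) and (ii) can be written out as a short sketch. The function name and the input layout (sample name mapped to read count and spike-in ratio) are assumptions made for illustration; the formulas are those given in the text.

```python
def spikein_coefficients(samples):
    """Compute the downsampling coefficient C2 for each sample.

    samples: dict mapping sample name -> (number_of_reads, ratio), where ratio
    is ng of spike-in gDNA divided by ng of sample gDNA.
    """
    # (i) C1 = (1,000,000 / number of reads) x ratio
    c1 = {name: (1_000_000 / n_reads) * ratio for name, (n_reads, ratio) in samples.items()}
    # (ii) C2 = C1 / max(C1), so the sample with the highest C1 gets C2 = 1
    highest = max(c1.values())
    return {name: value / highest for name, value in c1.items()}

# Made-up read counts with the same spike-in ratio in both samples.
c2 = spikein_coefficients({"control": (2_000_000, 0.1), "treated": (4_000_000, 0.1)})
```

Each C2 value is then the fraction of reads retained for that sample in step (iii).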

The bam files of the replicates for each experimental condition were then merged together (samtools merge) and bigWig files were generated using the bamCoverage function of deeptools v. 3.5.1. EdU-seq-HU peaks were called using the epic2 peak caller82. Briefly, paired-end bed files (.bedpe) were obtained from the spike-in normalized bam files (after step iii) using bedtools:

bedtools bamtobed -i reads.bam -bedpe > reads.bedpe.

Then, the following command produced the final bed files:

epic2 -t reads.bedpe -gn hg19 --gaps-allowed 10 --bin-size 400 --output EdU_regions.bed

Heatmaps and figures were generated using ChroKit framework81.

EU-seq analysis

Paired-end reads from NovaSeq 6000 were aligned using STAR with the following command: STAR --runThreadN 8 --runMode alignReads --outSAMtype BAM SortedByCoordinate --genomeDir <index> --readFilesIn reads_R1.fq reads_R2.fq. The number of counts for each gene was computed using featureCounts -T 2 -p -P -a gtffile. Differentially expressed genes between experimental conditions (OHT, siCDK12, OHT+siCDK12) and untreated samples were identified using the DESeq2 R Bioconductor package. To evaluate expression data, we calculated TPMs for each gene. To identify TSSs associated with ERRs and DSBs, we took BLISS+ regions identified in U2OS cells upon activation of MycER and CDK12 silencing, and selected those overlapping with the left boundary of an ERR. We then extracted BLISS+ regions having a TSS mapping on the minus strand. We repeated this procedure selecting the right boundaries of the ERRs and considering TSSs mapping on the plus strand of the BLISS+ region. We then derived the list of gene symbols of all the TSSs associated with these regions.
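For reference, the standard TPM calculation can be sketched as below. The text does not state which gene length definition was used, so the lengths here are an assumption supplied by the caller, and the function name is hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from raw counts and gene lengths in bp.

    Standard definition: length-normalize the counts (reads per kilobase),
    then scale so that the values of all genes sum to one million.
    """
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}

# Made-up counts: gene B has twice the reads but is twice as long as gene A.
values = tpm({"A": 10, "B": 20}, {"A": 1000, "B": 2000})
```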

Additional information concerning antibodies, plasmids, primers, cell lines, siRNA/shRNA sequences, software and other reagents can be found in Supplementary Data 4.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.