Fig. 1: Induction of dOCRs in HSV-1 infection is associated with downstream transcriptional activity.

a Number of genes with a dOCR length greater than the value on the x-axis in mock and HSV-1 WT strain 17 infection, with or without PAA treatment (combined data of 2 biological replicates). To avoid defining a threshold above which a dOCR length counts as dOCR induction, we visualized dOCR lengths in each condition for all 4162 analyzed genes without read-in transcription in HSV-1 infection (excluding those with a dOCR length of 0). This shows whether the number of genes with longer dOCRs is generally increased in the respective condition. The y-axis was limited to 500 to highlight differences in the number of genes with long dOCRs between mock and HSV-1 infection.
b Hierarchical clustering analysis (Euclidean distances, Ward’s clustering criterion) of log10(dOCR length) for all analyzed genes (i.e., 4162 genes without read-in transcription in HSV-1 infection) in the samples shown in (a). To define clusters, the cutoff on the clustering dendrogram was chosen such that the three groups of genes visually identified in the heatmap as showing dOCR induction fell into separate clusters. Identified clusters are numbered from top to bottom as indicated and marked by colored rectangles. Shades of red indicate clusters with dOCR induction and shades of blue indicate clusters without dOCR induction.
c, d Boxplots showing the distribution of read-through transcription (c) and downstream FPKM (d) for the 9 clusters from (b) (n = 609, 290, 851, 176, 305, 701, 367, 289, and 574 genes for clusters 1–9, respectively). Bounds of boxes are the first and third quartiles for each condition. The center (median) is shown by the horizontal line in the box. Whiskers extend to 1.5 times the interquartile range. Outliers are shown as small circles, and minimum and maximum values are the lowest and highest circles, respectively. Read-through values and downstream FPKM were calculated as described in Methods from previously published 4sU-seq data (average of n = 2 biological replicates)3. Read-through for mock infection was defined as zero and is thus not shown.
Source data are provided as a Source Data file.
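
The quantities plotted in panels a and b can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the toy dOCR lengths, and the pseudocount added before taking log10 are assumptions made for illustration only. The sketch also cuts the dendrogram at a fixed number of clusters, whereas the cutoff described for panel b was chosen by visual inspection of the heatmap.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def genes_above_length(docr_lengths, thresholds):
    """Panel a: for each threshold x, count genes with dOCR length > x.

    Genes with a dOCR length of 0 are dropped first, as in the caption.
    """
    lengths = np.asarray(docr_lengths, dtype=float)
    lengths = np.sort(lengths[lengths > 0])
    # searchsorted(side="right") returns how many sorted values are <= x,
    # so subtracting from the total leaves the number of values > x.
    return lengths.size - np.searchsorted(lengths, thresholds, side="right")


def ward_clusters(docr_matrix, n_clusters, pseudocount=1.0):
    """Panel b: Ward clustering of log10(dOCR length), genes x samples.

    The pseudocount is an assumption that keeps log10 finite for genes
    with dOCR length 0; the caption does not say how zeros were handled.
    """
    log_lengths = np.log10(np.asarray(docr_matrix, dtype=float) + pseudocount)
    tree = linkage(log_lengths, method="ward", metric="euclidean")
    # Assumption: cut the tree into a fixed number of clusters rather
    # than at a visually chosen height.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical dOCR lengths (bp) for seven genes in one condition.
toy_lengths = [0, 500, 1200, 3000, 8000, 0, 15000]
counts = genes_above_length(toy_lengths, [0, 1000, 5000, 10000])
print(counts.tolist())  # [5, 4, 2, 1]

# Hypothetical genes x samples matrix; columns stand for mock, mock + PAA,
# WT, and WT + PAA. The first three rows mimic dOCR induction in WT.
toy_matrix = [
    [100, 100, 10000, 200],
    [120, 90, 12000, 150],
    [80, 110, 9000, 180],
    [100, 100, 110, 100],
    [90, 120, 100, 95],
    [110, 95, 105, 120],
]
labels = ward_clusters(toy_matrix, n_clusters=2)
print(labels)
```

On the toy matrix, the three genes with long dOCRs in the WT column end up in one cluster and the remaining genes in the other. Which numeric label each cluster receives is arbitrary.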