Fig. 5: Landscape of L1 locus activity across tumor types.

Histograms of per-sample mean a L1 RNA expression and b L1 RT count (based on identified transductions) for all 1483 L1HS and L1PA2 loci. Each sample's contribution to the mean expression and mean RT count is weighted by the inverse of the number of samples with the same tumor type, so that each tumor type contributes equally to the locus mean. c Heatmap of mean L1 RNA expression (log2 TPM) of each locus within a given cluster (rows) across tumor types (columns). d Heatmap of mean log2 L1 RT count (based on identified transductions) of each locus within a given cluster (rows) across tumor types (columns). The 22q12.1 locus in cervical cancer had an exceptionally high RT count; to keep the variation across the rest of the heatmap visible, the color scale was capped at a maximum value of 0.4 and this square was marked with an asterisk to indicate its outlier value. c, d All 1483 L1HS and L1PA2 loci were clustered based on similar expression and RT count profiles across tumor types, resulting in 13 clusters. Each cluster is named after its locus with the highest mean RNA expression, and clusters are sorted from highest (top) to lowest (bottom) mean RT value. To generate each heatmap value, a mean for each locus within each tumor type is first calculated, and then the mean of these means across all loci within a cluster is determined. Columns are sorted left to right from highest to lowest total L1 RNA expression (summed across all 1483 loci) per sample. To the left of each heatmap, the row colors annotate each cluster categorically based on RNA and RT. The left column (dark, medium, and light blue) indicates the distribution of RNA expression of each cluster across tumor types, with mean expression ≥0.1 TPM in >15 tumor types for “High”, in 5–15 tumor types for “Medium” and in fewer than 5 tumor types for “Low”. The mean RT count of all loci within the cluster was categorized into “high” (dark green), “low” (medium green), and “least” (light green) based on the histogram in b.
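
The averaging steps described in this legend can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input layout (one list of per-sample values per locus, with a parallel list of tumor type labels), and the normalization of the weighted mean by the sum of weights are all assumptions made for illustration. Only the inverse-sample-count weighting, the mean of means, and the 0.1 TPM and 5/15 tumor type cutoffs are taken from the legend.

```python
from collections import Counter


def type_balanced_mean(values, tumor_types):
    """Weighted mean over samples for one locus (panels a, b).

    Each sample is weighted by 1 / (number of samples sharing its tumor type),
    so every tumor type contributes equally. Dividing by the sum of weights
    is an assumption; the legend only specifies the weights.
    """
    counts = Counter(tumor_types)
    weights = [1.0 / counts[t] for t in tumor_types]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def cluster_heatmap_value(per_locus_values, tumor_types, tumor_type):
    """Mean of means for one heatmap cell (panels c, d).

    per_locus_values: one list of per-sample values for each locus in the cluster.
    First average each locus within the given tumor type, then average those
    locus means across the cluster.
    """
    locus_means = []
    for values in per_locus_values:
        in_type = [v for v, t in zip(values, tumor_types) if t == tumor_type]
        locus_means.append(sum(in_type) / len(in_type))
    return sum(locus_means) / len(locus_means)


def rna_category(mean_tpm_by_type, threshold=0.1):
    """Label a cluster by the number of tumor types with mean expression >= threshold TPM."""
    n_types = sum(1 for v in mean_tpm_by_type.values() if v >= threshold)
    if n_types > 15:
        return "High"
    if n_types >= 5:
        return "Medium"
    return "Low"
```

With three samples of one tumor type at a value of 1 and a single sample of another type at 5, `type_balanced_mean` gives the average of the two per-type means rather than the plain per-sample mean, which is the effect the weighting is meant to achieve.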