Fig. 1
From: Long ncRNA A-ROD activates its target gene DKK1 at its release from chromatin

Clustering and chromatin-association of long ncRNAs. a Schematic representation of the pipeline followed to define the final set of long ncRNAs used in this study, and the k-means clustering parameters. b K-means clustering of 4467 long ncRNAs. Clustering parameters are histone mark signal values (ChIP-seq peaks from H3K4me3, H3K4me1, and H3K27ac in MCF-7), expression (GRO-seq (ref. 35) and chromatin-associated RNA-seq log(RPKM)), and ChIA-PET maximum interaction scores. All values were rescaled to the range 0 to 1 using min–max normalization. c H3K4me1/H3K4me3 ratio for relevant clusters; the y-axis is in log scale. d Comparison of ChIA-PET interaction scores for all clusters. The clusters with the strongest interactions (clusters 4, 6, 7, 13, 15; ChIA-PET score median = 1 and mean > 0.9) are marked. e Comparison of the long ncRNA chromatin-association character for all clusters. The ratio of chromatin-associated RNA-seq RPKM to nuclear polyA+ RPKM is plotted in log2 scale. Clusters with significantly lower chromatin-association are marked in red (Wilcoxon-Mann-Whitney test p-value < 2.2e-16). f Principal component analysis (PCA) biplot showing the multivariate variation of chromatin-enriched (log2 chromatin-association > 0, red) and nucleoplasmic-enriched (log2 chromatin-association < 0, blue) long ncRNAs in terms of five variables: ChIA-PET, GRO-seq, H3K4me3, H3K4me1, and H3K27ac. Points represent projections of the original values in a new space defined by a new set of variables, the principal components PC1 and PC2, which better explain the patterns in the data. The contribution of the initial variables to the variance within each PC is shown in the barplot underneath. g Boxplot displaying the distribution of the DESeq2-derived log2 fold changes in expression using total chromatin-associated vs. nucleoplasmic RNA-seq, for 1154 significantly chromatin-enriched long ncRNAs (at DESeq2 p-adjusted < 0.1; Supplementary Data 2), 968 significantly nucleoplasmic-enriched long ncRNAs (Supplementary Data 3), and 1807 long ncRNAs not enriched in either of the two fractions (non-significant, ‘NS’). h Boxplot displaying the distribution of ChIA-PET interaction scores for the three sets of long ncRNAs defined in (g). Nucleoplasmic-enriched long ncRNAs show on average significantly higher ChIA-PET interaction scores (Wilcoxon-Mann-Whitney p-value = 9.462e-16 compared with CHR-enriched and p-value < 2.2e-16 compared with NS). i Boxplot displaying the distribution of DESeq2-generated log2 fold changes CHR/NP for long ncRNAs engaged in strong ChIA-PET interactions (clusters 4, 6, 7, 13, 15; n = 664) and long ncRNAs not engaged in ChIA-PET (‘rest’; clusters 3, 5, 12, 14; n = 1402). Wilcoxon-Mann-Whitney test p-value < 2.2e-16. j Boxplot displaying the distribution of expression (GRO-seq (ref. 35) reads per kb, RPK) of long ncRNAs either engaged or not in strong ChIA-PET interactions, defined as in (i). Wilcoxon-Mann-Whitney test p-value < 2.2e-16. k Boxplot displaying the distribution of expression (GRO-seq RPK) of the significantly chromatin-enriched (‘CHR enriched’, n = 1154) and nucleoplasmic-enriched (‘NP enriched’, n = 968) long ncRNAs.
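
Two of the transformations named in the caption, the min–max rescaling of clustering features (panel b) and the log2 chromatin-association ratio (panels e and f), can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the handling of constant features, and the `pseudocount` parameter are assumptions introduced here for illustration only.

```python
import math


def min_max_rescale(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log2_chromatin_association(chromatin_rpkm, nuclear_rpkm, pseudocount=1e-3):
    """log2 of chromatin-associated RPKM over nuclear polyA+ RPKM.

    The pseudocount is a hypothetical guard against zero RPKM values;
    the caption does not say how zeros were handled.
    """
    return math.log2((chromatin_rpkm + pseudocount) / (nuclear_rpkm + pseudocount))


def classify(log2_ratio):
    """Label a long ncRNA by the sign of its log2 ratio, as in panel f."""
    if log2_ratio > 0:
        return "chromatin-enriched"
    if log2_ratio < 0:
        return "nucleoplasmic-enriched"
    return "neither"
```

For example, `min_max_rescale([2.0, 4.0, 6.0])` maps the smallest value to 0 and the largest to 1, and a transcript with four times more chromatin-associated than nuclear polyA+ signal gets a positive log2 ratio and is labelled chromatin-enriched. Note that the sets in panels g to k are defined by DESeq2 significance, not by this sign rule.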