Fig. 7: An example of a Hi-C interaction between the ncRNA CASC8 and the protein-coding gene MYC.

The long non-coding RNA CASC8 is a breast tissue-expressed lncRNA that is significantly mutated in breast cancer samples. Our analysis of HMEC-related Hi-C data shows that this lncRNA interacts significantly with the promoters of multiple coding genes, including MYC, a known breast cancer-associated gene. Numerous strong signals of ENCODE-predicted ChromHMM potent enhancers and the active histone marks H3K27ac and H3K4me1 (all present in HMEC) overlap with CASC8. Notably, a FANTOM5 enhancer that is differentially expressed in breast also overlaps with CASC8. Our analysis of GWAS SNPs and de novo somatic point mutations revealed that CASC8 overlaps multiple breast cancer GWAS SNPs and many somatic point mutations found in breast cancer samples. In contrast, MYC overlaps neither GWAS SNPs nor breast cancer-related somatic mutations. The figure also shows that the MYC gene interacts with PVT1, another lncRNA that is significantly mutated in breast cancer samples (30 kb from MYC). PVT1 also overlaps breast tissue-related regulatory features and is a previously known enhancer of the MYC gene. We used MHiC (ref. 55) to analyze raw Hi-C data and MaxHiC (ref. 56) to identify significant Hi-C interactions. Significant interactions are shown in blue. ENCODE-predicted ChromHMM annotations and chromatin signals were used in peak format.