Fig. 5

Test method and correlation analysis of mutations in significant non-coding elements with gene expression. a–f Overview of the expression correlation test, exemplified by GATA3 and the set of significant TFBS peak elements (TFPs). a Elements are associated with genes using the nearest TSS. b Raw expression levels (log2 RSEM) are obtained for 7382 samples across 22 cancer types, and mutated samples are identified. c Expression levels are z-score normalized within each cancer type and d combined. e The p-value of the mutated samples in the distribution of the combined z-score-ranked set is found using a rank-sum test. f p-values of significant elements and their associated genes are shown in a qq-plot with GATA3 highlighted. The red line indicates expected p-values under the null hypothesis of no expression correlation. The combined p-value of the correlation between mutations and expression levels across the set of candidate regions is found using Fisher’s method. Cancer-type abbreviations: LUAD lung adenocarcinoma, BRCA breast cancer, BLCA bladder cancer, CESC cervical squamous cell carcinoma. g Gene-expression correlation for all mutations (both SNVs and INDELs) in significant TFBS sets. Rank-sum test p-values of individual genes are shown as a qq-plot. Combined significance across all genes is found using Fisher’s method and shown in the upper left corner (similarly for h and i). h Expression correlation for CTCF TFBSs mutated once (black) or twice (green). The p-values were combined separately for the sets of TFBSs mutated once and twice. i Expression correlation for RAD21 TFBSs mutated more than five times. j Examples of mutated TFBSs and their associated gene-expression distributions in individual cancer types (exemplified genes emphasized in h, i). Expression levels of mutated samples are shown (red circles). The expression correlation significance within each individual cancer type is given below the plot.
Cancer-type abbreviations: LIHC liver hepatocarcinoma, BRCA breast cancer, ACC adrenocortical carcinoma
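
The test outlined in panels c–f can be sketched as follows. This is an illustrative sketch only, not the authors' implementation. All function and variable names are hypothetical. It assumes a population standard deviation for the z-scores, a two-sided rank-sum test using the normal approximation (without tie correction of the variance), and the closed-form chi-square tail for Fisher's method. None of these choices is specified in the caption.

```python
import math
from statistics import mean, pstdev


def zscore_within_type(expr_by_type):
    """Panels c-d: z-score log2 expression within each cancer type, then pool.

    expr_by_type: {cancer_type: {sample_id: log2_expression}}
    Returns {sample_id: z_score} over all cancer types combined.
    """
    combined = {}
    for samples in expr_by_type.values():
        values = list(samples.values())
        mu, sd = mean(values), pstdev(values)  # assumption: population SD
        for sample_id, value in samples.items():
            combined[sample_id] = (value - mu) / sd if sd > 0 else 0.0
    return combined


def rank_sum_pvalue(z, mutated):
    """Panel e: two-sided rank-sum p-value (normal approximation, assumed)."""
    ordered = sorted(z, key=z.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied values so they share an average rank.
        while j + 1 < len(ordered) and z[ordered[j + 1]] == z[ordered[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = (i + j) / 2 + 1
        i = j + 1
    n1 = sum(1 for s in z if s in mutated)
    n2 = len(z) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need both mutated and non-mutated samples")
    w = sum(ranks[s] for s in z if s in mutated)  # rank sum of mutated samples
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(w - mu) / sigma / math.sqrt(2))


def fisher_combine(pvalues):
    """Panel f: Fisher's method, X = -2 * sum(ln p) ~ chi-square with 2k dof."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # X / 2
    # For even degrees of freedom the chi-square tail has a closed form:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Toy data (invented for illustration): two cancer types, one gene.
    toy = {
        "TYPE_A": {"a1": 1.0, "a2": 2.0, "a3": 3.0, "a4": 4.0, "a5": 5.0},
        "TYPE_B": {"b1": 10.0, "b2": 20.0, "b3": 30.0, "b4": 40.0, "b5": 50.0},
    }
    z = zscore_within_type(toy)
    p_gene = rank_sum_pvalue(z, mutated={"a5", "b5"})
    print(p_gene, fisher_combine([p_gene, 0.2, 0.7]))
```

Normalizing within each cancer type before pooling removes between-type differences in baseline expression, so the rank of a mutated sample reflects its position relative to other samples of the same cancer type.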