Fig. 1: Study overview. | Nature Communications

From: Deep learning prioritizes cancer mutations that alter protein nucleocytoplasmic shuttling to drive tumorigenesis

A Somatic mutations from 11 cancer types in TCGA were analyzed to determine their potential effect on nucleocytoplasmic shuttling, which revealed a significant enrichment in targeting peptides. A deep learning model, pSAM, was then constructed to precisely predict the protein nuclear localization probability and to infer protein sequence determinants ab initio, without knowledge of known targeting peptides, by coupling deep learning with residue-level contribution analysis. The pSAM model and the somatic mutation dataset were subsequently used for downstream analyses.

B Mutation frequency distribution of the targeting peptide regions and other regions from 11 cancer types in TCGA. Proteins with validated NLS/NES regions were included (n = 564 proteins), and the NLS/NES regions were compared with the remaining regions. The data are presented as a box-and-whisker graph (bounds of box: first to third quartile, bottom and top line: minimum to maximum, central line: median).

C Deleteriousness score distribution of mutations located in targeting peptide regions and other regions from 11 cancer types in TCGA. The targeting peptide regions include validated NLS and NES regions.

D, E Mutations in the NLS regions of BACH2 affect the expression of its transcriptional substrates, (D) CLIP1 and (E) TXNRD1, in the TCGA-UCEC cohort (Mutated group: n = 4 samples; Non-Mutated group: n = 579 samples). The data are presented as a box-and-whisker graph (bounds of box: first to third quartile, bottom and top line: minimum to maximum, central line: median).

A two-sided Wilcoxon test was used for (B), a two-sided Chi-square test for (C), and a two-sided Student’s t test for (D) and (E). Source data are provided as a Source Data file.
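The box-and-whisker convention described for panels B, D and E (box from first to third quartile, whiskers from minimum to maximum, central line at the median) can be illustrated with a minimal Python sketch. The function name `box_summary`, the example values, and the choice of the "inclusive" quartile method are assumptions made for illustration; the caption does not state how the quartiles were computed, and this is not the authors' code.

```python
from statistics import quantiles


def box_summary(values):
    """Return the five numbers drawn in a min-to-max box-and-whisker plot.

    Hypothetical helper: the quartile method ("inclusive") is an assumption,
    since the caption does not specify one.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],    # bottom whisker
        "q1": q1,             # lower bound of the box
        "median": q2,         # central line
        "q3": q3,             # upper bound of the box
        "max": ordered[-1],   # top whisker
    }


# Made-up values standing in for per-protein mutation frequencies.
example_frequencies = [0.02, 0.05, 0.01, 0.08, 0.03]
print(box_summary(example_frequencies))
```

Because the whiskers extend to the minimum and maximum, no points are treated as outliers in this convention, unlike the common 1.5 × IQR whisker rule.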
