Figure 1

Pipeline for discovering novel prognosis-related genes with consistent copy number gain and up-regulation in pan-cancer, gene enrichment analysis of 889 genes with frequent copy number gains (CNGs), and mutational landscape of 95 genes with consistent CNGs and up-regulation. (A) Flowchart of the pipeline for finding novel prognosis-related genes whose copy number gains in CNVs are consistent with their corresponding gene expression. The work is divided into several steps: identifying short descriptions containing both cancer and prognosis keywords, [(prognosis OR prognostic) AND (cancer OR tumour OR carcinoma)], in the GeneRIF (Gene Reference Into Function) database; and manually curating the data from the published literature to extract the corresponding human gene names. (B) 2309 genes from different studies (each with a unique PubMed ID) were extracted from the literature database, and 2064 genes were identified as related to prognostic studies; a gene set of 1820 genes was associated with CNVs; a total of 1050 prognosis-related genes were identified with frequent CNGs based on the cut-off (Gain/Loss ratio > 2), and 277 genes were associated with copy number losses (CNLs; Loss/Gain ratio > 2); 889 genes were observed with frequent CNGs (number of TCGA samples with CNGs > 30); lastly, 95 genes were identified with consistent CNGs and over-expression in the same TCGA samples. (C) Gene enrichment analysis of 889 prognosis-related genes with concordant copy number gains (CNGs). The scatterplot presents the summarized GO terms of all 889 prognosis-related genes with CNGs. Circles represent GO clusters and are plotted in two-dimensional space according to their semantic similarity to other GO terms. The y-axis shows the similarity of the GO terms; the x-axis indicates the log of the corrected P-value (bubbles to the right have larger corrected P-values); circle colour is directly proportional to the frequency of the GO term in the Gene Ontology Annotation (GOA) database. (D) A general pan-cancer overview of copy number variation (CNV) in the 95 prognosis-related genes whose up-regulated expression is conceivably caused by copy number gains (CNGs). The y-axis shows the alteration frequency as a percentage (including both amplification and deletion); the x-axis indicates the cancer types. Blue, deletion; red, amplification.
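
The cut-offs described for panel (B) can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' code: the class `GeneCnvCounts`, the function names, and the gene names and counts are hypothetical toy values, and treating a gene with zero counts in the denominator as passing the ratio test is an assumption not stated in the caption.

```python
from dataclasses import dataclass

RATIO_CUTOFF = 2        # Gain/Loss (or Loss/Gain) ratio must exceed this
MIN_GAIN_SAMPLES = 30   # number of samples with CNGs must exceed this


@dataclass
class GeneCnvCounts:
    """Per-gene sample counts (hypothetical structure for illustration)."""
    gene: str
    n_gain: int   # samples with a copy number gain
    n_loss: int   # samples with a copy number loss


def classify(c: GeneCnvCounts) -> str:
    """Label a gene as CNG, CNL, or neither using the ratio cut-off.

    Ratios are compared by cross-multiplication so that a zero
    denominator does not raise an error (assumption: such genes pass).
    """
    if c.n_gain > RATIO_CUTOFF * c.n_loss:
        return "CNG"
    if c.n_loss > RATIO_CUTOFF * c.n_gain:
        return "CNL"
    return "neither"


def is_frequent_cng(c: GeneCnvCounts) -> bool:
    """True if the gene is a CNG gene seen in more than 30 samples."""
    return classify(c) == "CNG" and c.n_gain > MIN_GAIN_SAMPLES


# Toy counts, invented for illustration only (not data from the study).
toy_genes = [
    GeneCnvCounts("GENE_A", n_gain=120, n_loss=10),
    GeneCnvCounts("GENE_B", n_gain=25, n_loss=5),
    GeneCnvCounts("GENE_C", n_gain=8, n_loss=40),
    GeneCnvCounts("GENE_D", n_gain=30, n_loss=20),
]

labels = {g.gene: classify(g) for g in toy_genes}
frequent = [g.gene for g in toy_genes if is_frequent_cng(g)]
```

The final step in panel (B), requiring over-expression in the same samples that carry the gain, would need matched per-sample expression data and is not shown here.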