Abstract
Clear cell renal cell carcinoma (ccRCC) is the most prevalent type of renal cell carcinoma. However, our understanding of ccRCC risk genes remains limited. This gap in knowledge poses challenges to the effective diagnosis and treatment of ccRCC. To address this problem, we propose a deep reinforcement learning-based computational approach named RL-GenRisk to identify ccRCC risk genes. Distinct from traditional supervised models, RL-GenRisk frames the identification of ccRCC risk genes as a Markov Decision Process, combining the graph convolutional network and Deep Q-Network for risk gene identification. Moreover, a well-designed data-driven reward is proposed for mitigating the limitation of scant known risk genes. The evaluation demonstrates that RL-GenRisk outperforms existing methods in ccRCC risk gene identification. Additionally, RL-GenRisk identifies eight potential ccRCC risk genes. We successfully validated epidermal growth factor receptor (EGFR) and piccolo presynaptic cytomatrix protein (PCLO), corroborated through independent datasets and biological experimentation. This approach may also be used for other diseases in the future.
Introduction
Renal cell carcinoma (RCC), one of the most common cancers worldwide, is a type of kidney cancer that initiates in the lining of the proximal convoluted tubule1,2. Clear cell renal cell carcinoma (ccRCC) constitutes 80% of all RCC cases and is particularly aggressive due to its high immune infiltration3,4,5. In addition, over 30% of ccRCC patients suffer from metastasis, which is a significant factor leading to death in ccRCC patients6,7,8. Although several drugs have been utilized for the treatment of ccRCC, the efficacy is still limited due to the heterogeneity of ccRCC9,10,11. Therefore, it is necessary to understand the pathogenesis and identify risk genes of ccRCC, which may be beneficial for early diagnosis and treatment of ccRCC12,13,14.
Cancer is a complex genetic disorder. Its occurrence and progression are associated with the accumulation of driver genetic mutations that provide a selective growth advantage to cells15,16. Consequently, one class of methods to identify cancer risk genes is based on mutation data. In the past years, several cancer sequencing projects have generated mutation data from thousands of cancer patients, enhancing the identification of cancer risk genes15,17. Traditional statistical approaches focus on genes with a higher mutation frequency in the patient cohort than the control cohort18. Youn et al.19 employed the functional impact of mutations on proteins, variations in background mutation rates among tumors, and the redundancy of the genetic code in tumor genome sequencing data to identify key genes in non-small cell lung cancer. Methods like MuSiC20, OncodriveCLUST21, and MutSigCV22 identified cancer risk genes by comparing the observed gene mutations with the predefined background mutation frequency. By now, frequency-based methods have identified many cancer risk genes and enhanced cancer diagnosis and therapy23. However, the genetic foundations of cancer are highly diverse. Except for genes mutated across a large number of patients, some key genes in tumor initiation and progression are observed to be mutated in only a few patients24. For example, PIK3CA, which has been validated to be a ccRCC risk gene by previous studies25,26,27, is mutated in no more than 5% of ccRCC patients27. It is difficult for purely frequency-based approaches to identify these genes with low mutation frequency but high risk.
To address the drawback of frequency-based methods, the interactions among proteins are introduced for cancer risk gene identification, since genes involved in the same signaling and regulatory pathways as well as protein complexes may interact to exert their effects together. Muffinn28 identified cancer risk genes through network propagation, taking into account mutations not only in individual genes but also in their neighbors within the protein-protein interaction (PPI) network. DiSCaGe29 calculated a gene mutation score using an asymmetric spreading strength based on the type of mutations and the PPI network, then produced a ranking of prioritized cancer risk genes. HotNet230 used an insulated heat diffusion process to identify cancer risk genes by propagating heat through the PPI network. nCOP31 employed a heuristic search method to select connected subnetworks from the PPI network based on the mutation data of cancer patients, and then ranked cancer risk genes based on the frequencies of genes appearing in these subnetworks. All of the aforementioned methods are unsupervised and may therefore suffer from the highly diverse genetic foundations of cancer or the noise in the PPI network32. Recently, several supervised methods have emerged as potentially valuable tools for predicting cancer risk genes23,33,34,35,36. For example, Agajanian et al.34 trained a random forest classifier to identify cancer risk genes based on the known cancer-driver mutations. DeepDriver35 used gene mutation types as features, constructed a K-nearest neighbor graph based on the Pearson correlation coefficient, and trained a convolutional neural network to identify cancer risk genes. Nevertheless, different from unsupervised methods28,29,31,37, supervised methods require a substantial amount of known high-confidence risk genes as labeled data for model training23. Unfortunately, the number of known high-confidence ccRCC risk genes is currently limited38,39. 
For example, there are only 44 ccRCC risk genes in the IntOGen database40. Owing to the reliance on labels, predicting ccRCC risk genes using supervised methods is challenging.
To overcome the limitations of existing methods, we propose a deep reinforcement learning-based approach for ccRCC risk gene identification, named RL-GenRisk (Reinforcement Learning-based GENe RISK). The reinforcement learning-based model leverages environmental interactions for optimization41, tackling the challenge of scant known risk genes. Specifically, RL-GenRisk models the PPI network as the environment and utilizes a graph convolutional network42 to learn state representations. It also incorporates the Deep Q-Network (DQN)43 to combine reinforcement learning with deep neural networks for ccRCC risk gene identification. Moreover, a data-driven reward is designed to facilitate a straightforward method for identifying ccRCC risk genes. By focusing on a sampled subgraph with node features, the data-driven reward effectively leverages information from both the PPI network and gene mutation data. This not only ensures the accurate identification of genes with high mutation frequencies but also enables RL-GenRisk to identify potential risk genes with low mutation frequencies that functionally interact with genes having high mutation frequencies. Extensive experiments demonstrate that RL-GenRisk outperforms the existing methods in the identification of ccRCC risk genes. Furthermore, several potential risk genes are revealed and validated in independent datasets. Specifically, we validated two top-rank genes EGFR and PCLO through statistical and biological experiments. Statistical analyses show significant upregulation of EGFR at both bulk and single-cell levels among ccRCC patients, with a significant association between overexpression of the protein encoded by EGFR and poor survival in ccRCC patients. The in vitro experimental results show that decreased EGFR expression promotes ccRCC cell apoptosis as well as suppresses colony formation and migration, and the use of the EGFR inhibitor erlotinib effectively augments apoptosis and inhibits migration. 
Moreover, knocking down PCLO expression in vitro significantly inhibited ccRCC progression. Additionally, the in vivo experimental results show that both the erlotinib and EGFR downregulation can significantly repress the growth of ccRCC tumors in mice.
Results
RL-GenRisk framework
We propose a deep reinforcement learning-based approach to identify ccRCC risk genes (Fig. 1), named RL-GenRisk. Fundamentally different from existing supervised deep learning-based methods, RL-GenRisk incorporates the reinforcement learning paradigm. Specifically, RL-GenRisk frames the ccRCC risk gene identification as a sequence decision-making process, formulated as a Markov Decision Process44. This enables RL-GenRisk to integrate reinforcement learning algorithms seamlessly, thereby effectively addressing the inherent challenge of scant known risk genes. RL-GenRisk takes PPI network and gene mutation data as input (Fig. 1A). The PPI network is represented as an undirected graph, with nodes representing genes and edges representing interactions between genes. Gene mutation data includes details on the presence of mutations in ccRCC patients for each gene. RL-GenRisk treats the PPI network with gene mutation information as the environment. The state includes a sampled subgraph with node features. The action is selecting a node directly connected to the sampled subgraph and adding it to this sampled subgraph. Thus, the risk gene identification is framed as a Markov Decision Process of node selection within the PPI network.
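The state and action definitions above can be illustrated with a minimal sketch. This is a toy environment under stated assumptions, not the RL-GenRisk implementation: the class name `SubgraphEnv`, the adjacency-dictionary input, and the `max_size` stopping rule are hypothetical choices for illustration, and the data-driven reward (described in the “Methods” section) is deliberately left out.

```python
import random


class SubgraphEnv:
    """Toy node-selection environment (hypothetical, for illustration only).

    State: the set of genes currently in the sampled subgraph.
    Action: add one gene that interacts with the sampled subgraph.
    The reward is intentionally omitted; it is not reconstructed here.
    """

    def __init__(self, adjacency, max_size):
        self.adjacency = adjacency  # dict: gene -> set of interacting genes
        self.max_size = max_size    # assumed stopping criterion
        self.subgraph = set()

    def reset(self, start=None, rng=random):
        # Training starts from one randomly selected node unless one is given.
        if start is None:
            start = rng.choice(sorted(self.adjacency))
        self.subgraph = {start}
        return frozenset(self.subgraph)

    def action_space(self):
        # With an empty subgraph, every gene is a candidate action.
        if not self.subgraph:
            return set(self.adjacency)
        frontier = set()
        for node in self.subgraph:
            frontier |= self.adjacency[node]
        return frontier - self.subgraph

    def step(self, node):
        if node not in self.action_space():
            raise ValueError(f"{node!r} is not adjacent to the sampled subgraph")
        self.subgraph.add(node)
        done = len(self.subgraph) >= self.max_size or not self.action_space()
        return frozenset(self.subgraph), done
```

For example, on a four-gene toy network, resetting at `"VHL"` exposes only the direct neighbors of `VHL` as actions, and each `step` grows the sampled subgraph by one gene until `max_size` is reached.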
A Input data of RL-GenRisk, including a Protein-Protein Interaction (PPI) network and gene mutation data of ccRCC patients from The Cancer Genome Atlas (TCGA). B Overview of the training and identification process. During the training process, RL-GenRisk starts by selecting a node randomly and appending this node to a sampled subgraph (indicated by the red dashed circle). Nodes colored orange are included in the sampled subgraph. Nodes interacting with the sampled subgraph are colored blue, indicating their inclusion in the action space. The policy comprises two graph convolutional layers and a node evaluation network. It takes the current state as input and selects an action based on Q values of nodes in the action space. Subsequently, the state is updated, and the reward is obtained. This process repeats until the sampled subgraph reaches its maximum size. Following training, RL-GenRisk computes Q values for all genes based on an empty subgraph, resulting in a ranking list of risk genes ordered by Q values. C Experiment validation on external data for identified ccRCC risk genes.
Central to RL-GenRisk is a policy that is represented by a neural network and interacts with the environment. The policy takes the current state as input and predicts Q values, which estimate the expected cumulative reward of each possible action. The policy of RL-GenRisk consists of two main components: a Graph Convolutional Network (GCN) for learning state representation and a node evaluation network for computing the Q value of each candidate action (Fig. 1B top). To enhance state representation, RL-GenRisk employs the GCN42, an inductive graph representation learning method, to capture node representations in the PPI network. Node initial features are derived from the PPI network’s topology information and ccRCC patients’ mutation information. The policy of RL-GenRisk is trained to select the optimal actions by maximizing the reward. In this study, we designed a data-driven reward that focused on the sampled subgraph, considering both information from the PPI network and gene mutation data. The DQN algorithm is employed to update the policy’s parameters. Throughout the training, we employed the ϵ-greedy strategy to choose actions based on Q values, thus boosting RL-GenRisk’s exploratory potential. In the identification phase (Fig. 1B bottom), RL-GenRisk starts with an empty subgraph, incorporates all nodes into the action space, and utilizes the trained policy to calculate Q values for each node. Subsequently, ccRCC risk genes are ranked by Q values, with higher values indicating greater risk. Further details about RL-GenRisk can be found in the “Methods” section.
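As a rough illustration of the two ways Q values are used here, the sketch below shows ϵ-greedy action selection during training and ranking by Q value during identification. The function names and the dictionary of Q values are assumptions made for this example; in RL-GenRisk the Q values come from the trained policy network, which is not reproduced here.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the highest-Q action.

    q_values: dict mapping each candidate gene to its predicted Q value.
    """
    actions = sorted(q_values)  # sorted for reproducible tie-breaking
    if rng.random() < epsilon:
        return rng.choice(actions)                    # explore
    return max(actions, key=lambda a: q_values[a])    # exploit


def rank_by_q(q_values):
    """Rank genes from highest to lowest Q value (identification phase)."""
    return sorted(q_values, key=lambda g: q_values[g], reverse=True)
```

With `epsilon=0` the selection is purely greedy, and with `epsilon=1` it is purely random; intermediate values trade off exploration against exploitation.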
RL-GenRisk shows superior performance for ccRCC risk gene prioritization over existing methods
To assess the performance of RL-GenRisk, we utilized RL-GenRisk and eight other existing methods to identify ccRCC genes with gene mutation data of ccRCC patients from The Cancer Genome Atlas (TCGA)45 and five different PPI networks. Six of these eight existing methods are specifically designed for cancer risk gene identification, including nCOP31, DiSCaGe29, Hierarchical HotNet46, HotNet230, Muffinn28, and MutSigCV22. The other two are SVM47 and Random Forest48, which are widely used supervised machine learning methods. The implementation details of SVM and Random Forest are provided in the Supplementary Materials. The PPI networks used in our study included HPRD49, STRING50, Multinet51, IRefIndex52, and HumanNet53. Known ccRCC risk genes were retrieved from the IntOGen40 cancer-specific database, the Network of Cancer Genes (NCG) database54, the Cancer Gene Census (CGC) database55, and a cancer risk gene set extracted from the recent and extensive study conducted by Bailey et al.56. In particular, the ccRCC risk genes included in the CGC and Bailey et al. datasets are limited, containing only nine and twelve genes respectively. Consequently, we evaluated different methods using three datasets: IntOGen, NCG, and a “Merged” dataset that combined data from IntOGen, NCG, Bailey et al. and CGC. The genes included in these datasets are provided in Supplementary Data 1. To assess the performance of the different methods, we used discounted cumulative gain (DCG) as one of the primary evaluation metrics, consistent with previous studies29,57. Specifically, the DCG scores were calculated based on the top 100 genes identified by each method. Moreover, the DCG curves of different methods on the IntOGen, NCG, and the Merged ccRCC risk gene dataset are shown in Fig. 2 and Supplementary Fig. 2. 
Following this, we calculated the normalized DCG (N-DCG), the area under the DCG curve (DCG-AUC), and the average precision (AP) to further evaluate these methods’ performance (as shown in Fig. 2D–F, and Supplementary Fig. 3). Evaluation results indicated that RL-GenRisk outperformed the eight other established methods in identifying ccRCC risk genes, achieving the highest DCG, N-DCG, DCG-AUC, and AP on all used datasets (as shown in Supplementary Fig. 1). As anticipated, MutSigCV showed the lowest performance among the six existing methods specifically designed for cancer risk gene identification due to its reliance solely on mutation data, lacking integration with biological network insights. Muffinn performed well with STRING but showed inferior performance with other PPI networks, highlighting its instability across different PPI networks (Supplementary Fig. 3). While nCOP and Hierarchical HotNet performed well across different datasets and showed more stability across various PPI networks, their performance was inferior to RL-GenRisk (Supplementary Fig. 1 and Supplementary Fig. 3). Additionally, SVM showed the lowest performance among the compared methods. Random Forest performed better than SVM and MutSigCV. However, its performance was inferior to RL-GenRisk. RL-GenRisk achieved the best performance on HPRD (as shown in Supplementary Fig. 1). Therefore, the subsequent analyses were grounded on the high-confidence risk genes (HRGs) identified by the best-performing method, RL-GenRisk with the HPRD network.
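For readers unfamiliar with these ranking metrics, the sketch below computes DCG, N-DCG, and AP for a ranked gene list against a set of known risk genes. It assumes binary relevance (a gene is either in the reference set or not) and the common log2 discount; the exact variants used in the evaluation may differ, for example in how AP is normalized, so this should be read as an illustration rather than the evaluation code.

```python
import math


def dcg_at_k(ranked_genes, known, k=100):
    """DCG with binary relevance: each known gene at rank i adds 1 / log2(i + 1)."""
    return sum(
        1.0 / math.log2(i + 1)
        for i, gene in enumerate(ranked_genes[:k], start=1)
        if gene in known
    )


def ndcg_at_k(ranked_genes, known, k=100):
    """DCG divided by the DCG of an ideal ranking that lists known genes first."""
    ideal_hits = min(k, len(known))
    ideal = sum(1.0 / math.log2(i + 1) for i in range(1, ideal_hits + 1))
    return dcg_at_k(ranked_genes, known, k) / ideal if ideal else 0.0


def average_precision(ranked_genes, known, k=100):
    """Mean of precision@i over the ranks i where a known gene appears.

    Assumption: normalized by the number of hits within the top k.
    """
    hits, total = 0, 0.0
    for i, gene in enumerate(ranked_genes[:k], start=1):
        if gene in known:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0
```

For a ranking `["VHL", "X", "PBRM1"]` with both `VHL` and `PBRM1` known, the hits at ranks 1 and 3 contribute 1/log2(2) + 1/log2(4) = 1.5 to the DCG.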
A–C Discounted Cumulative Gain (DCG) curves of the top 100 identified genes with RL-GenRisk and other compared methods on the HPRD network, evaluated on the IntOGen, NCG, and the Merged datasets. D–F Boxplot of Average Precision (AP) on different PPI networks across different methods, evaluated on the IntOGen, NCG, and the Merged datasets. Boxplots display the 25th, 50th (median), and 75th percentiles as box bounds, with whiskers extending to minima and maxima within 1.5 × interquartile range. Each box includes 5 data points, corresponding to the results on 5 different PPI networks. G, H Mutation frequency of the top 20 genes identified by RL-GenRisk (G before perturbation; H after perturbation). Each dot represents a gene. Red dots indicate genes included in the IntOGen database, and blue dots indicate genes not included in the IntOGen database. Genes enclosed by the orange box have a mutation frequency below 10%. Source data are provided as a Source Data file.
Biological network facilitates the identification of low-frequency mutated ccRCC risk genes
Although gene mutation frequency is crucial for assessing cancer associations, not all cancer-related genes have high mutation frequencies58. Therefore, relying solely on mutation data might overlook cancer genes with low mutation frequency but high risk. To evaluate the capability of RL-GenRisk in identifying ccRCC risk genes with low mutation frequency but high risk, we utilized the “maftools” package59 to analyze the mutation frequencies of the top 20 HRGs (Supplementary Table 1) identified by RL-GenRisk in ccRCC patients from TCGA (Supplementary Fig. 5A). Overall, 317 (77.13%) of the 411 ccRCC patients from TCGA exhibited somatic mutations in the top 20 HRGs. Missense mutations were predominant, followed by frameshift mutations, among the identified somatic mutation types. Notably, the mutation frequencies of four genes, VHL, PBRM1, SETD2, and BAP1, exceeded 10%. These genes are already recognized as ccRCC risk genes and included in the IntOGen database. Genes with high mutation frequencies are more likely to be detected by methods that solely rely on mutation data. Among the top 20 HRGs, however, seven recognized ccRCC risk genes in the IntOGen database had mutation frequencies below 5%. For example, only 5 of 411 ccRCC patients carry mutations in PIK3CA, a known ccRCC risk gene in the IntOGen database, indicating variable mutation frequencies among cancer genes. To assess the contribution of the biological network in RL-GenRisk for identifying low mutation frequency, high-risk ccRCC genes, we progressively increased the percentage of edges randomly swapped between node pairs from 25% up to 50%, 75%, and 100%. Perturbing the PPI network disrupts its internal information, with more extensive perturbation causing greater information loss. 
As the perturbation ratio increased, the rankings of known ccRCC genes that initially had high ranks and mutation frequencies below 5% gradually dropped (as shown in Supplementary Fig. 5B). We then compared the mutation frequencies of the top 20 HRGs identified by RL-GenRisk before and after PPI network perturbation and created dot plots for visualization (Fig. 2G, H). We found that after network perturbation, three known ccRCC risk genes, TP53, PIK3CA, and SPEN, previously identified in the top 20 with mutation frequencies below 5%, dropped out of the top 20. This indicates that incorporating PPI network knowledge improves the detection of low mutation frequency ccRCC genes with significant risk implications.
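One plausible way to realize the network perturbation described above is a degree-preserving double-edge swap, sketched below. This is an assumption: the text only states that edges were randomly swapped between node pairs, so the exact procedure may differ. The function name `perturb_edges` and its parameters are hypothetical, and because an edge can be picked more than once, `fraction` is an approximate upper bound on the share of rewired edges.

```python
import random


def perturb_edges(edges, fraction, seed=0, max_tries_per_swap=100):
    """Rewire roughly `fraction` of the edges by double-edge swaps.

    Each successful swap replaces edges (a, b) and (c, d) with (a, d) and
    (c, b), so every node keeps its degree. Assumes a simple undirected
    graph (no self-loops, no duplicate edges) with sortable node labels.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    n_swaps = int(fraction * len(edge_list) / 2)  # one swap rewires two edges
    for _ in range(n_swaps):
        for _ in range(max_tries_per_swap):
            i, j = rng.sample(range(len(edge_list)), 2)
            (a, b), (c, d) = edge_list[i], edge_list[j]
            if rng.random() < 0.5:  # pick one of the two possible rewirings
                c, d = d, c
            new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
            # Reject swaps that would create self-loops or duplicate edges.
            if len({a, b, c, d}) < 4 or new1 in edge_set or new2 in edge_set:
                continue
            edge_set -= {edge_list[i], edge_list[j]}
            edge_set |= {new1, new2}
            edge_list[i], edge_list[j] = new1, new2
            break
    return edge_list
```

Calling the function with `fraction` set to 0.25, 0.5, 0.75, and 1.0 would give a series of increasingly perturbed networks that retain each gene's degree but progressively lose the original interaction structure.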
Biological function analysis of high-confidence risk genes
To delve into the biological function of HRGs, we conducted pathway enrichment analysis using the top 20 HRGs and the WikiPathways database60, a comprehensive resource for pathway-based data analysis. Enrichment analysis was performed using g:Profiler61. Figure 3A illustrates the top 10 significantly enriched pathways (FDR p-value ≤ 1.21e-4). The results demonstrated significant enrichment in various cancer-related pathways, with “Clear cell renal cell carcinoma pathways” being the most prominently enriched (FDR p-value = 1.48e-8). Enrichment analysis against the Human Phenotype Ontology (HPO) database62, which provides a standardized vocabulary of phenotypic abnormalities associated with human diseases, highlighted “Renal neoplasm” and “Renal cell carcinoma” as the most significantly enriched phenotypes (FDR p-value < 4.05e-5, Supplementary Fig. 6). Furthermore, Gene Ontology (GO) enrichment analysis indicated that the top 20 HRGs were notably enriched in various cancer-related biological processes, including “cell adhesion” (FDR p-value = 2.69e-4), “regulation of cell population proliferation” (FDR p-value = 3.64e-4), “cell population proliferation” (FDR p-value = 6.85e-4), and “cell migration” (FDR p-value = 2.39e-3). These observations showed a comprehensive molecular landscape linked to HRGs within the scope of ccRCC, emphasizing the critical roles of these genes in ccRCC-related mechanisms.
A Top 10 enriched pathways of WikiPathways based on the top 20 high-confidence risk genes (HRGs). B The expression levels of EGFR in tumor (n = 523) and normal tissues (n = 72) of ccRCC patients from TCGA, quantified using Fragments Per Kilobase Million (FPKM), “***” indicates p-value < 0.001. The boxplot displays the 25th, 50th (median), and 75th percentiles as box bounds, with whiskers extending to minima and maxima within 1.5 × interquartile range. p-value = 2.07e-38, using the Wilcoxon signed-rank test. C, D Survival curves of EGFR. ccRCC patients from TCGA were divided into two groups based on the top 25% (n = 108) and others (n = 324) of expression levels of protein encoded by EGFR, using Progression-Free Survival (PFS) and Disease-Specific Survival (DSS) respectively. The red curve corresponds to the high expression group, while the blue curve corresponds to the low expression group. E UMAP plot of clustering of single-cell RNA-seq data from kidneys of ccRCC patients. DC dendritic cell, NK natural killer cell, NKT natural killer T cell, TAM tumor-associated macrophage, T-Reg regulatory T cell. F Gene expression of EGFR in different cells of ccRCC patients, with colors corresponding to the expression values. G Violin plot displaying the normalized gene expression of EGFR in different cells of ccRCC patients in the single-cell RNA-seq data. Source data are provided as a Source Data file.
In addition, to further explore whether other top-performing methods identified similar pathways albeit with different genes, we performed an analysis based on the pathways enriched by the top 20 genes identified by top-performing methods, including RL-GenRisk, nCOP, Hierarchical HotNet, HotNet2, DiSCaGe, Muffinn, and Random Forest. We first calculated the Jaccard similarity coefficients63 between the top 20 gene lists identified by these methods (pairwise comparisons among these 7 methods, totaling 21 pairs). The results showed that 76% (16 out of 21 pairs) of method pairs have a Jaccard similarity coefficient below 0.5 for their identified top 20 genes, indicating that the top 20 gene lists identified by these methods do not exhibit a high degree of overlap overall (Supplementary Fig. 22). After that, we analyzed the intersection of pathways enriched by the top 20 genes identified by top-performing methods. The result revealed that two pathways were significantly enriched (FDR p-value < 0.05) across these methods, including “Clear cell renal cell carcinoma pathway” and “Type 2 papillary renal cell carcinoma”. Notably, the top 20 genes identified by RL-GenRisk exhibited the most significant enrichment in the “Clear cell renal cell carcinoma pathway” among all these methods (as illustrated in Supplementary Fig. 7). The detailed results of enrichment analysis for different methods are provided in Supplementary Data 3.
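The pairwise overlap comparison above can be sketched as follows. The Jaccard coefficient of two gene lists is the size of their intersection divided by the size of their union; the function names and the dictionary layout (method name mapped to its top-ranked genes) are assumptions for this illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity coefficient of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(gene_lists):
    """Jaccard coefficient for every unordered pair of methods.

    gene_lists: dict mapping a method name to its list of top-ranked genes.
    """
    return {
        (m1, m2): jaccard(gene_lists[m1], gene_lists[m2])
        for m1, m2 in combinations(sorted(gene_lists), 2)
    }
```

Seven methods yield 7 × 6 / 2 = 21 unordered pairs, which matches the number of comparisons reported above.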
Differential expression analysis revealed significant differential gene expression of EGFR at both bulk and single-cell levels
Among the top 20 HRGs identified by RL-GenRisk, 8 HRGs are not included in these ccRCC risk gene databases. We then performed differential expression analysis on these 8 HRGs using RNA-seq data from ccRCC patients in TCGA. Differential expression analysis was performed using Limma64, and the significance threshold was FDR p-value < 0.05. Notably, EGFR and PCLO showed significant differential expression between tumor tissues and normal tissues in TCGA among these 8 HRGs (FDR p-value = 2.07e-38 for EGFR and FDR p-value = 1.04e-14 for PCLO, Fig. 3B and Supplementary Fig. 8). The differential expression results for these 8 HRGs are shown in Supplementary Fig. 8. EGFR was notably upregulated in ccRCC tumor tissues compared to normal tissues. EGFR is activated as a homodimer or heterodimer, thereby regulating multiple signaling pathways, including the RAS/RAF/MAPK, AKT, and JAK/STAT pathways, which play essential roles in cell migration, proliferation, and survival65,66,67. These pathways are intricately involved in driving cell proliferation and conferring resistance to apoptosis67. The overexpression of EGFR leads to an excess of receptors on the cell surface, fostering uncontrolled cell growth and division. This dysregulation can drive the transformation of normal cells into tumor cells, creating a favorable environment for sustained tumor cell survival68. Additionally, the protein encoded by PCLO is a component of the presynaptic cytoskeletal matrix, which is involved in establishing active synaptic zones and in synaptic vesicle trafficking69. Recent research has identified the expression level of PCLO as a prognostic biomarker for esophageal squamous cell carcinoma70, and a notable high mutation frequency (47.9%) of PCLO has been observed in large Central European cohorts with gastric cancer71. 
In addition, to further explore the transcriptional change of these 8 HRGs across tumor and normal tissues, we conducted differential expression analysis using ccRCC patient data from the Gene Expression Omnibus (GEO). We used two mRNA expression profiles of ccRCC patients from GSE4669972 and GSE3689573 for analysis. The GSE46699 contains data from 67 tumor tissues and 63 normal tissues. The GSE36895 contains data from 29 tumor tissues and 23 normal tissues. We found that among these 8 HRGs, 5 genes showed significantly differential expression in both GSE46699 and GSE36895 datasets (FDR p-value < 0.05, as shown in Supplementary Fig. 9). In particular, MUC4 shows high expression in tumor tissues in the GSE36895 dataset, but low expression in tumor tissues in the GSE46699 dataset. Existing studies reported that MUC4 expression is an independent prognostic factor for overall survival in ccRCC patients74, and MUC4 mutation is associated with an exophytic growth pattern of ccRCC75. Then, we compared the differential expression levels of the top 20 ccRCC risk genes identified by RL-GenRisk and other methods using the GSE46699 and GSE36895 datasets. The results indicated that the top 20 genes identified by RL-GenRisk included a higher number (14) of genes that were significantly differentially expressed in both GSE46699 and GSE36895 (FDR p-value < 0.05). Further details can be found in Supplementary Data 4.
To further investigate the expression patterns of the top 20 HRGs in tumor cells and normal cells of ccRCC patients, we performed an analysis utilizing single-cell RNA-seq data from ccRCC patients reported in a recent work76 (Supplementary Table 2). Uniform manifold approximation and projection (UMAP) was used to visualize the distribution of 31,856 cells from kidneys in these patients in a two-dimensional plane (Fig. 3E). We used UMAP to visualize the expression of the top 20 HRGs in different cells (Supplementary Fig. 10) and plotted the distribution of expression of these genes in different cell types (Supplementary Fig. 11). Subsequently, we assessed the expression levels of EGFR across different cell types within these patients (Fig. 3F and Supplementary Fig. 11). Our investigation showed a significant difference in EGFR expression between tumor cells and other cells (FDR p-value ≤ 1.21e-4, Wilcoxon signed-rank test, Supplementary Fig. 12), with higher expression levels in tumor cells (Fig. 3G). Therefore, the significant upregulation of EGFR at both bulk and single-cell levels in ccRCC patients suggested its potential as a biomarker for ccRCC.
Expression level of protein encoded by the EGFR is significantly correlated with the prognosis of ccRCC patients
To reveal whether EGFR affects the prognosis of ccRCC patients, we conducted survival analysis using clinical data and the expression of EGFR-encoded protein obtained from the TCGA dataset. We obtained clinical data and reverse-phase protein array (RPPA) data of ccRCC patients from TCGA. Progression-Free Survival (PFS)77 and Disease-Specific Survival (DSS)78 were utilized to assess the relationship between the expression levels of protein encoded by EGFR and the survival of ccRCC patients. The ccRCC patients from TCGA were categorized into two groups (top 25% and others) based on the expression levels of protein encoded by EGFR. We analyzed the ten-year survival rate following cancer diagnosis and plotted Kaplan-Meier survival curves to illustrate the impact of EGFR-encoded protein expression on the prognosis of patients (Fig. 3C, D). The results of the survival analysis revealed a significant association between the expression levels of protein encoded by EGFR and the survival time of ccRCC patients, with higher EGFR-encoded protein expression being correlated with poorer survival outcomes (KM log-rank p-value = 0.0018, PFS; KM log-rank p-value = 0.021, DSS). Our observations suggested that the overexpression of protein encoded by EGFR may play a critical role in cancer progression and could potentially serve as a prognostic biomarker for ccRCC patients.
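The grouping and curve estimation described above can be sketched in plain Python. This is an illustrative sketch under assumptions: the function names are hypothetical, ties in expression are broken arbitrarily, and the log-rank test used for the reported p-values is not included.

```python
def split_top_quartile(expression):
    """Split patients into the top 25% by expression and the rest.

    expression: dict mapping patient ID to a protein expression value.
    Returns (high_group, low_group) as lists of patient IDs.
    """
    ranked = sorted(expression, key=expression.get, reverse=True)
    n_high = len(ranked) // 4
    return ranked[:n_high], ranked[n_high:]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times: follow-up time per patient.
    events: True if the event was observed, False if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = censored = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            else:
                censored += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= observed + censored
    return curve
```

Applied to 432 patients, `split_top_quartile` yields groups of 108 and 324, the same sizes as the groups shown in Fig. 3C, D.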
EGFR effectively promotes ccRCC progression in vitro and in vivo
To verify the effect of EGFR expression on ccRCC progression, stable EGFR-knockdown cells were generated from ACHN (Fig. 4A, B) and 786-O cells (Fig. 4G, H). The protein and mRNA expression of EGFR was quantified by western blotting and qPCR, respectively, and short hairpin RNA (shRNA)-2/3 showed promising knockdown efficacies for EGFR silencing (Fig. 4A, B, G, H). The downregulation of EGFR significantly inhibited cell viability (CCK8 assay, Fig. 4C-ACHN, Fig. 4I-786-O) and migration (transwell assay, Fig. 4E-ACHN, Fig. 4K-786-O). Furthermore, decreased EGFR expression markedly promoted ccRCC cell apoptosis (flow cytometry assay, Fig. 4D-ACHN, Fig. 4J-786-O) and repressed cell colony formation in vitro (colony assay, Fig. 4F-ACHN, Fig. 4L-786-O). EGFR overexpression was also confirmed using qPCR (Fig. 4M) and western blotting (Fig. 4N), and it significantly promoted the migration of 786-O cells (Fig. 4O). Moreover, the EGFR inhibitor erlotinib was used to inhibit EGFR activity, and the data indicated that erlotinib effectively inhibited ccRCC cell migration (Fig. 5A middle panel) and growth (Fig. 5A lower panel) and promoted apoptosis (Fig. 5A upper panel) in vitro. In vivo, both erlotinib and EGFR downregulation markedly repressed tumor growth (Fig. 5B, C). Taken together, these results suggested that EGFR acts as a risk factor that significantly promotes ccRCC progression.
A, B The protein and mRNA expression of EGFR in stable cells of ACHN-shEGFR. n = 4 independent samples. C The cell viability of ACHN-shEGFR cells. n = 10 from three biological replicates. D–F The assays of apoptosis (n = 4 independent experiments), migration (n = 5 independent experiments), and colony formation (n = 3 independent experiments) in ACHN-shEGFR cells. G, H The protein and mRNA expression of EGFR in stable cells of 786-O-shEGFR. n = 4 independent samples. I The cell viability of 786-O-shEGFR cells. n = 10 from three biological replicates. J–L The assays of apoptosis (n = 3 independent experiments), migration (n = 5 independent experiments), and colony formation (n = 3 independent experiments) in 786-O-shEGFR cells. M, N Overexpression of EGFR in 786-O cells at mRNA and protein levels. n = 4 independent samples. O The migration assay of 786-O-EGFR cells. n = 5 independent experiments. NC negative control, ns not significant. The data are shown as the mean ± SEM. One-way ANOVA (B–F, H–K). Two-tailed unpaired Student’s t-test (L–O). Source data are provided as a Source Data file.
A Erlotinib markedly promoted apoptosis and inhibited migration and colony formation of ccRCC cells. n = 3 independent experiments. B, C Both erlotinib (B) (n = 3 mice per group) and EGFR knockdown (C) (n = 3 mice per group) significantly reduced tumor growth in vivo. Data are shown as the mean ± SEM. Two-tailed unpaired Student’s t-test (A). Two-way ANOVA (B, C). Source data are provided as a Source Data file.
PCLO knockdown significantly impaired ccRCC progression
To investigate the effect of PCLO on the progression of ccRCC in vitro, we utilized four distinct shRNAs to knock down PCLO expression in 293T cells, identifying shPCLO-1 as the most effective (Fig. 6A, B). Subsequently, we obtained the PCLO knockdown 786-O cells using shPCLO-1 (Fig. 6C). The downregulation of PCLO significantly inhibited cell viability (Fig. 6D) and promoted apoptosis (Fig. 6E). Additionally, the reduction of PCLO expression significantly suppressed cell colony formation (Fig. 6F), migration (Fig. 6G), and invasion (Fig. 6H). To further investigate the effects of PCLO knockdown on the morphology and motility of ccRCC cells, a high-content imaging analysis system was employed. The results revealed that PCLO knockdown significantly reduced both the length and width of the cells at multiple time points (Fig. 6I), as well as their area and perimeter (Fig. 6J). Notably, a reduction in cell movement speed was observed in 786-O-shPCLO cells (Fig. 6K), leading to a shortened accumulated distance (Fig. 6L). In addition, we performed the same intervention on PCLO in another renal cancer cell line, ACHN, and conducted related assays to assess proliferation and metastatic potential. The results showed that stable knockdown of PCLO in ACHN cells (Supplementary Fig. 24A) significantly reduced their proliferation (Supplementary Fig. 24B), colony formation (Supplementary Fig. 24D), migration (Supplementary Fig. 24E), and invasion (Supplementary Fig. 24F) abilities, while increasing the apoptosis index (Supplementary Fig. 24C). High-content imaging analysis system also indicated that knockdown of PCLO in ACHN not only decreased cell perimeter but also impaired the cells’ motility (Supplementary Fig. 24G–I). Overall, the PCLO knockdown markedly inhibited the progression of ccRCC.
A, B The protein and mRNA expression of PCLO after transient transfection of shPCLO lentivirus in 293T cells. n = 3 independent samples. C The protein expression of PCLO in cells of 786-O-shPCLO. D The cell viability of 786-O-shPCLO cells. n = 10 from three biological replicates. E–H The assays of apoptosis, colony formation, migration, and invasion in 786-O-shPCLO cells. n = 3 independent experiments. I, J Average morphology of 786-O-shPCLO cells per well, including cell width, length, area, and perimeter, was evaluated at multiple time points using a high-content cell imaging analysis system. n = 10 independent samples. K, L Average movement speed and accumulated distance of 786-O-shPCLO cells per well. n = 10 independent samples. NC negative control, ns not significant. Data are shown as the mean ± SEM. One-way ANOVA (B). Two-tailed unpaired Student’s t-test (D–L). Source data are provided as a Source Data file.
Discussion
In this study, we developed RL-GenRisk, an approach utilizing deep reinforcement learning to enhance ccRCC risk gene identification by integrating network knowledge with gene mutation data. By considering the risk gene identification as a node selection problem, we model the ccRCC risk gene identification as a Markov Decision Process, reducing the dependency on labeled data. Furthermore, we designed a data-driven reward and employed the DQN algorithm to optimize RL-GenRisk. RL-GenRisk exhibits a substantial improvement in the task of ccRCC risk gene identification and reveals several potential risk genes.
RL-GenRisk successfully identified known ccRCC risk genes. Among the top 20 ccRCC risk genes identified by RL-GenRisk, 12 genes are listed in the known ccRCC risk gene datasets (IntOGen, NCG, and the Merged dataset) and recognized as ccRCC risk genes in prior research, including VHL79,80, PBRM181,82, SETD283,84, BAP185,86, MTOR87,88, ATM89,90, SPEN91, TP5392, KMT2D (also known as MLL2)93, SMARCA494,95, and PTEN96,97. This demonstrates that RL-GenRisk can effectively identify ccRCC risk genes. Interestingly, 8 of the top 20 identified ccRCC risk genes are not listed in the known ccRCC risk gene datasets: MUC4, DST, PABPC1, PCLO, PDE4DIP, USH2A, EGFR, and FLG. We visualized the top 20 genes identified by RL-GenRisk in the PPI network using STRING-db. These genes were significantly interconnected within the PPI network (one-sided t-test, FDR p-value = 2.72e-07). To illustrate the interactions between known ccRCC risk genes and the potential risk genes, we used a bipartite graph (see Supplementary Fig. 16). Among the potential risk genes, EGFR exhibited the most interactions with known ccRCC risk genes, followed by MUC4 and PCLO. Notably, both EGFR and PCLO were validated in our biological experiments. In addition, some recent studies have found that PDE4DIP, FLG, and USH2A are associated with ccRCC. For example, methylation levels of PDE4DIP were found to be associated with reduced overall survival in ccRCC patients98. FLG was found to be mutated specifically in certain subtypes of ccRCC99. USH2A was found to be significantly co-mutated with the well-known high-confidence ccRCC risk genes VHL and PBRM1 in ccRCC patients100,101. Although these candidate genes have been shown to correlate with ccRCC, their molecular mechanisms in ccRCC need to be further investigated.
Furthermore, we validated two top ccRCC risk genes, EGFR and PCLO, identified by RL-GenRisk through independent datasets and biological experiments. The results indicated that EGFR exhibits significant upregulation at both bulk and single-cell levels in ccRCC patients. The overexpression of the protein encoded by EGFR is significantly associated with poor survival of ccRCC patients. To explore whether the few EGFR mutations detected in ccRCC patients are functionally relevant, we compared EGFR protein expression levels between ccRCC patients with and without EGFR mutations, as proteins are the primary functional molecules in cellular processes. We used the ccRCC patient data from TCGA, dividing the patients into groups with and without EGFR mutations to assess whether protein expression levels differ significantly between the two groups. The results showed that EGFR protein expression levels were higher in ccRCC patients carrying EGFR mutations (p-value = 0.017, Wilcoxon signed-rank test, Supplementary Fig. 17). Moreover, we visualized the interacting genes of EGFR using STRING-db (Supplementary Fig. 20), which showed that EGFR interacts with many genes. To further explore the effect of these genes, we conducted pathway enrichment analysis on them. The results showed that one ccRCC-related pathway and three cancer-related pathways are enriched: the ErbB signaling pathway (FDR p-value = 4.99e-2), the PI3K-Akt signaling pathway (FDR p-value = 3.41e-7), the MAPK signaling pathway (FDR p-value = 2.99e-8), and the CKAP4 signaling pathway map (FDR p-value = 3.54e-7). The ErbB signaling pathway has been found to play a critical role in the initiation and progression of ccRCC102. The PI3K-Akt signaling pathway plays a crucial role in various cellular processes and is aberrantly activated in cancers, contributing to the occurrence and progression of tumors103. 
The MAPK signaling pathway is a complex, interconnected signaling cascade that is frequently involved in oncogenesis, tumor progression, and drug resistance104. The CKAP4 signaling pathway map has been found to be involved in regulating the progression of various cancers105. The enrichment analysis results suggested that the genes interacting with EGFR in the PPI network were significantly enriched in pathways related to the initiation and progression of ccRCC, as well as tumor initiation and progression. Furthermore, through comprehensive biological experiments, we validated the impact of EGFR and PCLO on ccRCC progression. The results demonstrated that decreased EGFR expression promotes ccRCC cell apoptosis and suppresses colony formation. To confirm the status of EGFR expression in ccRCC cells, we performed additional experiments. Using qPCR and western blotting analyses, we found that the mRNA and protein expression levels of EGFR in 786-O and ACHN cells are significantly higher than those in human embryonic kidney (293T) cells (as shown in Supplementary Fig. 23). Accordingly, these results showed that EGFR is overexpressed in both the 786-O and ACHN cell lines. Recent studies have indicated that the downregulation of certain receptor tyrosine kinase (RTK) family members can prevent the progression of RCC. For instance, silencing AXL and MET using shRNA may overcome the resistance to long-term sunitinib treatment in metastatic RCC106. Additionally, inhibition of EphA2 can suppress tumor growth both in vitro and in vivo, and restore the sensitivity of sunitinib-resistant tumor cells to sunitinib107. Therefore, we propose that other RTKs could produce the same effect and might even act synergistically with EGFR in ccRCC. However, their roles in ccRCC progression need to be further validated in the future. Additionally, the EGFR inhibitor erlotinib effectively inhibited ccRCC cell migration in vitro and tumor growth in mice. 
While previous studies have associated EGFR expression with RCC, functional studies directly examining EGFR’s role in RCC cell lines have been limited. Lee et al. utilized shRNA to knock down EGFR in 786-O cells, suggesting that EGFR knockdown inhibits the invasiveness of RCC cells in vitro and tumorigenicity in vivo108. However, their conclusions were primarily based on cell proliferation and branching morphogenesis assays, as well as tumor xenograft experiments. Similarly, Wen et al. found that EGFR knockdown reversed ADAMTS1-induced prometastatic characteristics of RCC, but this was verified only by CCK8 and Matrigel invasion assays109. Therefore, comprehensive experiments are essential to further elucidate the role of EGFR in ccRCC cell lines. In our study, we stably knocked down EGFR expression in both 786-O and ACHN cells using shRNA. We conducted a series of experiments, including CCK8, apoptosis, transwell, and colony formation assays, as well as in vivo experiments, which comprehensively explored the function of EGFR in ccRCC. Our results demonstrate that knocking down EGFR significantly inhibits both the in vitro proliferation and metastasis of ccRCC cells, as well as tumor growth in vivo. These results not only demonstrate RL-GenRisk’s capability in identifying ccRCC risk genes but also provide further insights into the role of EGFR in ccRCC cell lines. Moreover, the biological experimental results for PCLO showed that knocking down PCLO expression in vitro significantly inhibited ccRCC progression. These biological experimental results highlight the potential therapeutic significance of our findings.
In addition, we explored incorporating mutation type information and pathogenicity scores into the feature vectors of RL-GenRisk. For mutation type, we utilized the TCGA data to calculate the proportion of each mutation type relative to the total number of mutations for each gene. For the pathogenicity score, we computed the CADD score110 for each mutation, and the average CADD score across all mutations in a gene was used as the pathogenicity score feature. By adding different types of features, we evaluate three variations of RL-GenRisk, including mutation type only, pathogenicity score only, and both. The results show that after adding mutation type information separately to the feature vector, RL-GenRisk showed improvement in AP on the NCG dataset but showed decreases in DCG, N-DCG, and DCG-AUC on the NCG dataset, as well as decreases in AP, DCG, N-DCG, and DCG-AUC on the IntOGen and the Merged dataset (as shown in Supplementary Fig. 13). After adding pathogenicity scores separately to the feature vector, RL-GenRisk showed improvements in AP, DCG, and N-DCG on the NCG and decreases in DCG-AUC on the NCG and the Merged dataset, as well as decreases in AP and DCG-AUC on the IntOGen dataset (as shown in Supplementary Fig. 13). Moreover, after adding both mutation type information and pathogenicity scores to the feature vector, RL-GenRisk showed a decrease in AP, DCG, N-DCG, and DCG-AUC across the IntOGen, NCG, and the Merged dataset (as shown in Supplementary Fig. 13). These results indicated that adding more clinically informative features might improve the performance of the method, but further exploration is needed.
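The two extra feature types described above can be illustrated with a short Python sketch. It assumes each gene comes with a list of per-mutation type labels and a list of per-mutation scores; the category names, the toy values, and the function names are hypothetical and are not taken from RL-GenRisk.

```python
from collections import Counter


def mutation_type_proportions(mutation_types, categories):
    """Share of each mutation type among all mutations observed in one gene."""
    counts = Counter(mutation_types)
    total = len(mutation_types)
    return [counts[c] / total if total else 0.0 for c in categories]


def mean_pathogenicity(scores):
    """Average per-mutation pathogenicity score for one gene (0.0 if none)."""
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical category names and scores, for illustration only.
CATEGORIES = ["missense", "nonsense", "silent"]
gene_mutations = ["missense", "missense", "nonsense", "silent"]
gene_scores = [20.0, 30.0, 40.0, 10.0]

# Extra feature entries that could be appended to a gene's feature vector.
extra_features = mutation_type_proportions(gene_mutations, CATEGORIES)
extra_features.append(mean_pathogenicity(gene_scores))
```

In this sketch the proportions sum to one for any gene with at least one mutation, so they are on the same scale as a frequency-type feature, whereas the averaged score keeps the scale of the underlying pathogenicity score.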
Furthermore, we analyzed the overlap between the results of different methods by calculating the Jaccard similarity coefficient for the top 100 genes identified by any two methods (Supplementary Fig. 18). The result indicated that the top 100 genes identified by Hierarchical HotNet and HotNet2 have the highest degree of overlap (Jaccard similarity coefficient = 0.39), as they are similar methods. In general, there was a low overlap between the top 100 genes identified by different methods (Supplementary Fig. 18). Only one gene, VHL, is included in the top 100 genes of every method. VHL is a known risk gene for ccRCC and has the highest mutation frequency in TCGA ccRCC patients. In addition, seven genes were identified in the top 100 genes by at least six of the nine methods: VHL, PBRM1, SETD2, BAP1, MTOR, SPEN, and ATM. These seven genes are all well-established risk genes for ccRCC. These results suggested that while different methods for ccRCC risk gene identification exhibit different performances, a few genes, such as VHL, consistently emerge across multiple methods. Then, we conducted an analysis to determine the minimum number of genes required to encompass all known ccRCC risk genes for each method. Since nCOP only outputs a ranked list of genes and does not allow users to adjust the list length by setting different thresholds, we are unable to assess the minimum number of genes required to encompass all known ccRCC risk genes for this method. For the other methods, the results show that all tested methods require more than 8000 genes to encompass all known ccRCC risk genes (Supplementary Table 7). This leads to a very low proportion of known risk genes in the identified gene list (<1%). Including all known risk genes introduces numerous false positives, which significantly decreases the precision of the results. 
A potential future direction could involve developing methods that can identify all (or most) known risk genes while maintaining high precision.
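The overlap analysis above rests on the Jaccard similarity coefficient, the size of the intersection of two gene sets divided by the size of their union. A minimal Python sketch follows; the method names and gene lists are toy placeholders rather than the study's rankings, and returning 0.0 for two empty lists is a convention chosen here.

```python
def jaccard(top_a, top_b):
    """Jaccard similarity coefficient |A & B| / |A | B| of two gene lists."""
    a, b = set(top_a), set(top_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty lists
    return len(a & b) / len(a | b)


def pairwise_jaccard(rankings, k=100):
    """Jaccard coefficient of the top-k genes for every pair of methods."""
    names = sorted(rankings)
    return {
        (m1, m2): jaccard(rankings[m1][:k], rankings[m2][:k])
        for i, m1 in enumerate(names)
        for m2 in names[i + 1:]
    }


# Toy rankings (hypothetical, not the study's results).
toy = {
    "method_a": ["VHL", "PBRM1", "SETD2", "BAP1"],
    "method_b": ["VHL", "PBRM1", "MTOR", "ATM"],
}
scores = pairwise_jaccard(toy, k=4)  # 2 shared genes out of 6 distinct genes
```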
Since the interpretability of most reinforcement learning systems remains a significant challenge, to enhance understanding, we have incorporated several cues in the output to explain why certain genes are more or less likely to be associated with ccRCC. Our hypothesis is that genes with high ranks are likely to be in close proximity to known ccRCC-related genes. We have provided two types of supplementary information to measure the closeness between the predicted genes and known high-risk ccRCC genes. First, other than gene rank, we also output the average shortest path length between each gene and all known risk genes in the PPI network, as well as the FDR p-values obtained by the one-sided t-test that measures whether the average shortest path length between each gene and the known risk genes was significantly shorter. Second, we output the average cosine similarity between each gene’s feature embeddings and the feature embeddings of known risk genes, as well as the FDR p-values obtained by the one-sided t-test that measures whether the average cosine similarity of feature embeddings between each gene and the known risk genes was significantly higher. Supplementary Table 6 shows an example of the output of the top genes identified by RL-GenRisk. To further strengthen these analyses and facilitate intuitive interpretation, RL-GenRisk can output the comparison of the top K genes with K randomly selected genes (randomly sampled 100 times) in terms of their average shortest path lengths to known risk genes in the PPI network, as well as their average cosine similarities of feature embeddings with known risk genes. Users can evaluate the reasonableness of the top K predicted genes. For illustration purposes, we set K = 20 in this example. 
The results show that the average shortest path lengths between the top 20 genes identified by RL-GenRisk and known risk genes within the PPI network are significantly shorter (FDR p-value = 2.73e-4, one-sided t-test, Supplementary Fig. 21A), and the average cosine similarities of the feature representations between the top 20 genes and known risk genes are significantly higher (FDR p-value = 4.61e-8, one-sided t-test, Supplementary Fig. 21B). These results indicate that the top 20 genes identified by RL-GenRisk are closely connected to known risk genes in both the network structure and feature space.
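The two closeness measures described above can be sketched in plain Python. The sketch assumes an unweighted PPI network stored as an adjacency dictionary and embeddings stored as lists of floats; the toy edges and function names are illustrative only, and the significance tests are not included.

```python
from collections import deque
import math


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mean_distance_to_known(adj, gene, known):
    """Average shortest path length from gene to the reachable known risk genes."""
    dist = shortest_path_lengths(adj, gene)
    lengths = [dist[k] for k in known if k != gene and k in dist]
    return sum(lengths) / len(lengths) if lengths else math.inf


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def mean_cosine_to_known(embeddings, gene, known):
    """Average cosine similarity between a gene's embedding and those of known genes."""
    sims = [cosine(embeddings[gene], embeddings[k]) for k in known if k != gene]
    return sum(sims) / len(sims) if sims else 0.0


# Toy network with hypothetical edges.
adj = {
    "EGFR": ["VHL", "PTEN"],
    "VHL": ["EGFR", "PBRM1"],
    "PTEN": ["EGFR"],
    "PBRM1": ["VHL"],
}
avg_len = mean_distance_to_known(adj, "EGFR", ["VHL", "PBRM1"])  # (1 + 2) / 2
```

Comparing these averages for the top K genes against K randomly drawn genes, repeated many times, gives the kind of background distribution the text describes.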
To illustrate whether the top 20 genes identified by each method are included in the known ccRCC risk gene datasets, we provided heatmaps in Supplementary Fig. 4. The results indicated that RL-GenRisk identified the highest number of known risk genes (12 genes) among the top 20 genes, followed by nCOP (11 genes), highlighting RL-GenRisk’s capability in identifying ccRCC risk genes. Additionally, we provided the runtime of various methods in Supplementary Table 5. The results showed that HotNet2 and Hierarchical HotNet had the longest runtimes among the methods tested. In comparison, RL-GenRisk was completed in approximately 8 hours. Although some methods had shorter runtimes than RL-GenRisk, their overall performance was inferior to that of RL-GenRisk.
RL-GenRisk can also be applied to identify risk genes for other types of cancer. We evaluated RL-GenRisk and other methods using additional tumor datasets, specifically focusing on Bladder Urothelial Carcinoma (BLCA) and Glioblastoma Multiforme (GBM) data from the TCGA dataset. The known risk genes for BLCA and GBM were extracted from the same source as ccRCC in our study. The results indicated that RL-GenRisk outperformed other methods on both BLCA and GBM (as shown in Supplementary Figs. 14 and 15), highlighting the potential of RL-GenRisk for application in other cancer types.
Methods
Our research complies with all relevant ethical regulations and guidelines. Animal handling and experimental procedures were approved by the Ethical Review Committees of West China Hospital, Sichuan University. The animal studies were authorized by the Animal Ethics Review Committees of the West China Hospital, China (No. 20230214007). All animal experiments were strictly implemented in compliance with the NIH Guide for the Care and Use of Laboratory Animals.
Data preparation
Consistent with previous studies29,31, we used the information from the PPI network together with gene mutation data from patients. The feature matrix H for nodes is constructed based on the topological information of genes in the PPI network, the mutation frequency of genes in ccRCC patients, and the genes’ length.
PPI networks
We collected protein-protein interactions from HPRD49, STRING-db50, Multinet51, IRefIndex52, and HumanNet53. Then, following the previous study31, we preprocessed these PPI networks in two steps. First, we excluded the nine longest genes (TTN, MUC16, SYNE1, NEB, MUC19, CCDC168, FSIP2, OBSCN, GPR98), as they tend to acquire numerous mutations by chance and cover many patients31. Second, to mitigate the noise due to the dense connectivity in PPI networks, we applied the diffusion state distance (DSD)111 metric to these PPI networks.
Gene mutation data
We collected somatic mutations of genes in ccRCC patients from The Cancer Genome Atlas. Each gene corresponds to a list of the patients who carry mutations in that gene.
Feature representation
The initial feature of gene v is represented as \({{{{\bf{h}}}}}_{v}\in {{\mathbb{R}}}^{1\times k}\), where k is the dimension. In RL-GenRisk, the node feature is flexible and can have different dimensions. We set k = 3 in RL-GenRisk. Therefore, the initial feature of gene v can be represented as \({{{{\bf{h}}}}}_{v}=[{h}_{v}^{(0)},{h}_{v}^{(1)},{h}_{v}^{(2)}]\), whose entries respectively measure the topological importance of v in the PPI network, the mutation frequency in patients, and the length information of the gene. We first use the degree of v as the first dimension of hv since it can represent the topological importance of v in the PPI network:
\({h}_{v}^{(0)}=\frac{{{{\rm{Degree}}}}(v)}{{\mu }_{{{{\rm{degree}}}}}}\)
where Degree(v) represents the node degree of gene v in the PPI network. μdegree is a hyperparameter for normalization. We set μdegree equal to the maximum degree of nodes in the PPI network. Then, the second dimension of hv considers the mutation frequency of gene v in ccRCC patients:
\({h}_{v}^{(1)}=\frac{{N}_{p}^{v}}{{N}_{p}}\)
where \({N}_{p}^{v}\) represents the number of patients carrying mutations in v and Np represents the total number of patients. Moreover, the initial feature should consider the length information of the gene, since longer genes tend to include more mutations by chance. To dilute this type of effect, the third dimension of hv is designed as follows:
where Length(v) represents the gene length of v, and α is a normalization parameter. We set α equal to the number of nodes in a connected subgraph of the PPI network in which every patient carries a mutation in at least one gene. Finally, the initial feature of gene v was defined as \({{{{\bf{h}}}}}_{v}=[{h}_{v}^{(0)},{h}_{v}^{(1)},{h}_{v}^{(2)}]\), and the initial feature matrix for all genes was represented as \({{{\bf{H}}}}\in {{\mathbb{R}}}^{n\times k}\), where n is the number of genes.
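As an illustration of the first two feature dimensions, the Python sketch below assumes both are plain ratios: node degree over the maximum degree in the network, and mutated patients over all patients. The length-based third dimension is left out because its exact form depends on α. The data structures, function names, and toy values are hypothetical.

```python
def degree_feature(adj, gene):
    """Node degree divided by the maximum degree in the network (assumed form)."""
    mu_degree = max(len(neighbours) for neighbours in adj.values())
    return len(adj[gene]) / mu_degree


def mutation_frequency_feature(patients_by_gene, gene, n_patients):
    """Fraction of patients carrying a mutation in the gene (assumed form)."""
    return len(set(patients_by_gene.get(gene, ()))) / n_patients


# Toy PPI network and patient lists (hypothetical).
adj = {"VHL": {"PBRM1", "SETD2"}, "PBRM1": {"VHL"}, "SETD2": {"VHL"}}
patients_by_gene = {"VHL": ["p1", "p2"], "PBRM1": ["p2"]}

features = {
    gene: [
        degree_feature(adj, gene),
        mutation_frequency_feature(patients_by_gene, gene, n_patients=4),
    ]
    for gene in adj
}
```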
Key elements in RL-GenRisk
In RL-GenRisk, the identification of ccRCC risk genes is framed as a Markov Decision Process. At each step, the policy of RL-GenRisk receives the current state as input and selects an action. The action is selecting a node that connects with the sampled subgraph and appending this node to the sampled subgraph. Thus, this sampled subgraph adds a new node at each step. A reward is obtained after taking an action. Therefore, there are three key elements in RL-GenRisk, including state, action, and reward. These three key elements are defined as follows:
State
The state s at step t is represented as st, which consists of the feature matrix H and the current subgraph Gt sampled from the PPI network. In the first step, RL-GenRisk creates an empty subgraph and randomly adds a node to it. The design of incorporating the sampled subgraph into the state allows RL-GenRisk to delve into localized network structures. This enables RL-GenRisk to focus on the interaction relationships relevant to the current sampled subgraph at each step. Therefore, the subgraph information in the state enables RL-GenRisk to accurately estimate its current environment and adjust its actions accordingly.
Action
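At step t, the action at is selecting a node that connects with the sampled subgraph Gt and appending this node to Gt. The action space at step t is represented as At, which contains nodes that connect with Gt. In detail, after getting Q values for all possible actions, RL-GenRisk uses an ϵ-greedy strategy to select an action at:
\({a}_{t}=\left\{\begin{array}{ll}{{{\rm{a}}}}\,{{{\rm{random}}}}\,{{{\rm{action}}}}\,{{{\rm{in}}}}\,{{{{\bf{A}}}}}_{t},&p\le \epsilon \\ {\arg \max }_{a\in {{{{\bf{A}}}}}_{t}}Q({{{{\bf{s}}}}}_{t};a;\theta ),&{{{\rm{otherwise}}}}\end{array}\right.\)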
where Q(st; a; θ) represents the Q value of action a calculated by the policy based on the current state st, and θ stands for the parameters of the policy. At represents the action space at step t. With a probability p not exceeding ϵ, we select an action from the action space at random. Otherwise, the action with the highest Q value is selected. The ϵ-greedy strategy enhances the exploratory nature of the policy, effectively preventing the model from getting stuck in local optima during training. Following the previous study112, we initially set ϵ to 0.95. To balance exploration and exploitation, ϵ is decreased gradually during the training process.
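A small Python sketch of the action space and the ϵ-greedy choice is given below. The graph representation, function names, and decay schedule (including the `end` and `decay` values) are assumptions made for illustration; only the starting value of 0.95 comes from the text.

```python
import random


def action_space(adj, subgraph):
    """Nodes outside the sampled subgraph with at least one neighbour inside it."""
    return {w for u in subgraph for w in adj.get(u, ()) if w not in subgraph}


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-Q action.

    q_values maps each candidate action (gene) to its Q value. A strict '<'
    is used so that epsilon = 0 is always greedy; for a continuous draw this
    differs from 'not exceeding' only on a set of probability zero.
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))
    return max(q_values, key=q_values.get)


def decayed_epsilon(step, start=0.95, end=0.05, decay=0.999):
    """Hypothetical multiplicative schedule; only start=0.95 is from the text."""
    return max(end, start * decay ** step)


# Toy network (hypothetical edges) and Q values.
adj = {"VHL": {"PBRM1", "SETD2"}, "PBRM1": {"VHL", "BAP1"},
       "SETD2": {"VHL"}, "BAP1": {"PBRM1"}}
candidates = action_space(adj, {"VHL"})
chosen = epsilon_greedy({"PBRM1": 0.7, "SETD2": 0.2}, epsilon=0.0)
```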
Reward
In reinforcement learning algorithms, the design of the reward is crucial to the performance of the algorithm. In this study, we designed a specific data-driven reward based on the current sampled subgraph. The sampled subgraph Gt at step t is expected to cover more patients so that risk genes are more likely to appear in it. Instead of focusing on an individual gene, RL-GenRisk focuses on a sampled subgraph Gt with node feature matrix H. This makes RL-GenRisk effectively leverage information from both the PPI network and gene mutation data. More importantly, this design of the reward ensures the accurate identification of genes with high mutation frequencies, and also identifies potential risk genes with low mutation frequencies but functionally interacting with genes that have high mutation frequencies. However, only considering the number of patients covered by the sampled subgraph Gt is not enough. Longer genes are more likely to mutate by chance in patients. Therefore, shorter genes with high mutation frequencies are more likely to be risk genes31. To take gene length into account, we integrate gene length information when designing the reward. Thus, the single-step reward is higher when the sampled subgraph covers more patients and the genes in the subgraph have shorter lengths. The single-step reward rt at step t is designed as:
where rt represents the single-step reward at step t. δ is a weight hyperparameter. Rt provides an evaluation score of the current state, with \({N}_{p}^{{G}_{t}}\) representing the number of patients that carry at least one mutation on the genes in the sampled subgraph Gt, \({N}_{p}^{v}\) represents the number of patients carrying mutations on gene v, and Np represents the total number of patients. Length(v) represents the length of gene v. The cumulative reward is defined as the sum of all single-step rewards.
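The patient-coverage quantity \({N}_{p}^{{G}_{t}}\) that enters the reward can be illustrated as follows. This Python sketch covers only the coverage term, expressed as a fraction of all patients; the gene-length weighting and the hyperparameter δ are not modelled, and the patient identifiers and mutation lists are hypothetical.

```python
def covered_patients(patients_by_gene, subgraph):
    """Patients carrying at least one mutation in any gene of the subgraph."""
    covered = set()
    for gene in subgraph:
        covered.update(patients_by_gene.get(gene, ()))
    return covered


def coverage_fraction(patients_by_gene, subgraph, n_patients):
    """Covered patients divided by the total number of patients."""
    return len(covered_patients(patients_by_gene, subgraph)) / n_patients


# Toy data (hypothetical patient identifiers).
patients_by_gene = {
    "VHL": {"p1", "p2", "p3"},
    "PBRM1": {"p2", "p3"},
    "PIK3CA": {"p4"},
}
before = coverage_fraction(patients_by_gene, {"VHL"}, n_patients=5)
after_pbrm1 = coverage_fraction(patients_by_gene, {"VHL", "PBRM1"}, n_patients=5)
after_pik3ca = coverage_fraction(patients_by_gene, {"VHL", "PIK3CA"}, n_patients=5)
```

In this toy case, adding the rarely mutated gene raises coverage while adding the frequently mutated gene does not, because the latter's patients are already covered. This is consistent with the text's point that a subgraph-level reward can favor low-frequency genes that interact with high-frequency ones.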
Policy network
In RL-GenRisk, the policy takes the current state as input and outputs Q values for all possible actions. The policy of RL-GenRisk is represented by a neural network, usually referred to as the policy network. Specifically, the policy of RL-GenRisk comprises two main components: a graph convolutional network (GCN) and a node evaluation network. We also use three multi-layer perceptrons (MLPs) to perform dimensionality transformation. The GCN aggregates neighborhood information to obtain the representation of each node by multiplying the graph laplacian42 with the node feature matrix. Given the feature matrix \({{{\bf{H}}}}\in {{\mathbb{R}}}^{n\times k}\), where n is the number of nodes and k is the dimension, the hidden representation of the l-th graph convolutional layer is calculated as:
\({{{{\bf{H}}}}}^{l}=\sigma ({{{\bf{L}}}}{{{{\bf{H}}}}}^{l-1}{{{{\bf{W}}}}}^{l})\)
where Hl represents the hidden representation of the l-th graph convolutional layer. \({{{\bf{L}}}}={\hat{{{{\bf{D}}}}}}^{-\frac{1}{2}}\hat{{{{\bf{A}}}}}{\hat{{{{\bf{D}}}}}}^{-\frac{1}{2}}\) represents the normalized graph laplacian, which is used to aggregate neighborhood information. GCN preserves the original node signal by adding self-connections: \(\hat{{{{\bf{A}}}}}=\tilde{{{{\bf{A}}}}}+{{{\bf{I}}}}\), \(\tilde{{{{\bf{A}}}}}\) represents the adjacency matrix and I represents the identity matrix. \(\hat{{{{\bf{D}}}}}\) represents the degree matrix for \(\hat{{{{\bf{A}}}}}\) and \({\hat{{{{\bf{D}}}}}}_{ii}={\sum }_{j}{\hat{{{{\bf{A}}}}}}_{ij}\), Wl represents the trainable matrix of the l-th graph convolutional layer. σ represents the non-linear activation function.
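The normalization and one propagation step can be sketched in dependency-free Python. The sketch assumes a dense 0/1 adjacency matrix and uses ReLU as the activation σ, which the text does not specify; the helper names are illustrative.

```python
import math


def normalized_adjacency(adj_matrix):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix A."""
    n = len(adj_matrix)
    a_hat = [[adj_matrix[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # self-loops guarantee deg > 0
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def relu(x):
    """Element-wise ReLU, used here as an assumed choice of sigma."""
    return [[max(0.0, v) for v in row] for row in x]


def gcn_layer(lap, h_prev, weight):
    """One graph convolutional layer: sigma(L H W)."""
    return relu(matmul(matmul(lap, h_prev), weight))


# Two connected nodes with one feature each and an identity weight.
lap = normalized_adjacency([[0, 1], [1, 0]])
h1 = gcn_layer(lap, [[1.0], [3.0]], [[1.0]])
```

With self-connections added, each node's new representation mixes its own signal with that of its neighbor, so both nodes end up with the average of the two inputs in this toy case.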
In RL-GenRisk, we use two graph convolutional layers. Before the first graph convolutional layer receives data, we use an MLP to perform dimensionality transformation on the initial features:
where H represents the initial feature matrix, W1 represents a trainable projection matrix. H0 represents the input feature matrix of the first graph convolutional layer.
Inspired by residual networks113, we concatenate the hidden representation matrices of two graph convolutional layers with H0, and use another MLP for dimension transformation to obtain the final representations matrix \(\hat{{{{\bf{H}}}}}\):
where H1, H2 represent the hidden representation matrices of the two graph convolutional layers, W2 represents a trainable projection matrix, and Concat( ⋅ ) represents the concatenation operator.
At step t, after getting the representations matrix by GCN, RL-GenRisk calculates the representation of the sampled subgraph through the averaging pooling operation and uses the MLP for dimension transformation:
where \({\hat{{{{\bf{H}}}}}}_{{G}_{t}}\) represents the representation of sampled subgraph Gt, \({\hat{{{{\bf{H}}}}}}_{v}\) represents the representation of gene v, Size( ⋅ ) represents the number of nodes in the sampled subgraph, and W3 represents a trainable projection matrix.
Then, we use a two-layer MLP as the node evaluation network to calculate the Q values for each action in action space:
where Q(st; a; θ) represents the Q value for action a based on the current state st, and θ stands for the parameters of the policy. \({\hat{{{{\bf{H}}}}}}_{{G}_{t}}\) represents the representation of sampled subgraph Gt, v represents the gene selected by action a, and \({\hat{{{{\bf{H}}}}}}_{v}\) represents the representation of gene v. W4 and W5 represent two trainable projection matrices. σ represents the non-linear activation function. By using a two-layer MLP whose parameters are continuously updated during training as the node evaluation network, RL-GenRisk can better predict Q values for actions. More details about the policy network can be found in Supplementary Table 3.
Policy training for high-confidence risk gene identification
Consistent with the previous study31, we sampled the training data by randomly collecting 85% of the samples from 379 ccRCC patients before the training process started. In the training process, since we model the ccRCC risk gene identification as a Markov Decision Process, the policy of RL-GenRisk iteratively receives the current state as input, selects an action, and obtains a reward. The DQN algorithm43, which is widely used in reinforcement learning methods, is employed to train the policy of RL-GenRisk. DQN uses two sets of Q values calculated by an online network and a target network, respectively. The online network in our study is the neural network of RL-GenRisk. The target network is designed to prevent the online network from overestimating Q values. The architecture of the target network is the same as that of the online network, but their parameters are different. In RL-GenRisk, the loss function is defined as follows:
\({{{\mathcal{L}}}}={\left({r}_{t}+\gamma {\max }_{{a}_{t+1}\in {{{{\bf{A}}}}}_{t+1}}Q({{{{\bf{s}}}}}_{t+1};{a}_{t+1};{\theta }^{{\prime} })-Q({{{{\bf{s}}}}}_{t};{a}_{t};\theta )\right)}^{2}\)
where \({{{\mathcal{L}}}}\) represents the loss to be minimized. rt stands for the reward received at step t, and γ is the discount factor. \({\max }_{{a}_{t+1}\in {{{{\bf{A}}}}}_{t+1}}Q({{{{\bf{s}}}}}_{t+1};{a}_{t+1};{\theta }^{{\prime} })\) represents the target network estimate of the maximum expected Q value for the next state st+1 and possible action at+1. At+1 represents the action space. Q(st; at; θ) represents the Q value calculated by the online network based on the current state st and action at. θ and \({\theta }^{{\prime} }\) represent the parameters of the online network and target network, respectively. θ is updated through gradient backpropagation based on the loss function. Then \({\theta }^{{\prime} }\) is updated through soft updates based on the parameters of the online network as follows:
\({\theta }^{{\prime} }\leftarrow \tau \theta +(1-\tau ){\theta }^{{\prime} }\)
where τ represents a hyperparameter that controls the proportion of each update. More details about the hyperparameters in RL-GenRisk are shown in Supplementary Table 4. Moreover, we provided recommended ranges for each hyperparameter in Supplementary Table 4. Users can select hyperparameters within the recommended ranges. Additionally, users who wish to try a wider range of hyperparameter values can also use the grid search114 to select hyperparameter values.
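The target computation and the soft update can be written out on scalars and flat parameter lists. This Python sketch assumes a squared temporal-difference error and the common convex-combination form of the soft update; the function names are illustrative, and a real implementation would operate on network tensors.

```python
def td_target(reward, next_q_values, gamma):
    """r_t + gamma * max over next actions of the target-network Q values.

    If no action remains in the next state, the target is the reward alone.
    """
    if not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_loss(q_online, target):
    """Squared difference between the target and the online Q value (assumed loss)."""
    return (target - q_online) ** 2


def soft_update(target_params, online_params, tau):
    """Move each target parameter a fraction tau towards the online parameter."""
    return [tau * o + (1.0 - tau) * t
            for o, t in zip(online_params, target_params)]


# Toy numbers for one transition.
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], gamma=0.5)
loss = squared_td_loss(q_online=1.0, target=target)
new_target_params = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
```

A small τ makes the target network trail the online network slowly, which keeps the bootstrapped targets stable between updates.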
In the identification process, RL-GenRisk differs from existing methods, which predict outcomes by following the same procedure as in training23,29,31. Instead, taking advantage of its reinforcement learning formulation, we designed a concise and effective identification process. Specifically, the model with the highest cumulative reward during training is selected as the best model and loaded. The sampled subgraph is initialized as empty, and all nodes in the PPI network are included in the action space at the beginning. The policy calculates Q values for all genes, and the final ranking list of ccRCC risk genes is then ordered by these Q values. A higher Q value indicates a higher risk. We analyzed the top 20 high-confidence risk genes in our study:
$$HR{G}_{top20}=\{g\,|\,g\in G,\,Q(g)\in {Q}_{top20}\}$$
where \(HR{G}_{top20}\) represents the top 20 high-confidence risk genes, g represents a gene, and G represents the PPI network. Q(g) represents the Q value of gene g, and \({Q}_{top20}\) represents the 20 highest Q values among all Q values calculated in the identification process. Researchers can select the number of top genes for verification based on actual verification costs.
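The ranking step described above amounts to sorting genes by Q value and keeping the first k. The sketch below illustrates this with a plain dictionary; the function name `top_k_genes`, the placeholder gene names, and the Q values are hypothetical and only serve to show the selection.

```python
def top_k_genes(q_values, k=20):
    """Return the k gene names with the highest Q values, best first."""
    ranked = sorted(q_values, key=q_values.get, reverse=True)
    return ranked[:k]


# Placeholder genes and made-up Q values, not results from the study.
toy_q = {"GENE_A": 0.12, "GENE_B": 0.87, "GENE_C": 0.55, "GENE_D": 0.31}
print(top_k_genes(toy_q, k=2))  # ['GENE_B', 'GENE_C']
```

Changing `k` corresponds to choosing a different number of top genes for verification.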
Performance evaluation
Known ccRCC risk genes were sourced from the IntOGen cancer-specific database40, the Network of Cancer Genes (NCG) database54, the Cancer Gene Census (CGC) database55, and a cancer risk gene set derived from a recent comprehensive study by Bailey et al.56. The known ccRCC risk genes used in our study are provided in Supplementary Data 1. Due to the limited number of genes in the CGC (nine genes) and Bailey et al. (twelve genes) lists, we ultimately used three datasets to evaluate different methods: IntOGen, NCG, and a “Merged” dataset that combined IntOGen, NCG, Bailey et al., and CGC. First, the performance was evaluated using the Discounted Cumulative Gain (DCG), which has been used for evaluating cancer risk gene identification in a previous study29. The DCG is calculated as:
$$DCG={\sum }_{i=1}^{N}\frac{re{l}_{{g}_{i}}}{{\log }_{2}(i+1)}$$
where N represents the number of identified ccRCC risk genes. \(re{l}_{{g}_{i}}\) is equal to 1 if the i-th gene \({g}_{i}\) is contained in the known risk gene database, and 0 otherwise. Therefore, the higher the ranking of known ccRCC risk genes in the prediction results, the higher the DCG score. Following the previous study31, the top 100 identified ccRCC risk genes were considered in the performance evaluation. The top 100 genes identified by different methods are listed in Supplementary Data 2. Additionally, to provide a more comprehensive evaluation of different methods, we presented DCG curves and calculated the area under the DCG curve. We also used the normalized DCG (N-DCG) and the average precision (AP) as evaluation metrics. The N-DCG is calculated as:
$$N\text{-}DCG=\frac{{\sum }_{i=1}^{N}\frac{re{l}_{{g}_{i}}}{{\log }_{2}(i+1)}}{{\sum }_{i=1}^{N}\frac{re{l}_{{g}_{i}}^{ideal}}{{\log }_{2}(i+1)}}$$
where \(re{l}_{{g}_{i}}^{ideal}\) is the relevance score of the i-th gene \({g}_{i}\) in the ideal ranking. The AP is calculated as:
where N represents the number of the ccRCC risk genes. P(i) represents the precision for the top i genes, calculated by dividing the number of known risk genes among the top i identified genes by i. \(re{l}_{{g}_{i}}\) is equal to 1 if the i-th gene \({g}_{i}\) is contained in the known risk gene database, and 0 otherwise.
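The three ranking metrics described above can be sketched in a few lines of Python. Several choices here are assumptions rather than details confirmed by the text: the DCG discount is taken as log2(i + 1), the ideal ranking for N-DCG is obtained by sorting the relevance labels of the same list, and AP is normalized by the number of known risk genes found in the ranked list. The function names and the placeholder gene names are hypothetical.

```python
import math


def relevance(ranked_genes, known_genes):
    """1 if the gene at each rank is a known risk gene, else 0."""
    known = set(known_genes)
    return [1 if g in known else 0 for g in ranked_genes]


def dcg(rels):
    """Discounted cumulative gain with a log2(i + 1) discount."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(rels, start=1))


def ndcg(rels):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0


def average_precision(rels):
    """Mean of the precision P(i) over the ranks i that hold a known gene."""
    hits, total = 0, 0.0
    for i, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / i  # P(i) = known genes in top i, divided by i
    return total / hits if hits else 0.0


# Placeholder ranking: known genes sit at ranks 1 and 3.
ranked = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
rels = relevance(ranked, {"GENE_A", "GENE_C"})
print(rels)                     # [1, 0, 1, 0]
print(dcg(rels))                # 1.5
print(ndcg(rels))               # about 0.92
print(average_precision(rels))  # about 0.83
```

In the paper's setting the ranked list would hold the top 100 genes and the known set would come from IntOGen, NCG, or the Merged dataset.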
Threshold selection for top K genes
To determine the optimal number of top K genes for downstream analysis, we evaluated the proportion of known ccRCC risk genes across various values of K (where K = 20, 30, 40, …, 100). We observed that when K was set to 20, the average proportion of known ccRCC risk genes exceeded 50% (Supplementary Fig. 19). As K increased, the proportion of known ccRCC risk genes decreased (Supplementary Fig. 19). Based on this observation, we suggested that selecting K = 20 generates a gene set with a higher likelihood of including unknown ccRCC risk genes. Additionally, considering the potential downstream analysis, selecting K = 20 kept the analysis cost within a manageable range in our study.
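The threshold scan described above can be illustrated as follows: for each candidate K, count how many of the top K ranked genes are already known risk genes and divide by K. The function name `known_fraction_at_k` and the synthetic gene list are hypothetical; the values it prints are not results from the study.

```python
def known_fraction_at_k(ranked_genes, known_genes, ks=range(20, 101, 10)):
    """Fraction of known risk genes among the top K genes, for each K."""
    known = set(known_genes)
    return {k: sum(g in known for g in ranked_genes[:k]) / k
            for k in ks if k <= len(ranked_genes)}


# Synthetic example: 100 ranked genes, the first 12 of which are "known".
ranked = [f"g{i}" for i in range(100)]
known = {f"g{i}" for i in range(12)}
fractions = known_fraction_at_k(ranked, known)
print(fractions[20])   # 0.6
print(fractions[100])  # 0.12
```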
Statistical analysis
Gene set enrichment analysis
We used g:Profiler61 to run functional enrichment analysis of the top 20 high-confidence risk genes identified by RL-GenRisk. g:Profiler maps genes to known functional information sources and detects statistically significantly enriched terms. We performed enrichment analysis on Gene Ontology, Human Phenotype Ontology, and WikiPathways. We used an FDR-adjusted p-value < 0.05 as the significance threshold.
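To illustrate what an FDR threshold of 0.05 means in practice, the sketch below applies the Benjamini-Hochberg adjustment to a list of raw p-values. That the FDR option used here corresponds to Benjamini-Hochberg is an assumption, and the function name and p-values are made up; the enrichment analysis in the study was run with g:Profiler itself, not with this code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # made-up p-values for four terms
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(significant)  # [True, False, False, False]
```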
Survival analysis
Clinical data of ccRCC patients and protein expression data of EGFR were obtained from TCGA. We used cSurvival115 to perform progression-free survival and disease-specific survival analyses on these data. Progression-free survival is measured as the time from randomization or initiation of treatment to disease progression or death77. Disease-specific survival considers only deaths caused by the disease under study78. Patients were categorized into quartiles based on the expression level of the protein encoded by EGFR. The Kaplan-Meier estimator was used to generate survival curves, and differences between groups were assessed using the log-rank test116.
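For readers unfamiliar with the Kaplan-Meier estimator mentioned above, the following minimal sketch computes the survival curve for one group of patients. It is a generic textbook implementation with made-up follow-up times, not the cSurvival pipeline used in the study; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time of each patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data: four patients, one censored at time 8.
curve = kaplan_meier([5, 8, 8, 12], [1, 1, 0, 1])
print(curve)  # survival drops to about 0.75, 0.5 and 0.0 at times 5, 8 and 12
```

In a quartile-based comparison, one such curve would be computed per expression quartile before applying the log-rank test.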
In vitro and in vivo experiments
Cell lines
ACHN and 786-O cells were purchased from Shanghai Zhong Qiao Xin Zhou Biotechnology Co., Ltd. (Shanghai, China), and were cultured in DMEM and RPMI-1640 medium (HyClone, Utah, USA), respectively, supplemented with 10% fetal bovine serum (FBS, Gibco, Australia) and 1% antibiotics (penicillin and streptomycin, HyClone) in a humidified incubator containing 5% CO2 at 37 °C. The stable cell lines of ACHN-shEGFR, 786-O-shEGFR, 786-O-EGFR, ACHN-shPCLO, and 786-O-shPCLO were obtained as described previously117.
High-content screening analysis
Cells were inoculated into a 96-well plate at an optimal density to ensure ideal growth and interaction. A high-content live-cell imaging system was utilized to monitor the cells at multiple time points, assessing various parameters. This imaging system, equipped with fluorescence microscopy capabilities, allows for real-time visualization of cell morphology and dynamics. At each time point, images were automatically captured, focusing on specific areas of interest within each well. The analysis included measuring cell perimeter, calculating movement speed and accumulated distance. The above analysis was done using Opera Phenix Plus and Harmony 5.2.
Western blot analysis
Total protein was extracted using TPER solution from Thermo Fisher Scientific. The protein concentration was measured using a BCA kit from Pierce. The proteins were separated using 10% SDS-PAGE. The proteins were transferred onto a PVDF membrane, and the membrane was blocked with 5% skimmed milk for 1 h at room temperature (RT). Then, the membrane was incubated with primary rabbit anti-human EGFR antibodies (R22778, ZenBio, WB: 1:1000), rabbit anti-PCLO antibody (HPA015858, Sigma-Aldrich, WB: 1:1000) and mouse anti-GAPDH antibody (250133, ZenBio, WB: 1:5000) at 4 °C overnight. After three washes with TBST for 10 min, the membrane was incubated with horseradish peroxidase (HRP)-conjugated goat anti-rabbit secondary antibodies (458, MBL) for 1 h at RT. Finally, the protein bands were detected using an HRP substrate. All experiments were performed in triplicate.
Quantitative polymerase chain reaction (qPCR)
Total RNA was extracted from the cells using a QIAGEN kit (74104), and the RNA concentration was determined using a NanoPhotometer from Implen (München). cDNA was synthesized using a RevertAid first-strand cDNA synthesis kit from Thermo Fisher Scientific (Waltham). PCR was performed using the following protocol: 98 °C for 2 min, followed by 40 cycles of 98 °C for 5 s and 60 °C for 10 s. The relative gene expression was calculated using the \({2}^{-\Delta \Delta Ct}\) method. β-Actin was used as an internal control. All experiments were performed at least in triplicate.
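The relative expression calculation mentioned above can be written out as a short worked example. The sketch assumes the standard form of the method, in which the target gene's Ct is first normalized to the internal control (β-actin here) and then to the control sample; the function name and the Ct values are made up for illustration.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt, treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt, control sample
    return 2.0 ** -(delta_sample - delta_control)


# Made-up Ct values: knockdown raises the target Ct by two cycles.
fold = relative_expression(ct_target_sample=26.0, ct_ref_sample=18.0,
                           ct_target_control=24.0, ct_ref_control=18.0)
print(fold)  # 0.25
```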
Cell counting kit-8 assay
The cell counting kit-8 (CCK-8) assay was performed using a kit from Dojindo. A total of 3000 cells/well were cultured in a 96-well plate for 24 h, and 10 μL of CCK-8 solution was added to the wells. Then, the plate was incubated for 2 h at 37 °C, and the optical density (OD) value was detected at 450 nm using a microplate reader. All experiments were performed in triplicate.
Cell apoptosis
An Annexin V-Alexa Fluor 647/PI Kit was purchased from 4A Biotech (Suzhou, China). The cells were digested and washed twice with cold phosphate-buffered saline (PBS). Next, the 1 × binding buffer was used to suspend cells to a concentration of (1–5) × \({10}^{6}\) cells/mL. Then 5 μL of Annexin V/Alexa Fluor 647 was added to 100 μL of cells, and the mixture was incubated at RT for 5 min in the dark. Finally, the flow cytometry assay was performed after adding 10 μL of 20 μg/mL propidium iodide (PI) and 400 μL of PBS. Figures illustrating the gating strategy are provided in the Supplementary Materials.
Transwell assay
Transwell migration assays were conducted using a Transwell chamber from Corning (REF 3422, Arizona, USA). Briefly, Transwell chambers were placed on a 24-well plate. A total of 600 μL of fresh medium containing 10% FBS was added to the lower chambers, and (2–5) × \({10}^{4}\) cells in 200 μL of medium without FBS were added to the upper chamber. The 24-well plate was incubated at 37 °C for 48 h. Cells that invaded through the chamber were washed, fixed (20 min with 4% paraformaldehyde), and stained (30 min with crystal violet). Then, the upper chambers were washed, photographed, and preserved under an inverted fluorescence OBSERVER D1/AX10 cam HRC microscope (Zeiss). Transferred cells were analyzed using ImageJ software.
Colony formation assay
The cells (500/well) were seeded into a 6-well plate and cultured at 37 °C for 7–10 days. Then, the clones were imaged using a Celigo imaging cytometer from Nexcelom (Lawrence), and the clones were counted using ImageJ software. All experiments were performed in triplicate.
Animal study
Four-week-old male BALB/C-nu/nu mice were purchased from GemPharmatech. All mice were maintained in pathogen-free conditions at 24 °C and 50% humidity, with a 12-h light/dark cycle, and given a free supply of food (reproductive diet, catalog number: F010201) and water. A total of 5 × \({10}^{6}\) cells were inoculated into the right flank of mice, and the tumor volume was recorded every 3 days starting from day 17 after injection of the tumor cells. Mice were administered erlotinib (S1023, Selleck, Shanghai, China) at a dose of 50 mg/kg. The tumor volume was calculated using the following equation: L × \({W}^{2}\) × 0.5236, where L is tumor length and W is tumor width117. The maximum tumor diameter allowed by the Ethics Committee is 2 cm. We ensured that, each time mice were sacrificed, the maximal body weight loss did not exceed the limit of 20%. The animal procedures were approved by the ethics committee of West China Hospital, Sichuan University.
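The tumor volume equation given above is simple to apply; the short sketch below evaluates it for one measurement. The function name, the example dimensions, and the use of millimetres are assumptions for illustration, since the text does not state the measurement unit.

```python
def tumor_volume(length, width):
    """Tumor volume as length x width^2 x 0.5236 (0.5236 is about pi / 6)."""
    return length * width ** 2 * 0.5236


# Made-up caliper readings, assumed to be in millimetres.
volume = tumor_volume(length=10.0, width=8.0)
print(round(volume, 3))  # 335.104 (cubic millimetres under that assumption)
```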
Statistical analysis
All data are presented as the mean ± standard error of the mean (SEM) or standard deviation (SD). Statistical significance was determined using analysis of variance (ANOVA) for comparisons among multiple groups (>3) and Student’s paired t-test for comparisons between two groups, in GraphPad Prism 9.0. A p-value < 0.05 was considered statistically significant.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All datasets analyzed in this article are publicly available. The HPRD network is available at http://www.hprd.org/. STRING is available at https://string-db.org/. Multinet and IRefIndex are available at https://github.com/raphael-group/hotnet2/tree/master/paper/data/networks. HumanNet is available at https://staging2.inetbio.org/humannetv3/. PCNet is available at https://www.ndexbio.org/viewer/networks. The data of ccRCC patients from TCGA is available at https://portal.gdc.cancer.gov/. The single-cell RNA-seq data of ccRCC patients is available at https://singlecell.broadinstitute.org/. The GEO datasets of ccRCC patients are available at https://www.ncbi.nlm.nih.gov/geo/. Data for enrichment analysis is available at https://biit.cs.ut.ee/gprofiler/gost. Source data are provided with this paper.
Code availability
The source code of RL-GenRisk and the trained model can be downloaded from the GitHub repository at https://github.com/23AIBox/RL-GenRisk.
References
Padala, S. A. et al. Epidemiology of renal cell carcinoma. World J. Oncol. 11, 79 (2020).
Capitanio, U. & Montorsi, F. Renal cancer. Lancet 387, 894–906 (2016).
John, A., Spain, L. & Hamid, A. A. Navigating the current landscape of non-clear cell renal cell carcinoma: a review of the literature. Curr. Oncol. 30, 923–937 (2023).
Jonasch, E., Walker, C. L. & Rathmell, W. K. Clear cell renal cell carcinoma ontogeny and mechanisms of lethality. Nat. Rev. Nephrol. 17, 245–261 (2021).
Hsieh, J. J. et al. Renal cell carcinoma. Nat. Rev. Dis. Prim. 3, 1–19 (2017).
Sanchez, D. J. & Simon, M. C. Genetic and metabolic hallmarks of clear cell renal cell carcinoma. Biochimica et. Biophysica Acta (BBA)-Rev. Cancer 1870, 23–31 (2018).
Chowdhury, N. & Drake, C. G. Kidney cancer: an overview of current therapeutic approaches. Urologic Clin. 47, 419–431 (2020).
Cairns, P. Renal cell carcinoma. Cancer Biomark. 9, 461–473 (2011).
Del Vecchio, S. J. & Ellis, R. J. Cabozantinib for the management of metastatic clear cell renal cell carcinoma. J. Kidney Cancer VHL 5, 1 (2018).
Yoshida, K. et al. Efficacy of axitinib in patients with metastatic renal cell carcinoma refractory to nivolumab therapy. Jpn. J. Clin. Oncol. 49, 576–580 (2019).
Gao, B. et al. Screening novel drug candidates for kidney renal clear cell carcinoma treatment: A study on differentially expressed genes through the connectivity map database. Kidney Blood Press. Res. 46, 702–713 (2021).
Garraway, L. A. & Lander, E. S. Lessons from the cancer genome. Cell 153, 17–37 (2013).
Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Vogelstein, B. et al. Cancer genome landscapes. Science 339, 1546–1558 (2013).
Anandakrishnan, R., Varghese, R. T., Kinney, N. A. & Garner, H. R. Estimating the number of genetic mutations (hits) required for carcinogenesis based on the distribution of somatic mutations. PLoS Computational Biol. 15, e1006881 (2019).
Martincorena, I. & Campbell, P. J. Somatic mutation in cancer and normal cells. Science 349, 1483–1489 (2015).
Tamborero, D. et al. Comprehensive identification of mutational cancer driver genes across 12 tumor types. Sci. Rep. 3, 2650 (2013).
Youn, A. & Simon, R. Identifying cancer driver genes in tumor genome sequencing studies. Bioinformatics 27, 175–181 (2011).
Dees, N. D. et al. Music: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598 (2012).
Tamborero, D., Gonzalez-Perez, A. & Lopez-Bigas, N. Oncodriveclust: exploiting the positional clustering of somatic mutations to identify cancer genes. Bioinformatics 29, 2238–2244 (2013).
Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).
Andrades, R. & Recamonde-Mendoza, M. Machine learning methods for prediction of cancer driver genes: a survey paper. Brief. Bioinforma. 23, bbac062 (2022).
Stratton, M. R., Campbell, P. J. & Futreal, P. A. The cancer genome. Nature 458, 719–724 (2009).
Li, Y., Wang, C., Gao, Y. & Zhou, L. Identification and validation of pik3ca as a marker associated with prognosis and immune infiltration in renal clear cell carcinoma. J. Oncol. 2021, 1–18 (2021).
Guo, H. et al. The pi3k/akt pathway and renal cell carcinoma. J. Genet. genomics 42, 343–353 (2015).
Brugarolas, J. Molecular genetics of clear-cell renal cell carcinoma. J. Clin. Oncol. 32, 1968 (2014).
Cho, A. et al. Muffinn: cancer gene discovery via network analysis of somatic mutation data. Genome Biol. 17, 1–16 (2016).
Cutigi, J. F., Evangelista, A. F., Reis, R. M. & Simao, A. A computational approach for the discovery of significant cancer genes by weighted mutation and asymmetric spreading strength in networks. Sci. Rep. 11, 23551 (2021).
Leiserson, M. D. M., Vandin, F., Wu, H. T., Dobson, J. R. & Raphael, B. J. Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet. 47, 106–114 (2015).
Hristov, B. H. & Singh, M. Network-based coverage of mutational profiles reveals cancer genes. Cell Syst. 5, 221–229 (2017).
Shi, X. et al. Comprehensive evaluation of computational methods for predicting cancer driver genes. Brief. Bioinforma. 23, bbab548 (2022).
Tavanaei, A., Anandanadarajah, N., Maida, A. & Loganantharaj, R. A deep learning model for predicting tumor suppressor genes and oncogenes from pdb structure. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 613–617 (IEEE, 2017).
Agajanian, S., Oluyemi, O. & Verkhivker, G. M. Integration of random forest classifiers and deep convolutional neural networks for classification and biomolecular modeling of cancer driver mutations. Front. Mol. Biosci. 6, 44 (2019).
Luo, P., Ding, Y., Lei, X. & Wu, F.-X. deepdriver: predicting cancer driver genes based on somatic mutations using deep convolutional neural networks. Front. Genet. 10, 428627 (2019).
Schulte-Sasse, R., Budach, S., Hnisz, D. & Marsico, A. Graph convolutional networks improve the prediction of cancer driver genes. In Artificial Neural Networks and Machine Learning–ICANN 2019: Workshop and Special Sessions: 28th International Conference on Artificial Neural Networks, Munich, Germany, September 17–19, 2019, Proceedings 28, 658–668 (Springer, 2019).
Leiserson, M. D. et al. Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet. 47, 106–114 (2015).
Pezzicoli, G. et al. Genomic profiling and molecular characterization of clear cell renal cell carcinoma. Curr. Oncol. 30, 9276–9290 (2023).
Cui, H. et al. Identification of the key genes and pathways involved in the tumorigenesis and prognosis of kidney renal clear cell carcinoma. Sci. Rep. 10, 4271 (2020).
Martínez-Jiménez, F. et al. A compendium of mutational cancer driver genes. Nat. Rev. Cancer 20, 555–572 (2020).
Gershman, S. J. & Daw, N. D. Reinforcement learning and episodic memory in humans and animals: an integrative framework. Annu. Rev. Psychol. 68, 101–128 (2017).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. Proceedings of the 5th International Conference on Learning Representations (ICLR) (2017).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Bellman, R. A markovian decision process. J. Math. Mech. 6, 679–684 (1957).
Weinstein, J. N. et al. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
Reyna, M. A., Leiserson, M. D. & Raphael, B. J. Hierarchical hotnet: identifying hierarchies of altered subnetworks. Bioinformatics 34, i972–i980 (2018).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Keshava Prasad, T. et al. Human protein reference database–2009 update. Nucleic acids Res. 37, D767–D772 (2009).
Szklarczyk, D. et al. String v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic acids Res. 47, D607–D613 (2019).
Khurana, E., Fu, Y., Chen, J. & Gerstein, M. Interpretation of genomic variants using a unified biological network approach. PLoS Computational Biol. 9, e1002886 (2013).
Huang, J. K. et al. Systematic evaluation of gene networks for discovery of disease genes. Cancer Res. 78, 1310–1310 (2018).
Kim, C. Y. et al. Humannet v3: an improved database of human gene networks for disease research. Nucleic acids Res. 50, D632–D639 (2022).
Dressler, L. et al. Comparative assessment of genes driving cancer and somatic evolution in non-cancer tissues: an update of the network of cancer genes (ncg) resource. Genome Biol. 23, 35 (2022).
Sondka, Z. et al. The cosmic cancer gene census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696–705 (2018).
Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385.e18 (2018).
Järvelin, K. & Kekäläinen, J. Cumulated gain-based evaluation of ir techniques. ACM Trans. Inf. Syst. (TOIS) 20, 422–446 (2002).
Mendiratta, G. et al. Cancer gene mutation frequencies for the us population. Nat. Commun. 12, 5961 (2021).
Mayakonda, A., Lin, D.-C., Assenov, Y., Plass, C. & Koeffler, H. P. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 28, 1747–1756 (2018).
Martens, M. et al. Wikipathways: connecting communities. Nucleic acids Res. 49, D613–D621 (2021).
Kolberg, L. et al. g:Profiler–interoperable web service for functional enrichment analysis and gene identifier mapping (2023 update). Nucleic acids Res. 51, W207–W212 (2023).
Köhler, S. et al. The human phenotype ontology in 2021. Nucleic acids Res. 49, D1207–D1217 (2021).
Jaccard, P. Nouvelles recherches sur la distribution florale. Bull. Soc. Vaud. Sci. Nat. 44, 223–270 (1908).
Ritchie, M. E. et al. limma powers differential expression analyses for rna-sequencing and microarray studies. Nucleic acids Res. 43, e47–e47 (2015).
Martinelli, E., Morgillo, F., Troiani, T. & Ciardiello, F. Cancer resistance to therapies against the egfr-ras-raf pathway: The role of mek. Cancer Treat. Rev. 53, 61–69 (2017).
Jacobsen, K. et al. Convergent akt activation drives acquired egfr inhibitor resistance in lung cancer. Nat. Commun. 8, 410 (2017).
Farooqi, A. A. et al. Multifunctional roles of betulinic acid in cancer chemoprevention: Spotlight on jak/stat, vegf, egf/egfr, trail/trail-r, akt/mtor and non-coding rnas in the inhibition of carcinogenesis and metastasis. Molecules 28, 67 (2022).
Matthews, H. K., Bertoli, C. & de Bruin, R. A. Cell cycle control in cancer. Nat. Rev. Mol. Cell Biol. 23, 74–88 (2022).
Frattini, M., Molinari, F. & Epistolio, S. The role of piccolo in cancer treatment: relationship with egfr and related therapies, and a marker for new targeted therapies. J. Thorac. Dis. 9, 4240 (2017).
Zhang, W. et al. Piccolo mediates egfr signaling and acts as a prognostic biomarker in esophageal squamous cell carcinoma. Oncogene 36, 3890–3902 (2017).
Bernhardt, M., Behrens, H.-M., Krüger, S. & Röcken, C. Exploration of the tumor biological significance of pclo in gastric cancer-results from a large central European cohort. Pathobiol: J. Immunopathol. Mol. Cell. Biol. 91, 187–195 (2023).
Eckel-Passow, J. E. et al. Anks1b is a smoking-related molecular alteration in clear cell renal cell carcinoma. BMC Urol. 14, 1–9 (2014).
Peña-Llopis, S. et al. Bap1 loss defines a new class of renal cell carcinoma. Nat. Genet. 44, 751–759 (2012).
Fu, H. et al. Low expression of mucin-4 predicts poor prognosis in patients with clear-cell renal cell carcinoma. Medicine 95, e3225 (2016).
Shinagare, A. B. et al. Radiogenomics of clear cell renal cell carcinoma: Preliminary findings of the cancer genome atlas–renal cell carcinoma (tcga–rcc) imaging research group. Abdom. imaging 40, 1684–1692 (2015).
Bi, K. et al. Tumor and immune reprogramming during immunotherapy in advanced renal cell carcinoma. Cancer cell 39, 649–661 (2021).
Saad, E. & Katz, A. Progression-free survival and time to progression as primary end points in advanced breast cancer: often used, sometimes loosely defined. Ann. Oncol. 20, 460–464 (2009).
Marcadis, A. R. et al. Characterizing relative and disease-specific survival in early-stage cancers. JAMA Intern. Med. 180, 461–463 (2020).
Tong, Y. et al. Vhl regulates the sensitivity of clear cell renal cell carcinoma to sirt4-mediated metabolic stress via hif-1α/ho-1 pathway. Cell Death Dis. 12, 621 (2021).
Hu, J. et al. Tumor heterogeneity in vhl drives metastasis in clear cell renal cell carcinoma. Signal Transduct. Target. Ther. 8, 155 (2023).
Liu, X.-D. et al. Pbrm1 loss defines a nonimmunogenic tumor phenotype associated with checkpoint inhibitor resistance in renal carcinoma. Nat. Commun. 11, 2135 (2020).
Aili, A., Wen, J., Xue, L. & Wang, J. Mutational analysis of pbrm1 and significance of pbrm1 mutation in anti-pd-1 immunotherapy of clear cell renal cell carcinoma. Front. Oncol. 11, 712765 (2021).
Xie, Y. et al. Setd2 loss perturbs the kidney cancer epigenetic landscape to promote metastasis and engenders actionable dependencies on histone chaperone complexes. Nat. cancer 3, 188–202 (2022).
Rao, H. et al. Setd2 deficiency accelerates sphingomyelin accumulation and promotes the development of renal cancer. Nat. Commun. 14, 7572 (2023).
Tan, G. et al. The critical role of bap1 mutation in the prognosis and treatment selection of kidney renal clear cell carcinoma. Transl. Androl. Urol. 9, 1725 (2020).
Feng, Z. et al. Identifying bap1 mutations in clear-cell renal cell carcinoma by ct radiomics: preliminary findings. Front. Oncol. 10, 279 (2020).
DiNatale, R. G., Sanchez, A., Hakimi, A. A. & Reznik, E. Metabolomics informs common patterns of molecular dysfunction across histologies of renal cell carcinoma. In Urologic Oncology: Seminars and Original Investigations, 38, 755–762 (Elsevier, 2020).
Roldan-Romero, J. M. et al. Pten expression and mutations in tsc1, tsc2 and mtor are associated with response to rapalogs in patients with renal cell carcinoma. Int. J. Cancer 146, 1435–1444 (2020).
Selvarajah, J., Nathawat, K., Moumen, A., Ashcroft, M. & Carroll, V. Chemotherapy-mediated p53-dependent dna damage response in clear cell renal cell carcinoma: role of the mtorc1/2 and hypoxia-inducible factor pathways. Cell death Dis. 4, e865 (2013).
Shu, X. et al. Germline genetic variants in somatically significantly mutated genes in tumors are associated with renal cell carcinoma risk and outcome. Carcinogenesis 39, 752–757 (2018).
Li, Y.-D. et al. Pan-cancer analysis identifies spen mutation as a predictive biomarker with the efficacy of immunotherapy. BMC cancer 23, 793 (2023).
Bi, M. et al. Genomic characterization of sarcomatoid transformation in clear cell renal cell carcinoma. Proc. Natl Acad. Sci. USA e920–e920 (2016).
Pei, J., Zhang, J. & Cong, Q. Computational analysis of protein–protein interactions of cancer drivers in renal cell carcinoma. FEBS Open Bio 14, 112–126 (2024).
Guerrero-Martínez, J. A. & Reyes, J. C. High expression of smarca4 or smarca2 is frequently associated with an opposite prognosis in cancer. Sci. Rep. 8, 2043 (2018).
Cui, H. et al. The swi/snf chromatin remodeling factor dpf3 regulates metastasis of ccrcc by modulating tgf-β signaling. Nat. Commun. 13, 4680 (2022).
Park, S. E. et al. mir-96-5p targets pten to mediate sunitinib resistance in clear cell renal cell carcinoma. Sci. Rep. 12, 3537 (2022).
Lee, H. J. et al. Prognostic significance of biallelic loss of pten in clear cell renal cell carcinoma. J. Urol. 192, 940–946 (2014).
Li, Q. et al. Pan-cancer analysis of the pde4dip gene with potential prognostic and immunotherapeutic values in multiple cancers including acute myeloid leukemia. Open Med. 18, 20230782 (2023).
Lin, P. et al. Radiomic profiling of clear cell renal cell carcinoma reveals subtypes with distinct prognoses and molecular pathways. Transl. Oncol. 14, 101078 (2021).
Gong, Z. et al. Integrative analysis of trpv family to prognosis and immune infiltration in renal clear cell carcinoma. Channels 16, 84–96 (2022).
Sun, Z. et al. Investigating the prognostic role of lncrnas associated with disulfidptosis-related genes in clear cell renal cell carcinoma. J. Gene Med. 26, e3608 (2024).
Wang, Z. et al. The role of erbb signaling pathway-related genes in kidney renal clear cell carcinoma and establishing a prognostic risk assessment model for patients. Front. Genet. 13, 862210 (2022).
He, Y. et al. Targeting pi3k/akt signal transduction for cancer therapy. Signal Transduct. Target. Ther. 6, 425 (2021).
Braicu, C. et al. A comprehensive review on mapk: a promising therapeutic target in cancer. Cancers 11, 1618 (2019).
Suchitha, G., Balaya, R. D. A., Raju, R., Keshava Prasad, T. & Dagamajalu, S. A network map of cytoskeleton-associated protein 4 (ckap4) mediated signaling pathway in cancer. J. Cell Commun. Signal. 17, 1097–1104 (2023).
Zhou, L. et al. Targeting met and axl overcomes resistance to sunitinib therapy in renal cell carcinoma. Oncogene 35, 2687–2697 (2016).
Ruan, H., Li, S., Bao, L. & Zhang, X. Enhanced yb1/epha2 axis signaling promotes acquired resistance to sunitinib and metastatic potential in renal cell carcinoma. Oncogene 39, 6113–6128 (2020).
Lee, S. J. et al. Von hippel-lindau tumor suppressor gene loss in renal cell carcinoma promotes oncogenic epidermal growth factor receptor signaling via akt-1 and mek-1. Eur. Urol. 54, 845–854 (2008).
Wen, Y.-C. et al. The oncogenic adamts1–vcan–egfr cyclic axis drives anoikis resistance and invasion in renal cell carcinoma. Cell. Mol. Biol. Lett. 29, 126 (2024).
Schubach, M., Maass, T., Nazaretyan, L., Röner, S. & Kircher, M. Cadd v1. 7: using protein language models, regulatory cnns and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic acids Res. 52, D1143–D1154 (2024).
Cao, M. et al. Going the distance for protein function prediction: a new distance metric for protein interaction networks. PloS one 8, e76339 (2013).
Ten Kathen, M. J., Reina, D. G. & Flores, I. J. A comparison of pso-based informative path planners for detecting pollution peaks of the ypacarai lake with autonomous surface vehicles. In International Conference on Optimization and Learning OLA, vol. 2022 (2022).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778 (2016).
Bergstra, J. & Bengio, Y. Random search for hyper-parameter optimization. J. Machine Learning Res. 13, 281–305 (2012).
Cheng, X. et al. csurvival: a web resource for biomarker interactions in cancer outcomes and in cell lines. Brief. Bioinforma. 23, bbac090 (2022).
Mantel, N. et al. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother. Rep. 50, 163–170 (1966).
Zheng, X. et al. Vinculin orchestrates prostate cancer progression by regulating tumor cell invasion, migration, and proliferation. Prostate 81, 347–356 (2021).
Acknowledgements
This study was supported by the National Natural Science Foundation of China (92370132 to J.H., 62072376 to J.P., 82070784 to J.A., and 62106172 to Y.Z.), the National Key R&D Program of China (2023YFC3403200 to J.A.), and the Xiaomi Young Talents Program of Xiaomi Foundation to Y.Z.
Author information
Contributions
Conceptualization, J.P., J.H. and J.A.; methodology, J.P., J.H., Y.Z., D.L., Z.L. and L.H.; experimentation, D.L., Y.Z., Z.L., L.H., X.Y. and J.A.; writing-original draft, D.L., X.Y, and J.A.; writing–review and editing, J.P., J.H., J.A., D.L., Y.Z., X.Y, X.Z., B.J. and S.J.; supervision, J.P., J.H. and J.A.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Lu, D., Zheng, Y., Yi, X. et al. Identifying potential risk genes for clear cell renal cell carcinoma with deep reinforcement learning. Nat Commun 16, 3591 (2025). https://doi.org/10.1038/s41467-025-58439-5