Identifying potential risk genes for clear cell renal cell carcinoma with deep reinforcement learning

Lu, Dazhi; Zheng, Yan; Yi, Xianyanling; Hao, Jianye; Zeng, Xi; Han, Lu; Li, Zhigang; Jiao, Shaoqing; Jiang, Bei; Ai, Jianzhong; Peng, Jiajie

doi:10.1038/s41467-025-58439-5

Published: 15 April 2025

Nature Communications volume 16, Article number: 3591 (2025)

Abstract

Clear cell renal cell carcinoma (ccRCC) is the most prevalent type of renal cell carcinoma. However, our understanding of ccRCC risk genes remains limited. This gap in knowledge poses challenges to the effective diagnosis and treatment of ccRCC. To address this problem, we propose a deep reinforcement learning-based computational approach named RL-GenRisk to identify ccRCC risk genes. Distinct from traditional supervised models, RL-GenRisk frames the identification of ccRCC risk genes as a Markov Decision Process, combining the graph convolutional network and Deep Q-Network for risk gene identification. Moreover, a well-designed data-driven reward is proposed for mitigating the limitation of scant known risk genes. The evaluation demonstrates that RL-GenRisk outperforms existing methods in ccRCC risk gene identification. Additionally, RL-GenRisk identifies eight potential ccRCC risk genes. We successfully validated epidermal growth factor receptor (EGFR) and piccolo presynaptic cytomatrix protein (PCLO), corroborated through independent datasets and biological experimentation. This approach may also be used for other diseases in the future.

Introduction

Renal cell carcinoma (RCC), one of the most common cancers worldwide, is a type of kidney cancer that initiates in the lining of the proximal convoluted tubule^1,2. Clear cell renal cell carcinoma (ccRCC) constitutes 80% of all RCC cases and is particularly aggressive due to its high immune infiltration^3,4,5. In addition, over 30% of ccRCC patients suffer from metastasis, which is a significant factor leading to death in ccRCC patients^6,7,8. Although several drugs have been utilized for the treatment of ccRCC, the efficacy is still limited due to the heterogeneity of ccRCC^9,10,11. Therefore, it is necessary to understand the pathogenesis and identify risk genes of ccRCC, which may be beneficial for early diagnosis and treatment of ccRCC^12,13,14.

Cancer is a complex genetic disorder. Its occurrence and progression are associated with the accumulation of driver genetic mutations that provide a selective growth advantage to cells^15,16. Consequently, one class of methods to identify cancer risk genes is based on mutation data. In recent years, several cancer sequencing projects have generated mutation data from thousands of cancer patients, enhancing the identification of cancer risk genes^15,17. Traditional statistical approaches focus on genes with a higher mutation frequency in the patient cohort than in the control cohort¹⁸. Youn et al.¹⁹ employed the functional impact of mutations on proteins, variations in background mutation rates among tumors, and the redundancy of the genetic code in tumor genome sequencing data to identify key genes in non-small cell lung cancer. Methods like MuSiC²⁰, OncodriveCLUST²¹, and MutSigCV²² identified cancer risk genes by comparing the observed gene mutations with the predefined background mutation frequency. By now, frequency-based methods have identified many cancer risk genes and enhanced cancer diagnosis and therapy²³. However, the genetic foundations of cancer are highly diverse. In addition to genes mutated across a large number of patients, some key genes in tumor initiation and progression are observed to be mutated in only a few patients²⁴. For example, PIK3CA, which has been validated to be a ccRCC risk gene by previous studies^25,26,27, is mutated in no more than 5% of ccRCC patients²⁷. It is difficult for purely frequency-based approaches to identify these genes with low mutation frequency but high risk.

To address the drawback of frequency-based methods, the interactions among proteins are introduced for cancer risk gene identification, since genes involved in the same signaling and regulatory pathways as well as protein complexes may interact to exert their effects together. Muffinn²⁸ identified cancer risk genes through network propagation, taking into account mutations not only in individual genes but also in their neighbors within the protein-protein interaction (PPI) network. DiSCaGe²⁹ calculated a gene mutation score using an asymmetric spreading strength based on the type of mutations and the PPI network, then produced a ranking of prioritized cancer risk genes. HotNet2³⁰ used an insulated heat diffusion process to identify cancer risk genes by propagating heat through the PPI network. nCOP³¹ employed a heuristic search method to select connected subnetworks from the PPI network based on the mutation data of cancer patients, and then ranked cancer risk genes based on the frequencies of genes appearing in these subnetworks. All of the aforementioned methods are unsupervised, and may therefore suffer from the highly diverse genetic foundations of cancer or the noise in the PPI network³². Recently, several supervised methods have emerged as potentially valuable tools for predicting cancer risk genes^23,33,34,35,36. For example, Agajanian et al.³⁴ trained a random forest classifier to identify cancer risk genes based on the known cancer-driver mutations. DeepDriver³⁵ used gene mutation types as features, constructed a K-nearest neighbor graph based on the Pearson correlation coefficient, and trained a convolutional neural network to identify cancer risk genes. Nevertheless, different from unsupervised methods^28,29,31,37, supervised methods require a substantial amount of known high-confidence risk genes as labeled data for model training²³. Unfortunately, the number of known high-confidence ccRCC risk genes is currently limited^38,39. 
For example, there are only 44 ccRCC risk genes in the IntOGen database⁴⁰. Owing to the reliance on labels, predicting ccRCC risk genes using supervised methods is challenging.

To overcome the limitations of existing methods, we propose a deep reinforcement learning-based approach for ccRCC risk gene identification, named RL-GenRisk (Reinforcement Learning-based GENe RISK). The reinforcement learning-based model leverages environmental interactions for optimization⁴¹, tackling the challenge of scant known risk genes. Specifically, RL-GenRisk models the PPI network as the environment and utilizes a graph convolutional network⁴² to learn state representations. It also incorporates the Deep Q-Network (DQN)⁴³ to combine reinforcement learning with deep neural networks for ccRCC risk gene identification. Moreover, a data-driven reward is designed to facilitate a straightforward method for identifying ccRCC risk genes. By focusing on a sampled subgraph with node features, the data-driven reward effectively leverages information from both the PPI network and gene mutation data. This not only ensures the accurate identification of genes with high mutation frequencies but also enables RL-GenRisk to identify potential risk genes with low mutation frequencies that functionally interact with genes having high mutation frequencies. Extensive experiments demonstrate that RL-GenRisk outperforms the existing methods in the identification of ccRCC risk genes. Furthermore, several potential risk genes are revealed and validated in independent datasets. Specifically, we validated two top-rank genes EGFR and PCLO through statistical and biological experiments. Statistical analyses show significant upregulation of EGFR at both bulk and single-cell levels among ccRCC patients, with a significant association between overexpression of the protein encoded by EGFR and poor survival in ccRCC patients. The in vitro experimental results show that decreased EGFR expression promotes ccRCC cell apoptosis as well as suppresses colony formation and migration, and the use of the EGFR inhibitor erlotinib effectively augments apoptosis and inhibits migration. 
Moreover, knocking down PCLO expression in vitro significantly inhibited ccRCC progression. Additionally, the in vivo experimental results show that both erlotinib and EGFR downregulation can significantly repress the growth of ccRCC tumors in mice.

Results

RL-GenRisk framework

We propose a deep reinforcement learning-based approach to identify ccRCC risk genes (Fig. 1), named RL-GenRisk. Fundamentally different from existing supervised deep learning-based methods, RL-GenRisk incorporates the reinforcement learning paradigm. Specifically, RL-GenRisk frames ccRCC risk gene identification as a sequential decision-making process, formulated as a Markov Decision Process⁴⁴. This enables RL-GenRisk to integrate reinforcement learning algorithms seamlessly, thereby effectively addressing the inherent challenge of scant known risk genes. RL-GenRisk takes a PPI network and gene mutation data as input (Fig. 1A). The PPI network is represented as an undirected graph, with nodes representing genes and edges representing interactions between genes. Gene mutation data includes details on the presence of mutations in ccRCC patients for each gene. RL-GenRisk treats the PPI network with gene mutation information as the environment. The state includes a sampled subgraph with node features. The action is selecting a node directly connected to the sampled subgraph and adding it to this sampled subgraph. Thus, the risk gene identification is framed as a Markov Decision Process of node selection within the PPI network.
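The state and action definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `candidate_actions` and `step`, the adjacency-map representation, and the toy edges (which are illustrative and not taken from any PPI database) are all assumptions made for this example. Node features and the reward are omitted.

```python
from typing import Dict, Set

# Undirected PPI network as an adjacency map (hypothetical representation).
Graph = Dict[str, Set[str]]


def candidate_actions(graph: Graph, subgraph: Set[str]) -> Set[str]:
    """Return the nodes that may be added to the current sampled subgraph.

    With an empty subgraph every node is a valid action; otherwise only
    nodes directly connected to the subgraph, and not already in it, are.
    """
    if not subgraph:
        return set(graph)
    frontier: Set[str] = set()
    for node in subgraph:
        frontier |= graph[node]
    return frontier - subgraph


def step(subgraph: Set[str], action: str) -> Set[str]:
    """Apply an action: add the chosen node, producing the next state."""
    return subgraph | {action}


# Toy network; the edges are made up for illustration only.
ppi: Graph = {
    "VHL": {"PBRM1", "EGFR"},
    "PBRM1": {"VHL", "SETD2"},
    "SETD2": {"PBRM1"},
    "EGFR": {"VHL", "PIK3CA"},
    "PIK3CA": {"EGFR"},
}

state = step(set(), "VHL")
print(sorted(candidate_actions(ppi, state)))
```

Restricting actions to the frontier of the subgraph keeps every sampled subgraph connected, which is what lets a low-frequency gene be reached through its highly mutated neighbors.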

Central to RL-GenRisk is a policy that is represented by a neural network and interacts with the environment. The policy takes the current state as input and predicts Q values, representing the probability distribution of all possible actions. The policy of RL-GenRisk consists of two main components: a Graph Convolutional Network (GCN) for learning state representation and a node evaluation network for computing action probability (Fig. 1B top). To enhance state representation, RL-GenRisk employs the GCN⁴², an inductive graph representation learning method, to capture node representations in the PPI network. Node initial features are derived from the PPI network’s topology information and ccRCC patients’ mutation information. The policy of RL-GenRisk is trained to select the optimal actions by maximizing the reward. In this study, we designed a data-driven reward that focused on the sampled subgraph, considering both information from the PPI network and gene mutation data. The DQN algorithm is employed to update the policy’s parameters. Throughout the training, we employed the ϵ-greedy strategy to choose actions based on Q values, thus boosting RL-GenRisk’s exploratory potential. In the identification phase (Fig. 1B bottom), RL-GenRisk starts with an empty subgraph, incorporates all nodes into the action space, and utilizes the trained policy to calculate Q values for each node. Subsequently, ccRCC risk genes are ranked by Q values, with higher values indicating greater risk. Further details about the RL-GenRisk can be found in the “Methods” section.
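As a rough illustration of how Q values drive both training-time exploration and the final ranking, the following Python sketch shows a generic ϵ-greedy selector and a ranking helper. It assumes the Q values have already been computed by some policy network. The names `epsilon_greedy` and `rank_by_q` and the numeric Q values are hypothetical and do not come from the paper.

```python
import random
from typing import Dict, List, Optional


def epsilon_greedy(q_values: Dict[str, float], epsilon: float,
                   rng: Optional[random.Random] = None) -> str:
    """Pick a random action with probability epsilon, else the best one."""
    rng = rng or random.Random()
    actions = sorted(q_values)  # fixed order for reproducibility
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    return max(actions, key=lambda a: q_values[a])  # exploit


def rank_by_q(q_values: Dict[str, float]) -> List[str]:
    """Rank genes from highest to lowest Q value."""
    return sorted(q_values, key=lambda g: q_values[g], reverse=True)


# Made-up Q values for three candidate nodes.
toy_q = {"VHL": 0.9, "EGFR": 0.6, "PCLO": 0.4}
print(epsilon_greedy(toy_q, epsilon=0.1, rng=random.Random(0)))
print(rank_by_q(toy_q))
```

During training a nonzero `epsilon` occasionally overrides the greedy choice. In the identification phase no exploration is needed, so the ranking alone is used.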

RL-GenRisk shows superior performance for ccRCC risk gene prioritization over existing methods

To assess the performance of RL-GenRisk, we utilized RL-GenRisk and eight other existing methods to identify ccRCC genes with gene mutation data of ccRCC patients from The Cancer Genome Atlas (TCGA)⁴⁵ and five different PPI networks. Six of these eight existing methods are specifically designed for cancer risk gene identification, including nCOP³¹, DiSCaGe²⁹, Hierarchical HotNet⁴⁶, HotNet2³⁰, Muffinn²⁸, and MutSigCV²². The other two are SVM⁴⁷ and Random Forest⁴⁸, which are widely used supervised machine learning methods. The implementation details of SVM and Random Forest are provided in the Supplementary Materials. The PPI networks used in our study included HPRD⁴⁹, STRING⁵⁰, Multinet⁵¹, IRefIndex⁵², and HumanNet⁵³. Known ccRCC risk genes were retrieved from the IntOGen⁴⁰ cancer-specific database, the Network of Cancer Genes (NCG) database⁵⁴, the Cancer Gene Census (CGC) database⁵⁵, and a cancer risk gene set extracted from the recent and extensive study conducted by Bailey et al.⁵⁶. In particular, the ccRCC risk genes included in the CGC and Bailey et al. datasets are limited, containing only nine and twelve genes respectively. Consequently, we evaluated different methods using three datasets: IntOGen, NCG, and a “Merged” dataset that combined data from IntOGen, NCG, Bailey et al. and CGC. The genes included in these datasets are provided in Supplementary Data 1. To assess the performance of the different methods, we used discounted cumulative gain (DCG) as one of the primary evaluation metrics, consistent with previous studies^29,57. Specifically, the DCG scores were calculated based on the top 100 genes identified by each method. Moreover, the DCG curves of different methods on the IntOGen, NCG, and the Merged ccRCC risk gene dataset are shown in Fig. 2 and Supplementary Fig. 2. 
Following this, we calculated the normalized DCG (N-DCG), the area under the DCG curve (DCG-AUC), and the average precision (AP) to further evaluate these methods’ performance (as shown in Fig. 2D–F, and Supplementary Fig. 3). Evaluation results indicated that RL-GenRisk outperformed the eight other established methods in identifying ccRCC risk genes, achieving the highest DCG, N-DCG, DCG-AUC, and AP on all used datasets (as shown in Supplementary Fig. 1). As anticipated, MutSigCV showed the lowest performance among the six existing methods specifically designed for cancer risk gene identification due to its reliance solely on mutation data, lacking integration with biological network insights. Muffinn performed well with STRING but showed inferior performance with other PPI networks, highlighting its instability across different PPI networks (Supplementary Fig. 3). While nCOP and Hierarchical HotNet performed well across different datasets and showed more stability across various PPI networks, their performance was inferior to RL-GenRisk (Supplementary Fig. 1 and Supplementary Fig. 3). Additionally, SVM showed the lowest performance among the compared methods. Random Forest performed better than SVM and MutSigCV. However, its performance was inferior to RL-GenRisk. RL-GenRisk achieved the best performance on HPRD (as shown in Supplementary Fig. 1). Therefore, the subsequent analyses were grounded on the high-confidence risk genes (HRGs) identified by the best-performing method, RL-GenRisk with the HPRD network.
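For readers unfamiliar with the ranking metrics, a small Python sketch of DCG and N-DCG over a top-k gene list follows. It assumes the common formulation with binary relevance (1 if a ranked gene is in the known risk gene set, 0 otherwise) and a log2 position discount. The exact variant used in the evaluation is not spelled out in this section, so the formula, the function names, and the toy genes are assumptions.

```python
import math
from typing import List, Set


def dcg_at_k(ranked: List[str], known: Set[str], k: int = 100) -> float:
    """DCG with binary relevance: sum of 1 / log2(rank + 1) over hits."""
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, gene in enumerate(ranked[:k], start=1)
        if gene in known
    )


def ndcg_at_k(ranked: List[str], known: Set[str], k: int = 100) -> float:
    """DCG divided by the DCG of an ideal ranking (all known genes first)."""
    ideal_hits = min(k, len(known))
    ideal = sum(1.0 / math.log2(r + 1) for r in range(1, ideal_hits + 1))
    return dcg_at_k(ranked, known, k) / ideal if ideal > 0 else 0.0


# Toy ranking with placeholder gene names; A and C are "known" risk genes.
ranking = ["A", "B", "C", "D"]
known_genes = {"A", "C"}
print(dcg_at_k(ranking, known_genes, k=4))
print(ndcg_at_k(ranking, known_genes, k=4))
```

Because the discount grows with rank, a method is rewarded more for placing known risk genes near the top of its list than for merely including them somewhere in the top 100.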

**Fig. 2: Performance comparison and perturbation analysis.**

Biological network facilitates the identification of low-frequency mutated ccRCC risk genes

Although gene mutation frequency is crucial for assessing cancer associations, not all cancer-related genes have high mutation frequencies⁵⁸. Therefore, relying solely on mutation data might overlook low mutation frequency, high-risk cancer genes. To evaluate the capability of RL-GenRisk in identifying ccRCC risk genes with low mutation frequency but high risk, we utilized the “maftools” package⁵⁹ to analyze the mutation frequencies of the top 20 HRGs (Supplementary Table 1) identified by RL-GenRisk in ccRCC patients from TCGA (Supplementary Fig. 5A). Generally, 317 (77.13%) of the 411 ccRCC patients from TCGA exhibited somatic mutations in the top 20 HRGs. Missense mutations were predominant, followed by frameshift mutations, among the identified somatic mutation types. Notably, the mutation frequencies of four genes, including VHL, PBRM1, SETD2, and BAP1, exceeded 10%. These genes are already recognized as ccRCC risk genes and included in the IntOGen database. Genes with high mutation frequencies are more likely to be detected by methods that solely rely on mutation data. However, it is important to note that not all cancer genes have high mutation frequencies⁵⁸. Among the top 20 HRGs, seven recognized ccRCC risk genes in the IntOGen database had mutation frequencies below 5%. For example, only 5 of 411 ccRCC patients carry mutations in PIK3CA, a known ccRCC risk gene in the IntOGen database, indicating variable mutation frequencies among cancer genes. To assess the contribution of the biological network in RL-GenRisk for identifying low mutation frequency, high-risk ccRCC genes, we progressively increased the percentage of edges randomly swapped between node pairs from 25% up to 50%, 75%, and 100%. Perturbing the PPI network disrupts its internal information, with more extensive perturbation causing greater information loss. 
As the perturbation ratio increased, the rankings of known ccRCC genes that initially had high ranks and mutation frequencies below 5% gradually dropped (as shown in Supplementary Fig. 5B). We then compared the mutation frequencies of the top 20 HRGs identified by RL-GenRisk before and after PPI network perturbation and created dot plots for visualization (Fig. 2G, H). We found that after network perturbation, three known ccRCC risk genes, TP53, PIK3CA, and SPEN, previously identified in the top 20 with mutation frequencies below 5%, dropped out of the top 20. This indicates that incorporating PPI network knowledge improves the detection of low mutation frequency ccRCC genes with significant risk implications.
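One plausible way to realize the edge-swapping perturbation is a degree-preserving double-edge swap, sketched below in Python. This is an assumption: the section does not state which swapping scheme was used, and the function names, the stopping rule, and the ring-shaped toy network are hypothetical. The count of rewired edges is approximate, since an edge swapped twice is counted twice.

```python
import random
from collections import Counter
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def perturb_edges(edges: Iterable[Edge], fraction: float, seed: int = 0) -> Set[Edge]:
    """Rewire roughly `fraction` of the edges by swapping endpoints of edge pairs.

    (a, b), (c, d) -> (a, d), (c, b); swaps creating self-loops or
    duplicate edges are rejected, so every node keeps its degree.
    """
    rng = random.Random(seed)
    edge_list = sorted({tuple(sorted(e)) for e in edges})
    edge_set = set(edge_list)
    target = int(fraction * len(edge_list))
    rewired, attempts = 0, 0
    max_attempts = 100 * max(1, len(edge_list))  # avoid looping forever
    while len(edge_list) >= 2 and rewired < target and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if a == d or c == b or new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        rewired += 2
    return edge_set


def degree_counts(edges: Iterable[Edge]) -> Counter:
    """Count how many edges touch each node."""
    return Counter(node for edge in edges for node in edge)


# Toy ring network of ten placeholder genes.
ring = {(f"g{i}", f"g{(i + 1) % 10}") for i in range(10)}
shuffled = perturb_edges(ring, fraction=0.5, seed=1)
print(len(shuffled), degree_counts(shuffled) == degree_counts(ring))
```

Preserving node degrees means any drop in ranking after perturbation reflects lost neighborhood information rather than a change in how connected a gene is.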

Biological function analysis of high-confidence risk genes

To delve into the biological function of HRGs, we conducted pathway enrichment analysis using the top 20 HRGs and the WikiPathways database⁶⁰, a comprehensive resource for pathway-based data analysis. Enrichment analysis was performed using g:Profiler⁶¹. Figure 3A illustrates the top 10 significantly enriched pathways (FDR p-value ≤ 1.21e-4). The results demonstrated significant enrichment in various cancer-related pathways, with “Clear cell renal cell carcinoma pathways” being the most prominently enriched (FDR p-value = 1.48e-8). The Human Phenotype Ontology (HPO) database⁶², which provides a standardized vocabulary of phenotypic abnormalities associated with human diseases, highlighted “Renal neoplasm” and “Renal cell carcinoma” as the most significantly enriched phenotypes (FDR p-value < 4.05e-5, Supplementary Fig. 6). Furthermore, Gene Ontology (GO) enrichment analysis indicated that the top 20 HRGs were notably enriched in various cancer-related biological processes, including “cell adhesion” (FDR p-value = 2.69e-4), “regulation of cell population proliferation” (FDR p-value = 3.64e-4), “cell population proliferation” (FDR p-value = 6.85e-4), and “cell migration” (FDR p-value = 2.39e-3). These observations showed a comprehensive molecular landscape linked to HRGs within the scope of ccRCC, emphasizing the critical roles of these genes in ccRCC-related mechanisms.
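The statistics behind this kind of over-representation analysis can be sketched in a few lines of Python. The example below is a generic illustration, not a reproduction of g:Profiler: it assumes a one-sided hypergeometric test per pathway followed by Benjamini-Hochberg adjustment, and the function names and all counts are made up.

```python
from math import comb
from typing import List


def hypergeom_sf(k: int, universe: int, pathway: int, query: int) -> float:
    """P(X >= k): chance of at least k pathway genes among `query` drawn genes."""
    upper = min(pathway, query)
    total = comb(universe, query)
    return sum(
        comb(pathway, i) * comb(universe - pathway, query - i)
        for i in range(k, upper + 1)
    ) / total


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up example: 2 of 2 query genes fall in a 5-gene pathway, universe of 10.
p = hypergeom_sf(2, universe=10, pathway=5, query=2)
print(p, benjamini_hochberg([0.01, 0.04, 0.03]))
```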

**Fig. 3: Independent datasets analysis.**

In addition, to further explore whether other top-performing methods identified similar pathways albeit with different genes, we performed an analysis based on the pathways enriched by the top 20 genes identified by top-performing methods, including RL-GenRisk, nCOP, Hierarchical HotNet, HotNet2, DiSCaGe, Muffinn, and Random Forest. We first calculated the Jaccard similarity coefficients⁶³ between the top 20 gene lists identified by these methods (pairwise comparisons among these 7 methods, totaling 21 pairs). The results showed that 76% (16 out of 21 pairs) of method pairs have a Jaccard similarity coefficient below 0.5 for their identified top 20 genes, indicating that the top 20 gene lists identified by these methods do not exhibit a high degree of overlap overall (Supplementary Fig. 22). After that, we analyzed the intersection of pathways enriched by the top 20 genes identified by top-performing methods. The result revealed that two pathways were significantly enriched (FDR p-value < 0.05) across these methods, including “Clear cell renal cell carcinoma pathway” and “Type 2 papillary renal cell carcinoma”. Notably, the top 20 genes identified by RL-GenRisk exhibited the most significant enrichment in the “Clear cell renal cell carcinoma pathway” among all these methods (as illustrated in Supplementary Fig. 7). The detailed results of enrichment analysis for different methods are provided in Supplementary Data 3.
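The pairwise overlap computation described above is simple to express in Python. The sketch below uses hypothetical method labels and placeholder gene lists; only the Jaccard definition (intersection size over union size) and the pair count (7 methods giving 21 pairs) follow the text.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 when both are empty)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def pairwise_jaccard(gene_lists: Dict[str, Iterable[str]]) -> Dict[Tuple[str, str], float]:
    """Compute the Jaccard similarity for every unordered pair of methods."""
    return {
        (m1, m2): jaccard(gene_lists[m1], gene_lists[m2])
        for m1, m2 in combinations(sorted(gene_lists), 2)
    }


# Placeholder top-gene lists for three hypothetical methods.
toy_lists = {
    "method_a": ["G1", "G2", "G3"],
    "method_b": ["G2", "G3", "G4"],
    "method_c": ["G5", "G6", "G7"],
}
for pair, score in pairwise_jaccard(toy_lists).items():
    print(pair, round(score, 2))
```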

Differential expression analysis revealed significant differential gene expression of EGFR at both bulk and single-cell levels

Among the top 20 HRGs identified by RL-GenRisk, 8 HRGs are not included in these ccRCC risk gene databases. We then performed differential expression analysis on these 8 HRGs using RNA-seq data from ccRCC patients in TCGA. Differential expression analysis was performed using Limma⁶⁴, and the significance threshold was FDR p-value < 0.05. Notably, EGFR and PCLO showed significant differential expression between tumor tissues and normal tissues in TCGA among these 8 HRGs (FDR p-value = 2.07e-38 for EGFR and FDR p-value = 1.04e-14 for PCLO, Fig. 3B and Supplementary Fig. 8). The differential expression results for these 8 HRGs are shown in Supplementary Fig. 8. EGFR was notably upregulated in ccRCC tumor tissues compared to normal tissues. EGFR is activated as a homodimer or heterodimer, thereby regulating multiple signaling pathways, including the RAS/RAF/MAPK, AKT, and JAK/STAT pathways, which play essential roles in cell migration, proliferation, and survival^65,66,67. These pathways are intricately involved in driving cell proliferation and conferring resistance to apoptosis⁶⁷. The overexpression of EGFR leads to an excess of receptors on the cell surface, fostering uncontrolled cell growth and division. This dysregulation can drive the transformation of normal cells into tumor cells, creating a favorable environment for sustained tumor cell survival⁶⁸. Additionally, the protein encoded by PCLO is a component of the presynaptic cytoskeletal matrix, which is involved in establishing active synaptic zones and in synaptic vesicle trafficking⁶⁹. Recent research has identified the expression level of PCLO as a prognostic biomarker for esophageal squamous cell carcinoma⁷⁰, and a notable high mutation frequency (47.9%) of PCLO has been observed in large Central European cohorts with gastric cancer⁷¹. 
In addition, to further explore the transcriptional change of these 8 HRGs across tumor and normal tissues, we conducted differential expression analysis using ccRCC patient data from the Gene Expression Omnibus (GEO). We used two mRNA expression profiles of ccRCC patients from GSE46699⁷² and GSE36895⁷³ for analysis. The GSE46699 contains data from 67 tumor tissues and 63 normal tissues. The GSE36895 contains data from 29 tumor tissues and 23 normal tissues. We found that among these 8 HRGs, 5 genes showed significantly differential expression in both GSE46699 and GSE36895 datasets (FDR p-value < 0.05, as shown in Supplementary Fig. 9). In particular, MUC4 shows high expression in tumor tissues in the GSE36895 dataset, but low expression in tumor tissues in the GSE46699 dataset. Existing studies reported that MUC4 expression is an independent prognostic factor for overall survival in ccRCC patients⁷⁴, and MUC4 mutation is associated with an exophytic growth pattern of ccRCC⁷⁵. Then, we compared the differential expression levels of the top 20 ccRCC risk genes identified by RL-GenRisk and other methods using the GSE46699 and GSE36895 datasets. The results indicated that the top 20 genes identified by RL-GenRisk included a higher number (14) of genes that were significantly differentially expressed in both GSE46699 and GSE36895 (FDR p-value < 0.05). Further details can be found in Supplementary Data 4.

To further investigate the expression patterns of the top 20 HRGs in tumor cells and normal cells of ccRCC patients, we performed an analysis utilizing single-cell RNA-seq data from ccRCC patients reported in a recent work⁷⁶ (Supplementary Table 2). Uniform manifold approximation and projection (UMAP) was used to visualize the distribution of 31,856 cells from kidneys in these patients in a two-dimensional plane (Fig. 3E). We used UMAP to visualize the expression of the top 20 HRGs in different cells (Supplementary Fig. 10) and plotted the distribution of expression of these genes in different cell types (Supplementary Fig. 11). Subsequently, we assessed the expression levels of EGFR across different cell types within these patients (Fig. 3F and Supplementary Fig. 11). Our investigation showed a significant difference in EGFR expression between tumor cells and other cells (FDR p-value ≤ 1.21e-4, Wilcoxon signed-rank test, Supplementary Fig. 12), with higher expression levels in tumor cells (Fig. 3G). Therefore, the significant upregulation of EGFR at both bulk and single-cell levels in ccRCC patients suggested its potential as a biomarker for ccRCC.

Expression level of protein encoded by the EGFR is significantly correlated with the prognosis of ccRCC patients

To determine whether EGFR affects the prognosis of ccRCC patients, we conducted survival analysis using clinical data and the expression of EGFR-encoded protein obtained from the TCGA dataset. We obtained clinical data and reverse-phase protein array (RPPA) data of ccRCC patients from TCGA. Progression-Free Survival (PFS)⁷⁷ and Disease-Specific Survival (DSS)⁷⁸ were utilized to assess the relationship between the expression levels of protein encoded by EGFR and the survival of ccRCC patients. The ccRCC patients from TCGA were categorized into two groups (top 25% and others) based on the expression levels of protein encoded by EGFR. We analyzed the ten-year survival rate following cancer diagnosis and plotted Kaplan-Meier survival curves to illustrate the impact of EGFR-encoded protein expression on the prognosis of patients (Fig. 3C, D). The results of the survival analysis revealed a significant association between the expression levels of protein encoded by EGFR and the survival time of ccRCC patients, with higher EGFR-encoded protein expression being correlated with poorer survival outcomes (KM log-rank p-value = 0.0018, PFS; KM log-rank p-value = 0.021, DSS). Our observations revealed that the overexpression of protein encoded by EGFR may play a critical role in cancer progression and could potentially serve as a prognostic biomarker for ccRCC patients.
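The grouping and the Kaplan-Meier estimate used in this kind of analysis can be sketched in plain Python. The sketch is illustrative only: the exact quartile cutoff rule is an assumption, the function names are hypothetical, the patient values are invented, and the log-rank test is not included.

```python
from typing import Dict, List, Set, Tuple


def split_top_quartile(expression: Dict[str, float]) -> Tuple[Set[str], Set[str]]:
    """Split patients into the top 25% by expression and everyone else."""
    ranked = sorted(expression, key=lambda pid: expression[pid], reverse=True)
    n_top = max(1, len(ranked) // 4)  # assumed cutoff rule
    return set(ranked[:n_top]), set(ranked[n_top:])


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed, leaving = 0, 0
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


# Invented follow-up data: the second patient is censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
high, rest = split_top_quartile({f"p{i}": float(i) for i in range(1, 9)})
print(sorted(high), len(rest))
```

In practice one curve is computed per group and the two curves are compared with a log-rank test, which is the p-value reported above.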

EGFR effectively promotes ccRCC progression in vitro and in vivo

To verify the effect of EGFR expression on ccRCC progression, stable cells harboring EGFR knockdown were obtained using ACHN (Fig. 4A, B) and 786-O cells (Fig. 4G, H). The protein and mRNA expression of EGFR was quantitated by western blotting and qPCR, respectively, and short hairpin RNA (shRNA)-2/3 showed promising knockdown efficacies for EGFR silencing (Fig. 4A, B, G, H). The downregulation of EGFR significantly inhibits cell viability (CCK8 assay, Fig. 4C-ACHN, Fig. 4I-786-O) and migration (transwell assay, Fig. 4E-ACHN, Fig. 4K-786-O). Furthermore, decreased EGFR expression markedly promotes ccRCC cell apoptosis (flow cytometry assay, Fig. 4D-ACHN, Fig. 4J-786-O) and represses cell colony formation in vitro (colony assay, Fig. 4F-ACHN, Fig. 4L-786-O). Also, EGFR overexpression was detected using qPCR (Fig. 4M) and western blotting (Fig. 4N), and it promotes the migration of 786-O cells significantly (Fig. 4O). Moreover, the EGFR inhibitor erlotinib was further used to inhibit its activity, and the data indicated that erlotinib can effectively inhibit ccRCC cell migration (Fig. 5A middle panel) and growth (Fig. 5A lower panel) and promote apoptosis (Fig. 5A upper panel) in vitro. In vivo, both erlotinib and EGFR downregulation can markedly repress tumor growth (Fig. 5B, C). Taken together, these results suggested that EGFR can significantly promote ccRCC cell progression as a risk factor.

**Fig. 4: EGFR dysregulation significantly affects ccRCC cell progression in vitro.**

**Fig. 5: EGFR inhibition represses ccRCC cell progression in vitro and in vivo.**

The PCLO knockdown significantly impaired the ccRCC progression

To investigate the effect of PCLO on the progression of ccRCC in vitro, we utilized four distinct shRNAs to knock down PCLO expression in 293T cells, identifying shPCLO-1 as the most effective (Fig. 6A, B). Subsequently, we obtained the PCLO knockdown 786-O cells using shPCLO-1 (Fig. 6C). The downregulation of PCLO significantly inhibited cell viability (Fig. 6D) and promoted apoptosis (Fig. 6E). Additionally, the reduction of PCLO expression significantly suppressed cell colony formation (Fig. 6F), migration (Fig. 6G), and invasion (Fig. 6H). To further investigate the effects of PCLO knockdown on the morphology and motility of ccRCC cells, a high-content imaging analysis system was employed. The results revealed that PCLO knockdown significantly reduced both the length and width of the cells at multiple time points (Fig. 6I), as well as their area and perimeter (Fig. 6J). Notably, a reduction in cell movement speed was observed in 786-O-shPCLO cells (Fig. 6K), leading to a shortened accumulated distance (Fig. 6L). In addition, we performed the same intervention on PCLO in another renal cancer cell line, ACHN, and conducted related assays to assess proliferation and metastatic potential. The results showed that stable knockdown of PCLO in ACHN cells (Supplementary Fig. 24A) significantly reduced their proliferation (Supplementary Fig. 24B), colony formation (Supplementary Fig. 24D), migration (Supplementary Fig. 24E), and invasion (Supplementary Fig. 24F) abilities, while increasing the apoptosis index (Supplementary Fig. 24C). High-content imaging analysis system also indicated that knockdown of PCLO in ACHN not only decreased cell perimeter but also impaired the cells’ motility (Supplementary Fig. 24G–I). Overall, the PCLO knockdown markedly inhibited the progression of ccRCC.

**Fig. 6: Loss of PCLO significantly impaired the ccRCC progression.**

Discussion

In this study, we developed RL-GenRisk, an approach utilizing deep reinforcement learning to enhance ccRCC risk gene identification by integrating network knowledge with gene mutation data. By considering the risk gene identification as a node selection problem, we model the ccRCC risk gene identification as a Markov Decision Process, reducing the dependency on labeled data. Furthermore, we designed a data-driven reward and employed the DQN algorithm to optimize RL-GenRisk. RL-GenRisk exhibits a substantial improvement in the task of ccRCC risk gene identification and reveals several potential risk genes.

RL-GenRisk successfully identified known ccRCC risk genes. Among the top 20 ccRCC risk genes identified by RL-GenRisk, 12 genes are listed in the known ccRCC risk gene datasets (IntOGen, NCG, and the Merged dataset) and recognized as ccRCC risk genes in prior research, including VHL^79,80, PBRM1^81,82, SETD2^83,84, BAP1^85,86, MTOR^87,88, ATM^89,90, SPEN⁹¹, TP53⁹², KMT2D (also known as MLL2)⁹³, SMARCA4^94,95, and PTEN^96,97. This demonstrates that RL-GenRisk can effectively identify ccRCC risk genes. Interestingly, 8 of the top 20 identified ccRCC risk genes are not listed in the known ccRCC risk gene datasets, including MUC4, DST, PABPC1, PCLO, PDE4DIP, USH2A, EGFR, and FLG. We visualized the top 20 genes identified by RL-GenRisk in the PPI network using the STRING-db. These genes were significantly interconnected within the PPI network (one-sided t-test, FDR p-value = 2.72e-07). To illustrate the interactions between known ccRCC risk genes and the potential risk genes, we used a bipartite graph (see Supplementary Fig. 16). Among the potential risk genes, EGFR exhibited the most interactions with known ccRCC risk genes, followed by MUC4 and PCLO. Notably, both EGFR and PCLO were validated in our biological experiments. In addition, some recent studies have found that PDE4DIP, FLG, and USH2A are associated with ccRCC. For example, methylation levels of PDE4DIP were found to be associated with reduced overall survival in ccRCC patients⁹⁸. FLG was found to be specifically mutated in specific subtypes of ccRCC⁹⁹. USH2A was found to have a significant co-mutation with well-known high-confidence ccRCC risk genes VHL and PBRM1 in ccRCC patients^100,101. Although these candidate genes have been shown to correlate with ccRCC, their molecular mechanisms in ccRCC need further investigation.

Furthermore, we validated two top ccRCC risk genes, EGFR and PCLO, identified by RL-GenRisk through independent datasets and biological experiments. The results indicated that EGFR exhibits significant upregulation at both bulk and single-cell levels in ccRCC patients. The overexpression of the protein encoded by EGFR is significantly associated with poor survival of ccRCC patients. To explore whether the few EGFR mutations detected in ccRCC patients are functionally relevant, we compared EGFR protein expression levels between ccRCC patients with and without EGFR mutations, as proteins are the primary functional molecules in cellular processes. We used the ccRCC patient data from TCGA, dividing the patients into groups with and without EGFR mutations to assess whether protein expression levels differ significantly between the two groups. The results showed that EGFR protein expression levels were higher in ccRCC patients carrying EGFR mutations (p-value = 0.017, Wilcoxon signed-rank test, Supplementary Fig. 17). Moreover, we visualized the interacting genes of EGFR using STRING-db (Supplementary Fig. 20), which shows that EGFR interacts with many genes. To further explore the effect of these genes, we conducted pathway enrichment analysis on them. The results showed that one ccRCC-related pathway and three cancer-related pathways are enriched: the ErbB signaling pathway (FDR p-value = 4.99e-2), the PI3K-Akt signaling pathway (FDR p-value = 3.41e-7), the MAPK signaling pathway (FDR p-value = 2.99e-8), and the CKAP4 signaling pathway map (FDR p-value = 3.54e-7). The ErbB signaling pathway has been found to play a critical role in the initiation and progression of ccRCC¹⁰². The PI3K-Akt signaling pathway plays a crucial role in various cellular processes and is aberrantly activated in cancers, contributing to the occurrence and progression of tumors¹⁰³. 
The MAPK signaling pathway is a complex, interconnected signaling cascade that is frequently involved in oncogenesis, tumor progression, and drug resistance¹⁰⁴. The CKAP4 signaling pathway map has been found to be involved in regulating the progression of various cancers¹⁰⁵. The enrichment analysis results suggested that the genes interacting with EGFR in the PPI network were significantly enriched in pathways related to the initiation and progression of ccRCC, as well as tumor initiation and progression. Furthermore, through comprehensive biological experiments, we validated the impact of EGFR and PCLO on ccRCC progression. The results demonstrated that decreased EGFR expression promotes ccRCC cell apoptosis and suppresses colony formation. To confirm the status of EGFR expression in ccRCC cells, we performed additional experiments. Using qPCR and western blotting analyses, we found that the mRNA and protein expression levels of EGFR in 786-O and ACHN cells are significantly higher than those in human embryonic kidney (293T) cells (as shown in Supplementary Fig. 23). Accordingly, these results showed that EGFR is overexpressed in both the 786-O and ACHN cell lines. Recent studies have indicated that the downregulation of certain receptor tyrosine kinase (RTK) family members can prevent the progression of RCC. For instance, silencing AXL and MET using shRNA may overcome the resistance to long-term sunitinib treatment in metastatic RCC¹⁰⁶. Additionally, inhibition of EphA2 can suppress tumor growth both in vitro and in vivo, and restore the sensitivity of sunitinib-resistant tumor cells to sunitinib¹⁰⁷. Therefore, we propose that other RTKs could produce the same effect and might even act synergistically with EGFR in ccRCC. However, their roles in ccRCC progression need to be further validated in the future. Additionally, the use of the EGFR inhibitor erlotinib effectively inhibits ccRCC cell migration and growth in mice. 
While previous studies have associated EGFR expression with RCC, functional studies directly examining EGFR’s role in RCC cell lines have been limited. Lee et al. utilized shRNA to knock down EGFR in 786-O cells, suggesting that EGFR knockdown inhibits the invasiveness of RCC cells in vitro and tumorigenicity in vivo¹⁰⁸. However, their conclusions were primarily based on cell proliferation and branching morphogenesis assays, as well as tumor xenograft experiments. Similarly, Wen et al. found that EGFR knockdown reversed ADAMTS1-induced prometastatic characteristics of RCC, but this was verified only by CCK8 and Matrigel invasion assays¹⁰⁹. Therefore, comprehensive experiments are essential to further elucidate the role of EGFR in ccRCC cell lines. In our study, we stably knocked down EGFR expression in both 786-O and ACHN cells using shRNA. We conducted a series of experiments, including CCK8, apoptosis assays, transwell assays, colony formation assays, and in vivo experiments, which comprehensively explored the function of EGFR in ccRCC. Our results demonstrate that knocking down EGFR significantly inhibits both the in vitro proliferation and metastasis of ccRCC cells, as well as tumor growth in vivo. These results not only demonstrate RL-GenRisk’s capability in identifying ccRCC risk genes but also provide further insights into the role of EGFR in ccRCC cell lines. Moreover, the biological experimental results for PCLO showed that knocking down PCLO expression in vitro significantly inhibited ccRCC progression. These biological experimental results highlight the potential therapeutic significance of our findings.

In addition, we explored incorporating mutation type information and pathogenicity scores into the feature vectors of RL-GenRisk. For mutation type, we utilized the TCGA data to calculate the proportion of each mutation type relative to the total number of mutations for each gene. For the pathogenicity score, we computed the CADD score¹¹⁰ for each mutation, and the average CADD score across all mutations in a gene was used as the pathogenicity score feature. By adding different types of features, we evaluate three variations of RL-GenRisk, including mutation type only, pathogenicity score only, and both. The results show that after adding mutation type information separately to the feature vector, RL-GenRisk showed improvement in AP on the NCG dataset but showed decreases in DCG, N-DCG, and DCG-AUC on the NCG dataset, as well as decreases in AP, DCG, N-DCG, and DCG-AUC on the IntOGen and the Merged dataset (as shown in Supplementary Fig. 13). After adding pathogenicity scores separately to the feature vector, RL-GenRisk showed improvements in AP, DCG, and N-DCG on the NCG and decreases in DCG-AUC on the NCG and the Merged dataset, as well as decreases in AP and DCG-AUC on the IntOGen dataset (as shown in Supplementary Fig. 13). Moreover, after adding both mutation type information and pathogenicity scores to the feature vector, RL-GenRisk showed a decrease in AP, DCG, N-DCG, and DCG-AUC across the IntOGen, NCG, and the Merged dataset (as shown in Supplementary Fig. 13). These results indicated that adding more clinically informative features might improve the performance of the method, but further exploration is needed.

Furthermore, we analyzed the overlap between the results of different methods by calculating the Jaccard similarity coefficient for the top 100 genes identified by each pair of methods (Supplementary Fig. 18). The results indicated that the top 100 genes identified by Hierarchical HotNet and HotNet2 have the highest degree of overlap (Jaccard similarity coefficient = 0.39), as they are similar methods. In general, there was a low overlap between the top 100 genes identified by different methods (Supplementary Fig. 18). Only one gene, VHL, is included in the top 100 genes of every method. VHL is a known risk gene for ccRCC and has the highest mutation frequency in TCGA ccRCC patients. In addition, seven genes were identified in the top 100 genes by at least six of the nine methods: VHL, PBRM1, SETD2, BAP1, MTOR, SPEN, and ATM. These seven genes are all well-established risk genes for ccRCC. These results suggested that while different methods for ccRCC risk gene identification perform differently, a few genes, such as VHL, consistently emerge across multiple methods. Then, we conducted an analysis to determine the minimum number of genes required to encompass all known ccRCC risk genes for each method. Since nCOP only outputs a ranked list of genes and does not allow users to adjust the list length by setting different thresholds, we are unable to assess the minimum number of genes required to encompass all known ccRCC risk genes for nCOP. For the other methods, the results show that every tested method requires more than 8000 genes to encompass all known ccRCC risk genes (Supplementary Table 7). This leads to a very low proportion of known risk genes in the identified gene list (<1%). Including all known risk genes introduces numerous false positives, which significantly decreases the precision of the results. 
A potential future direction could involve developing methods that can identify all (or most) known risk genes while maintaining high precision.
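The overlap analysis described above can be illustrated with a minimal sketch. The function names, method labels, and short toy gene lists below are hypothetical placeholders, not the published top-100 lists; the only element taken from the text is the Jaccard similarity coefficient, i.e., the size of the intersection divided by the size of the union.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(top_lists):
    """Map every unordered pair of method names to the Jaccard coefficient of their lists."""
    return {
        (m1, m2): jaccard(top_lists[m1], top_lists[m2])
        for m1, m2 in combinations(sorted(top_lists), 2)
    }


# Toy lists (hypothetical; real inputs would be the top 100 genes of each method).
toy = {
    "method_A": ["VHL", "PBRM1", "SETD2", "BAP1"],
    "method_B": ["VHL", "PBRM1", "MTOR", "ATM"],
    "method_C": ["VHL", "EGFR", "PCLO", "FLG"],
}
scores = pairwise_jaccard(toy)

# Genes present in every list (in this toy example, only VHL).
shared_by_all = set.intersection(*(set(genes) for genes in toy.values()))
```

Counting how many lists contain each gene (for example with `collections.Counter`) would give the "at least six of nine methods" style of summary in the same way.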

Since the interpretability of most reinforcement learning systems remains a significant challenge, to enhance understanding, we have incorporated several cues in the output to explain why certain genes are more or less likely to be associated with ccRCC. Our hypothesis is that genes with high ranks are likely to be in close proximity to known ccRCC-related genes. We have provided two types of supplementary information to measure the closeness between the predicted genes and known high-risk ccRCC genes. First, other than gene rank, we also output the average shortest path length between each gene and all known risk genes in the PPI network, as well as the FDR p-values obtained by the one-sided t-test that measures whether the average shortest path length between each gene and the known risk genes was significantly shorter. Second, we output the average cosine similarity between each gene’s feature embeddings and the feature embeddings of known risk genes, as well as the FDR p-values obtained by the one-sided t-test that measures whether the average cosine similarity of feature embeddings between each gene and the known risk genes was significantly higher. Supplementary Table 6 shows an example of the output of the top genes identified by RL-GenRisk. To further strengthen these analyses and facilitate intuitive interpretation, RL-GenRisk can output the comparison of the top K genes with K randomly selected genes (randomly sampled 100 times) in terms of their average shortest path lengths to known risk genes in the PPI network, as well as their average cosine similarities of feature embeddings with known risk genes. Users can evaluate the reasonableness of the top K predicted genes. For illustration purposes, we set K = 20 in this example. 
The results show that the average shortest path lengths between the top 20 genes identified by RL-GenRisk and known risk genes within the PPI network are significantly shorter (FDR p-value = 2.73e-4, one-sided t-test, Supplementary Fig. 21A), and the average cosine similarities of the feature representations between the top 20 genes and known risk genes are significantly higher (FDR p-value = 4.61e-8, one-sided t-test, Supplementary Fig. 21B). These results indicate that the top 20 genes identified by RL-GenRisk are closely connected to known risk genes in both the network structure and feature space.
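As a rough illustration of the two proximity cues described above, the following sketch computes the average shortest path length from a gene to a set of known risk genes in an unweighted graph, and the average cosine similarity between embeddings. It is not the RL-GenRisk implementation: the function names, the adjacency-dictionary format, the exclusion of the gene itself from the known set, and the handling of unreachable genes are all assumptions, and the significance testing (one-sided t-test with FDR correction) is omitted.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mean_shortest_path(adj, gene, known):
    """Average shortest path length from `gene` to the known risk genes.

    Assumption: the gene itself and unreachable known genes are skipped.
    """
    dist = bfs_distances(adj, gene)
    lengths = [dist[k] for k in known if k != gene and k in dist]
    return sum(lengths) / len(lengths) if lengths else math.inf


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_cosine(emb, gene, known):
    """Average cosine similarity between the embedding of `gene` and those of known genes."""
    sims = [cosine(emb[gene], emb[k]) for k in known if k != gene]
    return sum(sims) / len(sims) if sims else 0.0


# Toy path graph A - B - C - D with hypothetical 2-dimensional embeddings.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
emb = {"A": (1.0, 0.0), "B": (1.0, 0.0), "C": (0.5, 0.5), "D": (0.0, 1.0)}
known = {"A", "D"}
avg_path_b = mean_shortest_path(adj, "B", known)
avg_cos_b = mean_cosine(emb, "B", known)
```

The comparison against K randomly selected genes could reuse these two functions on each random sample and compare the resulting distributions.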

To illustrate whether the top 20 genes identified by each method are included in the known ccRCC risk gene datasets, we provided heatmaps in Supplementary Fig. 4. The results indicated that RL-GenRisk identified the highest number of known risk genes (12 genes) among the top 20 genes, followed by nCOP (11 genes), highlighting RL-GenRisk’s capability in identifying ccRCC risk genes. Additionally, we provided the runtime of various methods in Supplementary Table 5. The results showed that HotNet2 and Hierarchical HotNet had the longest runtimes among the methods tested. In comparison, RL-GenRisk was completed in approximately 8 hours. Although some methods had shorter runtimes than RL-GenRisk, their overall performance was inferior to that of RL-GenRisk.

RL-GenRisk can also be applied to identify risk genes for other types of cancer. We evaluated RL-GenRisk and other methods using additional tumor datasets, specifically focusing on Bladder Urothelial Carcinoma (BLCA) and Glioblastoma Multiforme (GBM) data from the TCGA dataset. The known risk genes for BLCA and GBM were extracted from the same source as ccRCC in our study. The results indicated that RL-GenRisk outperformed other methods on both BLCA and GBM (as shown in Supplementary Figs. 14 and 15), highlighting the potential of RL-GenRisk for application in other cancer types.

Methods

Our research complies with all relevant ethical regulations and guidelines. Animal handling and experimental procedures were approved by the Ethical Review Committees of West China Hospital, Sichuan University. The animal studies were authorized by the Animal Ethics Review Committees of the West China Hospital, China (No. 20230214007). All animal experiments were strictly implemented in compliance with the NIH Guide for the Care and Use of Laboratory Animals.

Data preparation

Consistent with previous studies^29,31, we used the information from the PPI network together with gene mutation data from patients. The feature matrix H for nodes is constructed based on the topological information of genes in the PPI network, the mutation frequency of genes in ccRCC patients, and the genes’ length.

PPI networks

We collected protein-protein interactions from HPRD⁴⁹, STRING-db⁵⁰, Multinet⁵¹, IRefIndex⁵², and HumanNet⁵³. Then, following the previous study³¹, we applied a two-step preprocessing procedure to these PPI networks. First, we excluded the nine longest genes (TTN, MUC16, SYNE1, NEB, MUC19, CCDC168, FSIP2, OBSCN, GPR98), as they tend to acquire numerous mutations by chance and cover many patients³¹. Second, to mitigate the noise due to the dense connectivity in PPI networks, we applied the diffusion state distance (DSD)¹¹¹ metric to these PPI networks.

Gene mutation data

We collected somatic mutations of genes in ccRCC patients from The Cancer Genome Atlas (TCGA). Each gene corresponds to a list of the patients who carry mutations in that gene.

Feature representation

The initial feature of gene v is represented as ${{{{\bf{h}}}}}_{v}\in {{\mathbb{R}}}^{1\times k}$, where k is the dimension. In RL-GenRisk, the node feature is flexible and can have different dimensions. We set k = 3 in RL-GenRisk. Therefore, the initial feature of gene v can be represented as ${{{{\bf{h}}}}}_{v}=[{h}_{v}^{(0)},{h}_{v}^{(1)},{h}_{v}^{(2)}]$, which respectively measures the topological importance of v in the PPI network, the mutation frequency in patients, and the length information of the gene. We first use the degree of v as the first dimension of h_v since it can represent the topological importance of v in the PPI network:

$${h}_{v}^{(0)}=\frac{{{{\rm{Degree}}}}(v)}{{\mu }_{degree}},$$

(1)

where Degree(v) represents the node degree of gene v in the PPI network. μ_degree is a hyperparameter for normalization. We set μ_degree equal to the maximum degree of nodes in the PPI network. Then, the second dimension of h_v considers the mutation frequency of gene v in ccRCC patients:

$${h}_{v}^{(1)}=\frac{{N}_{p}^{v}}{{N}_{p}},$$

(2)

where ${N}_{p}^{v}$ represents the number of patients carrying mutations on v and N_p represents the total number of patients. Moreover, the initial feature should consider the length information of the gene, since longer genes tend to accumulate more mutations by chance. To dilute this effect, the third dimension of h_v is designed as follows:

$${h}_{v}^{(2)}=\frac{{{{\rm{Length}}}}(v)}{\alpha \,{N}_{p}^{v}},$$

(3)

where Length(v) represents the gene length of v, and α is a normalization parameter. We set α equal to the number of nodes in a connected subgraph of the PPI network chosen such that every patient carries a mutation in at least one gene of this subgraph. Finally, the initial feature of gene v was defined as ${{{{\bf{h}}}}}_{v}=[{h}_{v}^{(0)},{h}_{v}^{(1)},{h}_{v}^{(2)}]$, and the initial feature matrix for all genes was represented as ${{{\bf{H}}}}\in {{\mathbb{R}}}^{n\times k}$, where n is the number of genes.
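Equations (1) to (3) can be sketched as follows. This is a minimal illustration rather than the released code: the function name, the dictionary-based inputs, and the toy values are hypothetical, α is passed in as a given number, and setting the third feature to zero for genes without any mutated patient (where Eq. (3) is undefined) is an assumption made only to keep the example runnable.

```python
import numpy as np


def initial_features(degree, patients_by_gene, length, n_patients, alpha):
    """Build the n x 3 feature matrix H from Eqs. (1)-(3).

    degree:           gene -> node degree in the PPI network
    patients_by_gene: gene -> set of patients carrying a mutation on that gene
    length:           gene -> gene length
    """
    genes = sorted(degree)
    mu_degree = max(degree.values())  # normalizer: maximum degree in the network
    rows = []
    for g in genes:
        n_v = len(patients_by_gene.get(g, ()))
        h0 = degree[g] / mu_degree                # Eq. (1): topological importance
        h1 = n_v / n_patients                     # Eq. (2): mutation frequency
        # Eq. (3): length relative to mutated patients; 0.0 for unmutated
        # genes is an assumption, since the equation is undefined there.
        h2 = length[g] / (alpha * n_v) if n_v > 0 else 0.0
        rows.append([h0, h1, h2])
    return genes, np.array(rows, dtype=float)


# Toy inputs (hypothetical values).
genes, H = initial_features(
    degree={"A": 4, "B": 2},
    patients_by_gene={"A": {"p1", "p2"}, "B": {"p1"}},
    length={"A": 1000, "B": 500},
    n_patients=4,
    alpha=10,
)
```

Each row of `H` corresponds to one gene in the order given by `genes`.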

Key elements in RL-GenRisk

In RL-GenRisk, the identification of ccRCC risk genes is framed as a Markov Decision Process. At each step, the policy of RL-GenRisk receives the current state as input and selects an action. The action is selecting a node that connects with the sampled subgraph and appending this node to the sampled subgraph. Thus, this sampled subgraph adds a new node at each step. A reward is obtained after taking an action. Therefore, there are three key elements in RL-GenRisk, including state, action, and reward. These three key elements are defined as follows:

State

The state s at step t is represented as s_t, which consists of the feature matrix H and the current subgraph G_t sampled from the PPI network. In the first step, RL-GenRisk creates an empty subgraph and randomly adds a node to it. The design of incorporating the sampled subgraph into the state allows RL-GenRisk to delve into localized network structures. This enables RL-GenRisk to focus on the interaction relationships relevant to the current sampled subgraph at each step. Therefore, the subgraph information in the state enables RL-GenRisk to accurately estimate its current environment and adjust its actions accordingly.

Action

At step t, the action a_t is selecting a node that connects with the sampled subgraph G_t and appending this node to G_t. The action space at step t is represented as A_t, which contains nodes that connect with G_t. In detail, after getting Q values for all possible actions, RL-GenRisk uses an ϵ-greedy strategy to select an action a_t:

$${a}_{t}=\left\{\begin{array}{ll}{{{\rm{random}}}}\,a\in {{{{\bf{A}}}}}_{t} \hfill &{{{\rm{if}}}}\,p \, < \, \epsilon \\ \arg {\max }_{a\in {{{{\bf{A}}}}}_{t}}Q({{{{\bf{s}}}}}_{t};a;\theta )&{{{\rm{otherwise}}}},\end{array}\right.$$

(4)

where Q(s_t; a; θ) represents the Q value of action a calculated by the policy based on the current state s_t, and θ stands for the parameters of the policy. A_t represents the action space at step t. With probability ϵ (that is, when a randomly drawn p is less than ϵ), we select an action from the action space at random. Otherwise, the action with the highest Q value is selected. The ϵ-greedy strategy enhances the exploratory nature of the policy, effectively preventing the model from getting stuck in local optima during training. Following the previous study¹¹², we set ϵ to 0.95 at the start of training and, to balance exploration and exploitation, decrease it gradually during the training process.
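The action selection in Eq. (4) might be sketched as below. The function names and the adjacency-dictionary format are hypothetical, and the multiplicative decay schedule (`rate`, `floor`) is purely an assumption, since the text states only that ϵ starts at 0.95 and decreases gradually.

```python
import random


def action_space(adj, subgraph):
    """Candidate nodes: outside the sampled subgraph but adjacent to a node in it."""
    inside = set(subgraph)
    return sorted({w for u in subgraph for w in adj[u]} - inside)


def epsilon_greedy(q_values, candidates, epsilon, rng=random):
    """Eq. (4): random candidate if p < epsilon, otherwise the highest-Q candidate."""
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda a: q_values[a])


def decay(epsilon, rate=0.995, floor=0.05):
    """Hypothetical schedule: shrink epsilon each step, never below `floor`."""
    return max(floor, epsilon * rate)


# Toy graph and Q values (hypothetical).
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
candidates = action_space(adj, ["A"])
greedy_choice = epsilon_greedy({"B": 0.2, "C": 0.7}, candidates, epsilon=0.0)
```

With `epsilon=0.0` the choice is purely greedy, which corresponds to ranking by Q value once exploration has been switched off.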

Reward

In reinforcement learning algorithms, the design of the reward is crucial to the performance of the algorithm. In this study, we designed a specific data-driven reward based on the current sampled subgraph. The sampled subgraph G_t at step t is expected to cover more patients so that risk genes are more likely to appear in it. Instead of focusing on an individual gene, RL-GenRisk focuses on a sampled subgraph G_t with node feature matrix H. This makes RL-GenRisk effectively leverage information from both the PPI network and gene mutation data. More importantly, this design of the reward ensures the accurate identification of genes with high mutation frequencies, and also identifies potential risk genes with low mutation frequencies but functionally interacting with genes that have high mutation frequencies. However, only considering the number of patients covered by the sampled subgraph G_t is not enough. Longer genes are more likely to mutate by chance in patients. Therefore, shorter genes with high mutation frequencies are more likely to be risk genes³¹. To take gene length into account, we integrate gene length information when designing the reward. Thus, the single-step reward is higher when the sampled subgraph covers more patients and the genes in the subgraph have shorter lengths. The single-step reward r_t at step t is designed as:

$$\begin{aligned}{r}_{t}&={R}_{t}-{R}_{t+1},\\ {R}_{t}&=\delta \left(1-\frac{{N}_{p}^{{G}_{t}}}{{N}_{p}}\right)+(1-\delta ){\sum}_{v\in {G}_{t}}\frac{{{{\rm{Length}}}}(v)}{{N}_{p}^{v}},\end{aligned}$$

(5)

where r_t represents the single-step reward at step t. δ is a weight hyperparameter. R_t provides an evaluation score of the current state, with ${N}_{p}^{{G}_{t}}$ representing the number of patients that carry at least one mutation on the genes in the sampled subgraph G_t, ${N}_{p}^{v}$ represents the number of patients carrying mutations on gene v, and N_p represents the total number of patients. Length(v) represents the length of gene v. The cumulative reward is defined as the sum of all single-step rewards.
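To make Eq. (5) concrete, the sketch below evaluates R_t for a subgraph and the single-step reward obtained by adding one node. The function names and toy values are hypothetical, gene lengths are given in arbitrary pre-scaled units, and skipping genes with no mutated patient in the length term is an assumption (the ratio is undefined for them).

```python
def state_score(subgraph, patients_by_gene, length, n_patients, delta):
    """R_t from Eq. (5): lower when more patients are covered and genes are shorter."""
    covered = set().union(*(patients_by_gene[v] for v in subgraph))
    coverage_term = 1.0 - len(covered) / n_patients
    # Assumption: genes without any mutated patient are skipped.
    length_term = sum(
        length[v] / len(patients_by_gene[v])
        for v in subgraph
        if patients_by_gene[v]
    )
    return delta * coverage_term + (1.0 - delta) * length_term


def step_reward(subgraph, new_node, patients_by_gene, length, n_patients, delta):
    """r_t = R_t - R_{t+1}, where G_{t+1} is G_t with `new_node` appended."""
    r_now = state_score(subgraph, patients_by_gene, length, n_patients, delta)
    r_next = state_score(subgraph + [new_node], patients_by_gene, length,
                         n_patients, delta)
    return r_now - r_next


# Toy data (hypothetical): four patients, two genes.
patients = {"A": {1, 2}, "B": {3}}
lengths = {"A": 0.2, "B": 0.1}
score_a = state_score(["A"], patients, lengths, n_patients=4, delta=0.5)
reward_add_b = step_reward(["A"], "B", patients, lengths, n_patients=4, delta=0.5)
```

In this toy case, adding B covers one more patient at a small length cost, so R decreases from 0.3 to 0.225 and the single-step reward is positive.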

Policy network

In RL-GenRisk, the policy takes the current state as input and outputs Q values for all possible actions. The policy of RL-GenRisk is represented by a neural network, which is usually referred to as the policy network. Specifically, the policy of RL-GenRisk comprises two main components: a graph convolutional network (GCN) and a node evaluation network. We also use three multi-layer perceptrons (MLPs) to perform dimensionality transformation. The GCN aggregates neighborhood information to obtain the representation of each node by multiplying the graph Laplacian⁴² with the node feature matrix. Given the feature matrix ${{{\bf{H}}}}\in {{\mathbb{R}}}^{n\times k}$, where n is the number of nodes and k is the feature dimension, the hidden representation of the l-th graph convolutional layer is calculated as:

$${{{{\bf{H}}}}}^{l}=\sigma ({{{\bf{L}}}}{{{{\bf{H}}}}}^{l-1}{{{{\bf{W}}}}}^{l}),$$

(6)

where H^l represents the hidden representation of the l-th graph convolutional layer. ${{{\bf{L}}}}={\hat{{{{\bf{D}}}}}}^{-\frac{1}{2}}\hat{{{{\bf{A}}}}}{\hat{{{{\bf{D}}}}}}^{-\frac{1}{2}}$ represents the normalized graph Laplacian, which is used to aggregate neighborhood information. GCN preserves the original node signal by adding self-connections: $\hat{{{{\bf{A}}}}}=\tilde{{{{\bf{A}}}}}+{{{\bf{I}}}}$, where $\tilde{{{{\bf{A}}}}}$ represents the adjacency matrix and I represents the identity matrix. $\hat{{{{\bf{D}}}}}$ represents the degree matrix for $\hat{{{{\bf{A}}}}}$, with ${\hat{{{{\bf{D}}}}}}_{ii}={\sum }_{j}{\hat{{{{\bf{A}}}}}}_{ij}$, and W^l represents the trainable matrix of the l-th graph convolutional layer. σ represents the non-linear activation function.
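A single graph convolutional layer as in Eq. (6) can be sketched with NumPy. This is an illustrative example, not the trained model: the function names, the toy graph, the random weights, and the choice of ReLU for σ are assumptions, since the text does not specify the activation here.

```python
import numpy as np


def normalized_adjacency(A):
    """L = D_hat^{-1/2} (A + I) D_hat^{-1/2}, with D_hat the degree matrix of A + I."""
    A_hat = A + np.eye(A.shape[0])          # add self-connections
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(L, H_prev, W, activation=lambda x: np.maximum(x, 0.0)):
    """Eq. (6): H^l = sigma(L H^{l-1} W^l); ReLU is an assumed choice of sigma."""
    return activation(L @ H_prev @ W)


# Toy path graph 0 - 1 - 2 with 2-dimensional features and random weights.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = normalized_adjacency(A)
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 2))
W = rng.normal(size=(2, 4))
H1 = gcn_layer(L, H, W)
```

Stacking a second call, `gcn_layer(L, H1, W_second)` with a weight matrix of matching shape, would give the two-layer arrangement described in the text.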

In RL-GenRisk, we use two graph convolutional layers. Before the first graph convolutional layer receives data, we use an MLP to perform dimensionality transformation on the initial features:

$${{{{\bf{H}}}}}^{0}={{{{\bf{W}}}}}_{1}{{{\bf{H}}}},$$

(7)

where H represents the initial feature matrix, W₁ represents a trainable projection matrix. H⁰ represents the input feature matrix of the first graph convolutional layer.

Inspired by residual networks¹¹³, we concatenate the hidden representation matrices of two graph convolutional layers with H⁰, and use another MLP for dimension transformation to obtain the final representations matrix $\hat{{{{\bf{H}}}}}$:

$$\hat{{{{\bf{H}}}}}={{{{\bf{W}}}}}_{2}(\,{\mbox{Concat}}\,({{{{\bf{H}}}}}^{0},{{{{\bf{H}}}}}^{1},{{{{\bf{H}}}}}^{2})),$$

(8)

where H¹, H² represent the hidden representation matrices of the two graph convolutional layers, W₂ represents a trainable projection matrix, and Concat( ⋅ ) represents the concatenation operator.

At step t, after getting the representations matrix by GCN, RL-GenRisk calculates the representation of the sampled subgraph through the averaging pooling operation and uses the MLP for dimension transformation:

$${\hat{{{{\bf{H}}}}}}_{{G}_{t}}={{{{\bf{W}}}}}_{3}\left(\frac{{\sum }_{v\in {G}_{t}}{\hat{{{{\bf{H}}}}}}_{v}}{{{{\rm{Size}}}}({G}_{t})}\right),$$

(9)

where ${\hat{{{{\bf{H}}}}}}_{{G}_{t}}$ represents the representation of sampled subgraph G_t, ${\hat{{{{\bf{H}}}}}}_{v}$ represents the representation of gene v, Size( ⋅ ) represents the number of nodes in the sampled subgraph, and W₃ represents a trainable projection matrix.

Then, we use a two-layer MLP as the node evaluation network to calculate the Q values for each action in action space:

$$Q({{\bf{s}}}_{t};a;\theta )={{{{\bf{W}}}}}_{5}(\sigma ({{{{\bf{W}}}}}_{4}({{{\rm{Concat}}}}({\hat{{{{\bf{H}}}}}}_{{G}_{t}},{\hat{{{{\bf{H}}}}}}_{v})))),$$

(10)

where Q(s_t; a; θ) represents the Q value for action a based on the current state s_t, and θ stands for the parameters of the policy. ${\hat{{{{\bf{H}}}}}}_{{G}_{t}}$ represents the representation of the sampled subgraph G_t, v represents the gene selected by action a, and ${\hat{{{{\bf{H}}}}}}_{v}$ represents the representation of gene v. W₄ and W₅ represent two trainable projection matrices. σ represents the non-linear activation function. By using a two-layer MLP whose parameters are updated continuously during training as the node evaluation network, RL-GenRisk can better predict Q values for actions. More details about the policy network can be found in Supplementary Table 3.
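Equations (9) and (10) can be illustrated with the sketch below, which pools the node representations of the sampled subgraph and scores each candidate node. The dimensions, random weights, and function names are hypothetical; bias terms are omitted, ReLU is assumed for σ, and node representations are stored as rows, so the projection matrices are applied on the right rather than on the left as in the notation of the text.

```python
import numpy as np


def subgraph_representation(H_hat, subgraph_idx, W3):
    """Eq. (9): average-pool the node representations of G_t, then project with W3."""
    return H_hat[subgraph_idx].mean(axis=0) @ W3


def q_values(H_hat, subgraph_idx, candidate_idx, W3, W4, W5):
    """Eq. (10): two-layer MLP on Concat(subgraph representation, candidate node)."""
    h_g = subgraph_representation(H_hat, subgraph_idx, W3)
    # One row per candidate: [subgraph representation | candidate representation].
    pairs = np.hstack([np.tile(h_g, (len(candidate_idx), 1)), H_hat[candidate_idx]])
    hidden = np.maximum(pairs @ W4, 0.0)    # assumed ReLU activation
    return (hidden @ W5).ravel()


# Toy setup (hypothetical sizes): 5 nodes with 8-dimensional final representations.
rng = np.random.default_rng(0)
n, d = 5, 8
H_hat = rng.normal(size=(n, d))
W3 = rng.normal(size=(d, d))
W4 = rng.normal(size=(2 * d, d))
W5 = rng.normal(size=(d, 1))

q = q_values(H_hat, subgraph_idx=[0, 1], candidate_idx=[2, 3, 4], W3=W3, W4=W4, W5=W5)
```

The resulting vector `q` holds one Q value per candidate, which is what the ϵ-greedy rule in Eq. (4) consumes.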

Policy training for high-confidence risk gene identification

Consistent with the previous study³¹, we sampled the training data by randomly collecting 85% of the samples from 379 ccRCC patients before the training process started. In the training process, since we model the ccRCC risk gene identification as a Markov Decision Process, the policy of RL-GenRisk iteratively receives the current state as input, selects an action, and obtains a reward. The DQN algorithm⁴³, which is widely used in reinforcement learning methods, is employed to train the policy of RL-GenRisk. DQN uses two sets of Q values calculated by an online network and a target network, respectively. The online network in our study is the neural network of RL-GenRisk. The target network is designed to prevent the online network from overestimating Q values. The architecture of the target network is the same as that of the online network, but their parameters are different. In RL-GenRisk, the loss function is defined as follows:

$${{{\mathcal{L}}}}={\mathbb{E}}[{({r}_{t}+\gamma {\max }_{{a}_{t+1}\in {{{{\bf{A}}}}}_{t+1}}Q({{{{\bf{s}}}}}_{t+1};{a}_{t+1};{\theta }^{{\prime} })-Q({{{{\bf{s}}}}}_{t};{a}_{t};\theta ))}^{2}],$$

(11)

where ${{{\mathcal{L}}}}$ represents the loss to be minimized. r_t stands for the reward received at step t, and γ is the discount factor. ${\max }_{{a}_{t+1}\in {{{{\bf{A}}}}}_{t+1}}Q({{{{\bf{s}}}}}_{t+1};{a}_{t+1};{\theta }^{{\prime} })$ represents the target network's estimate of the maximum Q value over the possible actions a_{t+1} in the next state s_{t+1}. A_{t+1} represents the action space at step t + 1. Q(s_t; a_t; θ) represents the Q value calculated by the online network based on the current state s_t and action a_t. θ and ${\theta }^{{\prime} }$ represent the parameters of the online network and target network, respectively. θ is updated through gradient backpropagation based on the loss function. Then ${\theta }^{{\prime} }$ is updated through soft updates based on the parameters of the online network as follows:

$${\theta }^{{\prime} }=\tau \theta+(1-\tau ){\theta }^{{\prime} },$$

(12)

where τ represents a hyperparameter that controls the proportion of each update. More details about the hyperparameters in RL-GenRisk are shown in Supplementary Table 4. Moreover, we provided recommended ranges for each hyperparameter in Supplementary Table 4. Users can select hyperparameters within the recommended ranges. Additionally, users who wish to try a wider range of hyperparameter values can also use the grid search¹¹⁴ to select hyperparameter values.
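The loss in Eq. (11) and the soft update in Eq. (12) can be illustrated numerically. The sketch below is not a training loop: the function names and toy values are hypothetical, the Q values are given as plain arrays instead of network outputs, and terminal states are not treated separately. In actual training, the gradient of the loss would be taken with respect to the online parameters θ only.

```python
import numpy as np


def td_loss(q_online_taken, rewards, q_target_next_max, gamma):
    """Eq. (11): mean squared difference between the target and the online Q value.

    q_online_taken:    Q(s_t; a_t; theta) for the actions actually taken
    q_target_next_max: max over a_{t+1} of Q(s_{t+1}; a_{t+1}; theta')
    """
    target = rewards + gamma * q_target_next_max
    return float(np.mean((target - q_online_taken) ** 2))


def soft_update(theta_online, theta_target, tau):
    """Eq. (12): theta' <- tau * theta + (1 - tau) * theta', applied per parameter."""
    return {
        name: tau * theta_online[name] + (1.0 - tau) * theta_target[name]
        for name in theta_target
    }


# Toy batch of two transitions (hypothetical numbers).
loss = td_loss(
    q_online_taken=np.array([1.0, 0.5]),
    rewards=np.array([0.5, 0.0]),
    q_target_next_max=np.array([1.0, 2.0]),
    gamma=0.9,
)

# Toy parameters: one weight matrix per network.
theta = {"W": np.ones((2, 2))}
theta_prime = {"W": np.zeros((2, 2))}
theta_prime = soft_update(theta, theta_prime, tau=0.1)
```

A small τ makes the target network trail the online network slowly, which is the usual motivation for soft updates.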

In the identification process, RL-GenRisk differs from existing methods that predict outcomes following the same procedure as the training process^23,29,31. Instead, capitalizing on the advantage of RL-GenRisk combined with reinforcement learning, we have designed a concise and effective identification process. Specifically, the model with the highest cumulative reward during the training process is selected as the best model and loaded. The sampled subgraph is initialized as empty, and all nodes in the PPI network are included in the action space at the beginning. The policy calculates Q values for all genes, and subsequently, the final ranking list of ccRCC risk genes is ordered based on these Q values. The higher Q value indicates higher risk. We analyzed the top 20 high-confidence risk genes in our study:

$${HRG}_{top20}=\{g\in G| \, Q(g)\in {Q}_{top20}\},$$

(13)

where HRG_top20 represents the top 20 high-confidence risk genes. g represents a gene and G represents the PPI network. Q(g) represents the Q value of gene g and Q_top20 represents the top 20 highest Q values in the set of all Q values calculated at the identification process. Researchers can select the number of top genes for verification based on actual verification costs.
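The selection in Eq. (13) amounts to sorting genes by Q value and keeping the first K. A minimal sketch follows; the function name and toy Q values are hypothetical, and breaking ties alphabetically is an assumption made only so that the output is deterministic.

```python
def top_k_genes(q_by_gene, k=20):
    """Return the k genes with the highest Q values, highest first."""
    ranked = sorted(q_by_gene.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:k]]


# Toy Q values (hypothetical).
q_by_gene = {"VHL": 0.91, "PBRM1": 0.85, "EGFR": 0.40, "GENE_X": 0.05}
top2 = top_k_genes(q_by_gene, k=2)
```

Changing `k` corresponds to choosing how many candidates to carry forward to verification.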

Performance evaluation

Known ccRCC risk genes were sourced from the IntOGen cancer-specific database⁴⁰, the Network of Cancer Genes (NCG) database⁵⁴, the Cancer Gene Census (CGC) database⁵⁵, and a cancer risk gene set derived from a recent comprehensive study by Bailey et al.⁵⁶. The known ccRCC risk genes used in our study are provided in Supplementary Data 1. Due to the limited number of genes in the CGC (nine genes) and Bailey et al. (twelve genes) lists, we ultimately used three datasets to evaluate different methods: IntOGen, NCG, and a “Merged” dataset that combined IntOGen, NCG, Bailey et al., and CGC. First, the performance was evaluated using the Discounted Cumulative Gain (DCG), which has been used for evaluating cancer risk gene identification in the previous study²⁹. The DCG is calculated as:

$${{{\rm{DCG}}}}={\sum }_{i=1}^{N}\frac{re{l}_{{g}_{i}}}{{\log }_{2}(i+1)},$$

(14)

where N represents the number of the identified ccRCC risk genes. $re{l}_{{g}_{i}}$ is equal to 1 if the i-th gene g_i is contained in the known risk gene database, and 0 otherwise. Therefore, the higher the ranking of known ccRCC risk genes in the prediction results, the higher the DCG score. Following the previous study³¹, the top 100 identified ccRCC risk genes were considered in the performance evaluation. The top 100 genes identified by different methods are listed in Supplementary Data 2. Additionally, to provide a more comprehensive evaluation of different methods, we presented DCG curves and calculated the area under the DCG curve. We also used the normalized DCG (N-DCG) and the average precision (AP) as evaluation metrics. The N-DCG is calculated as:

$$\begin{aligned}{{\mbox{N-DCG}}}&=\frac{{\mbox{DCG}}}{{\mbox{I-DCG}}},\\ {\mbox{I-DCG}}&=\sum_{i=1}^{N}\frac{re{l}_{{g}_{i}}^{ideal}}{{\log }_{2}(i+1)},\end{aligned}$$

(15)

where $re{l}_{{g}_{i}}^{ideal}$ is the relevance score of the gene at position i in the ideal ranking. The AP is calculated as:

$${{\mbox{AP}}}\,=\frac{1}{N}\sum\limits_{i=1}^{100}P(i)\cdot re{l}_{{g}_{i}},$$

(16)

where N represents the number of the ccRCC risk genes. P(i) represents the precision for the top i genes, which is calculated by dividing the number of known risk genes among the top i identified genes by i. $re{l}_{{g}_{i}}$ is equal to 1 if the i-th gene g_i is contained in the known risk gene database, and 0 otherwise.
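The three ranking metrics in Eqs. (14) to (16) can be sketched as follows. The function names and the toy ranking are hypothetical. Two points are assumptions: the ideal ranking is taken to place all known risk genes first, and the AP normalizer N defaults to the number of known risk genes in the reference set (it can be overridden through `n_norm` if another convention is intended).

```python
import math


def dcg(ranked, known):
    """Eq. (14): sum of 1 / log2(i + 1) over positions i that hold a known risk gene."""
    return sum(
        1.0 / math.log2(i + 1)
        for i, gene in enumerate(ranked, start=1)
        if gene in known
    )


def ideal_dcg(n_ranked, n_known):
    """I-DCG, assuming the ideal ranking places every known risk gene first."""
    return sum(1.0 / math.log2(i + 1) for i in range(1, min(n_ranked, n_known) + 1))


def ndcg(ranked, known):
    """Eq. (15): DCG divided by I-DCG."""
    ideal = ideal_dcg(len(ranked), len(known))
    return dcg(ranked, known) / ideal if ideal else 0.0


def average_precision(ranked, known, n_norm=None):
    """Eq. (16): sum of P(i) * rel_i divided by N (assumed: number of known genes)."""
    hits, total = 0, 0.0
    for i, gene in enumerate(ranked, start=1):
        if gene in known:
            hits += 1
            total += hits / i      # P(i): known genes among the top i, divided by i
    n_norm = len(known) if n_norm is None else n_norm
    return total / n_norm if n_norm else 0.0


# Toy ranking (hypothetical): known genes at positions 1 and 3.
ranked = ["VHL", "GENE_X", "PBRM1", "GENE_Y"]
known = {"VHL", "PBRM1"}
toy_dcg = dcg(ranked, known)
toy_ndcg = ndcg(ranked, known)
toy_ap = average_precision(ranked, known)
```

For the evaluation described in the text, `ranked` would be the top 100 genes of a method and `known` one of the IntOGen, NCG, or Merged gene sets.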

Threshold selection for top K genes

To determine the optimal number of top K genes for downstream analysis, we evaluated the proportion of known ccRCC risk genes across various values of K (where K = 20, 30, 40, …, 100). We observed that when K was set to 20, the average proportion of known ccRCC risk genes exceeded 50% (Supplementary Fig. 19). As K increased, the proportion of known ccRCC risk genes decreased (Supplementary Fig. 19). Based on this observation, we suggested that selecting K = 20 generates a gene set with a higher likelihood of including unknown ccRCC risk genes. Additionally, considering the potential downstream analysis, selecting K = 20 kept the analysis cost within a manageable range in our study.

Statistical analysis

Gene set enrichment analysis

We used g:Profiler⁶¹ to run functional enrichment analysis on the top 20 high-confidence risk genes identified by RL-GenRisk. g:Profiler maps genes to known functional information sources and detects statistically significantly enriched terms. We performed enrichment analysis on Gene Ontology, Human Phenotype Ontology, and WikiPathways, using an FDR-adjusted p-value < 0.05 as the significance threshold.
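For readers unfamiliar with over-representation analysis, the sketch below shows the general idea in Python: a hypergeometric tail probability per term, followed by a false discovery rate adjustment. It is an assumption that this matches the settings used here, since the text specifies only an FDR threshold. The Benjamini-Hochberg procedure is used as one common FDR method, and the function names are hypothetical; this is not the g:Profiler implementation.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes, K of which are annotated to the term."""
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):             # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical example: 6 of 20 query genes hit a term annotated to 200 of 20,000 genes.
p = hypergeom_pvalue(6, 20, 200, 20000)
print(p, benjamini_hochberg([p, 0.2, 0.04]))
```

Terms whose adjusted p-value falls below 0.05 would then be reported as enriched.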

Survival analysis

Clinical data of ccRCC patients and protein expression data of EGFR were obtained from TCGA. We used cSurvival¹¹⁵ to perform progression-free survival and disease-specific survival analyses on these data. Progression-free survival is measured as the time from randomization or initiation of treatment to the occurrence of disease progression or death⁷⁷. Disease-specific survival considers only deaths caused by the disease in question⁷⁸. Patients were categorized into quartiles based on the expression level of the protein encoded by EGFR. The Kaplan-Meier estimator was used to generate survival curves, and differences between curves were assessed using the log-rank test¹¹⁶.
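The Kaplan-Meier estimator itself is simple enough to sketch directly. The following Python example is a minimal, self-contained illustration with made-up follow-up times; it is not the cSurvival implementation and omits the quartile grouping and the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time; events: 1 = event observed, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        while i < len(data) and data[i][0] == t:   # group all patients sharing this time
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk       # product-limit step
            curve.append((t, survival))
        at_risk -= deaths + censored               # these patients leave the risk set
    return curve

# Hypothetical follow-up times (months) and event indicators for five patients.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(curve)
```

Applying the function to each expression quartile separately gives one curve per group.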

In vitro and in vivo experiments

Cell lines

ACHN and 786-O cells were purchased from Shanghai Zhong Qiao Xin Zhou Biotechnology Co., Ltd. (Shanghai, China), and were cultured in DMEM and RPMI-1640 medium (HyClone, Utah, USA), respectively, supplemented with 10% fetal bovine serum (FBS, Gibco, Australia) and 1% antibiotics (penicillin and streptomycin, HyClone) in a humidified incubator containing 5% CO₂ at 37 °C. The stable cell lines of ACHN-shEGFR, 786-O-shEGFR, 786-O-EGFR, ACHN-shPCLO, and 786-O-shPCLO were obtained as described previously¹¹⁷.

High-content screening analysis

Cells were inoculated into a 96-well plate at an optimal density to ensure ideal growth and interaction. A high-content live-cell imaging system was used to monitor the cells at multiple time points. This imaging system, equipped with fluorescence microscopy capabilities, allows for real-time visualization of cell morphology and dynamics. At each time point, images were automatically captured, focusing on specific areas of interest within each well. The analysis included measuring cell perimeter and calculating movement speed and accumulated distance. All analyses were performed using Opera Phenix Plus and Harmony 5.2.

Western blot analysis

Total protein was extracted using TPER solution from Thermo Fisher Scientific. The protein concentration was measured using a BCA kit from Pierce. The proteins were separated using 10% SDS-PAGE. The proteins were transferred onto a PVDF membrane, and the membrane was blocked with 5% skimmed milk for 1 h at room temperature (RT). Then, the membrane was incubated with primary rabbit anti-human EGFR antibodies (R22778, ZenBio, WB: 1:1000), rabbit anti-PCLO antibody (HPA015858, Sigma-Aldrich, WB: 1:1000) and mouse anti-GAPDH antibody (250133, ZenBio, WB: 1:5000) at 4 °C overnight. After three washes with TBST for 10 min, the membrane was incubated with horseradish peroxidase (HRP)-conjugated goat anti-rabbit secondary antibodies (458, MBL) for 1 h at RT. Finally, the protein bands were detected using an HRP substrate. All experiments were performed in triplicate.

Quantitative polymerase chain reaction (qPCR)

Total RNA was extracted from the cells using a QIAGEN kit (74104), and the RNA concentration was determined using a NanoPhotometer from Implen (München). cDNA was synthesized using a RevertAid first-strand cDNA synthesis kit from Thermo Fisher Scientific (Waltham). PCR was performed using the following protocol: 98 °C for 2 min, followed by 40 cycles of 98 °C for 5 s and 60 °C for 10 s. The relative gene expression was calculated using the 2^−ΔΔCt method, with β-Actin as the internal control. All experiments were performed at least in triplicate.
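As a worked example of the 2^−ΔΔCt calculation, the short Python sketch below normalizes the target gene to a reference gene (β-Actin in this study) in both the treated and control samples. The Ct values and the function name are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method, with Ct values normalized to a reference gene."""
    delta_sample = ct_target_sample - ct_ref_sample       # dCt in the treated sample
    delta_control = ct_target_control - ct_ref_control    # dCt in the control sample
    return 2 ** -(delta_sample - delta_control)           # 2^-(ddCt)

# Hypothetical Ct values: target 26.0 vs reference 18.0 (knockdown), 24.0 vs 18.0 (control).
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold)  # 0.25, i.e. a four-fold reduction relative to control
```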

Cell counting kit-8 assay

The cell counting kit-8 (CCK-8) assay was performed using a kit from Dojindo. A total of 3000 cells/well were cultured in a 96-well plate for 24 h, and 10 μL of CCK-8 solution was added to the wells. Then, the plate was incubated for 2 h at 37 °C, and the optical density (OD) value was detected at 450 nm using a microplate reader. All experiments were performed in triplicate.

Cell apoptosis

An Annexin V-Alexa Fluor 647/PI Kit was purchased from 4A Biotech (Suzhou, China). The cells were digested and washed twice with cold phosphate-buffered saline (PBS). Next, the 1 × binding buffer was used to suspend cells to a concentration of (1–5) × 10⁶ cells/mL. Then 5 μL of Annexin V-Alexa Fluor 647 was added to 100 μL of cells, and the mixture was incubated at RT for 5 min in the dark. Finally, the flow cytometry assay was performed after adding 10 μL of 20 μg/mL propidium iodide (PI) and 400 μL of PBS. Figures illustrating the gating strategy are provided in the Supplementary Materials.

Transwell assay

Transwell migration assays were conducted using a Transwell chamber from Corning (REF 3422, Arizona, USA). Briefly, Transwell chambers were placed on a 24-well plate. Fresh medium (600 μL) containing 10% FBS was added to the lower chambers, and (2–5) × 10⁴ cells in 200 μL of medium without FBS were added to the upper chamber. The 24-well plate was incubated at 37 °C for 48 h. Cells that passed through the chamber were washed, fixed (20 min with 4% paraformaldehyde), and stained (30 min with crystal violet). Then, the upper chambers were washed, photographed under an inverted fluorescence OBSERVER D1/AX10 cam HRC microscope (Zeiss), and preserved. Migrated cells were analyzed using ImageJ software.

Colony formation assay

The cells (500/well) were seeded into a 6-well plate and cultured at 37 °C for 7–10 days. Then, the clones were imaged using a Celigo imaging cytometer from Nexcelom (Lawrence), and the clones were counted using ImageJ software. All experiments were performed in triplicate.

Animal study

Four-week-old male BALB/C-nu/nu mice were purchased from GemPharmatech. All mice were maintained in pathogen-free conditions at 24 °C and 50% humidity, with a 12-h light/dark cycle, and given free access to food (reproductive diet, catalog number: F010201) and water. A total of 5 × 10⁶ cells were inoculated into the right flank of each mouse, and the tumor volume was recorded every 3 days starting from day 17 after injection of the tumor cells. Mice were administered erlotinib (S1023, Selleck, Shanghai, China) at a dose of 50 mg/kg. The tumor volume was calculated using the following equation: L × W² × 0.5236, where L is tumor length and W is tumor width¹¹⁷. The maximum tumor diameter allowed by the Ethics Committee is 2 cm. We ensured that, whenever mice were sacrificed, the maximal body weight loss did not exceed the limit of 20%. The animal procedures were approved by the ethics committee of West China Hospital, Sichuan University.
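The tumor volume formula can be written as a one-line function. The factor 0.5236 is approximately π/6, so the expression corresponds to the volume of an ellipsoid with axes L, W and W. In the sketch below, measurements in millimetres (giving mm³) are an assumption, as the text does not state the units, and the function name is hypothetical.

```python
import math

def tumor_volume(length_mm, width_mm):
    """Tumor volume as L x W^2 x 0.5236 (0.5236 is approximately pi / 6)."""
    return length_mm * width_mm ** 2 * 0.5236

# Hypothetical caliper measurements: length 10 mm, width 8 mm.
volume = tumor_volume(10, 8)
print(round(volume, 1), round(math.pi / 6, 4))
```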

Statistical analysis

All data are presented as the mean ± standard error of the mean (SEM) or standard deviation (SD). Statistical significance was determined using analysis of variance (ANOVA) for comparisons among multiple groups (>3) and Student’s paired t-test for comparisons between two groups, in GraphPad Prism 9.0. A p-value < 0.05 was considered statistically significant.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All datasets analyzed in this article are publicly available. The HPRD network is available at http://www.hprd.org/. STRING is available at https://string-db.org/. Multinet and IRefIndex are available at https://github.com/raphael-group/hotnet2/tree/master/paper/data/networks. HumanNet is available at https://staging2.inetbio.org/humannetv3/. PCNet is available at https://www.ndexbio.org/viewer/networks. The data of ccRCC patients from TCGA is available at https://portal.gdc.cancer.gov/. The single-cell RNA-seq data of ccRCC patients is available at https://singlecell.broadinstitute.org/. The GEO datasets of ccRCC patients are available at https://www.ncbi.nlm.nih.gov/geo/. Data for enrichment analysis is available at https://biit.cs.ut.ee/gprofiler/gost. Source data are provided with this paper.

Code availability

The source code of RL-GenRisk and the trained model can be downloaded from the GitHub repository at https://github.com/23AIBox/RL-GenRisk.

References

Padala, S. A. et al. Epidemiology of renal cell carcinoma. World J. Oncol. 11, 79 (2020).
Capitanio, U. & Montorsi, F. Renal cancer. Lancet 387, 894–906 (2016).
John, A., Spain, L. & Hamid, A. A. Navigating the current landscape of non-clear cell renal cell carcinoma: a review of the literature. Curr. Oncol. 30, 923–937 (2023).
Jonasch, E., Walker, C. L. & Rathmell, W. K. Clear cell renal cell carcinoma ontogeny and mechanisms of lethality. Nat. Rev. Nephrol. 17, 245–261 (2021).
Hsieh, J. J. et al. Renal cell carcinoma. Nat. Rev. Dis. Prim. 3, 1–19 (2017).
Sanchez, D. J. & Simon, M. C. Genetic and metabolic hallmarks of clear cell renal cell carcinoma. Biochimica et. Biophysica Acta (BBA)-Rev. Cancer 1870, 23–31 (2018).
Chowdhury, N. & Drake, C. G. Kidney cancer: an overview of current therapeutic approaches. Urologic Clin. 47, 419–431 (2020).
Cairns, P. Renal cell carcinoma. Cancer Biomark. 9, 461–473 (2011).
Del Vecchio, S. J. & Ellis, R. J. Cabozantinib for the management of metastatic clear cell renal cell carcinoma. J. Kidney Cancer VHL 5, 1 (2018).
Yoshida, K. et al. Efficacy of axitinib in patients with metastatic renal cell carcinoma refractory to nivolumab therapy. Jpn. J. Clin. Oncol. 49, 576–580 (2019).
Gao, B. et al. Screening novel drug candidates for kidney renal clear cell carcinoma treatment: A study on differentially expressed genes through the connectivity map database. Kidney Blood Press. Res. 46, 702–713 (2021).
Garraway, L. A. & Lander, E. S. Lessons from the cancer genome. Cell 153, 17–37 (2013).
Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Vogelstein, B. et al. Cancer genome landscapes. Science 339, 1546–1558 (2013).
Anandakrishnan, R., Varghese, R. T., Kinney, N. A. & Garner, H. R. Estimating the number of genetic mutations (hits) required for carcinogenesis based on the distribution of somatic mutations. PLoS Computational Biol. 15, e1006881 (2019).
Martincorena, I. & Campbell, P. J. Somatic mutation in cancer and normal cells. Science 349, 1483–1489 (2015).
Tamborero, D. et al. Comprehensive identification of mutational cancer driver genes across 12 tumor types. Sci. Rep. 3, 2650 (2013).
Youn, A. & Simon, R. Identifying cancer driver genes in tumor genome sequencing studies. Bioinformatics 27, 175–181 (2011).
Dees, N. D. et al. Music: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598 (2012).
Tamborero, D., Gonzalez-Perez, A. & Lopez-Bigas, N. Oncodriveclust: exploiting the positional clustering of somatic mutations to identify cancer genes. Bioinformatics 29, 2238–2244 (2013).
Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499, 214–218 (2013).
Andrades, R. & Recamonde-Mendoza, M. Machine learning methods for prediction of cancer driver genes: a survey paper. Brief. Bioinforma. 23, bbac062 (2022).
Stratton, M. R., Campbell, P. J. & Futreal, P. A. The cancer genome. Nature 458, 719–724 (2009).
Li, Y., Wang, C., Gao, Y. & Zhou, L. Identification and validation of pik3ca as a marker associated with prognosis and immune infiltration in renal clear cell carcinoma. J. Oncol. 2021, 1–18 (2021).
Guo, H. et al. The pi3k/akt pathway and renal cell carcinoma. J. Genet. genomics 42, 343–353 (2015).
Brugarolas, J. Molecular genetics of clear-cell renal cell carcinoma. J. Clin. Oncol. 32, 1968 (2014).
Cho, A. et al. Muffinn: cancer gene discovery via network analysis of somatic mutation data. Genome Biol. 17, 1–16 (2016).
Cutigi, J. F., Evangelista, A. F., Reis, R. M. & Simao, A. A computational approach for the discovery of significant cancer genes by weighted mutation and asymmetric spreading strength in networks. Sci. Rep. 11, 23551 (2021).
Leiserson, M. D. M., Vandin, F., Wu, H. T., Dobson, J. R. & Raphael, B. J. Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet. 47, 106–114 (2015).
Hristov, B. H. & Singh, M. Network-based coverage of mutational profiles reveals cancer genes. Cell Syst. 5, 221–229 (2017).
Shi, X. et al. Comprehensive evaluation of computational methods for predicting cancer driver genes. Brief. Bioinforma. 23, bbab548 (2022).
Tavanaei, A., Anandanadarajah, N., Maida, A. & Loganantharaj, R. A deep learning model for predicting tumor suppressor genes and oncogenes from pdb structure. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 613–617 (IEEE, 2017).
Agajanian, S., Oluyemi, O. & Verkhivker, G. M. Integration of random forest classifiers and deep convolutional neural networks for classification and biomolecular modeling of cancer driver mutations. Front. Mol. Biosci. 6, 44 (2019).
Luo, P., Ding, Y., Lei, X. & Wu, F.-X. deepdriver: predicting cancer driver genes based on somatic mutations using deep convolutional neural networks. Front. Genet. 10, 428627 (2019).
Schulte-Sasse, R., Budach, S., Hnisz, D. & Marsico, A. Graph convolutional networks improve the prediction of cancer driver genes. In Artificial Neural Networks and Machine Learning–ICANN 2019: Workshop and Special Sessions: 28th International Conference on Artificial Neural Networks, Munich, Germany, September 17–19, 2019, Proceedings 28, 658–668 (Springer, 2019).
Leiserson, M. D. et al. Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet. 47, 106–114 (2015).
Pezzicoli, G. et al. Genomic profiling and molecular characterization of clear cell renal cell carcinoma. Curr. Oncol. 30, 9276–9290 (2023).
Cui, H. et al. Identification of the key genes and pathways involved in the tumorigenesis and prognosis of kidney renal clear cell carcinoma. Sci. Rep. 10, 4271 (2020).
Martínez-Jiménez, F. et al. A compendium of mutational cancer driver genes. Nat. Rev. Cancer 20, 555–572 (2020).
Gershman, S. J. & Daw, N. D. Reinforcement learning and episodic memory in humans and animals: an integrative framework. Annu. Rev. Psychol. 68, 101–128 (2017).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. Proceedings of the 5th International Conference on Learning Representations (ICLR) (2017).
Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Bellman, R. A markovian decision process. J. Math. Mech. 6, 679–684 (1957).
Weinstein, J. N. et al. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
Reyna, M. A., Leiserson, M. D. & Raphael, B. J. Hierarchical hotnet: identifying hierarchies of altered subnetworks. Bioinformatics 34, i972–i980 (2018).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Keshava Prasad, T. et al. Human protein reference database–2009 update. Nucleic acids Res. 37, D767–D772 (2009).
Szklarczyk, D. et al. String v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic acids Res. 47, D607–D613 (2019).
Khurana, E., Fu, Y., Chen, J. & Gerstein, M. Interpretation of genomic variants using a unified biological network approach. PLoS Computational Biol. 9, e1002886 (2013).
Huang, J. K. et al. Systematic evaluation of gene networks for discovery of disease genes. Cancer Res. 78, 1310–1310 (2018).
Kim, C. Y. et al. Humannet v3: an improved database of human gene networks for disease research. Nucleic acids Res. 50, D632–D639 (2022).
Dressler, L. et al. Comparative assessment of genes driving cancer and somatic evolution in non-cancer tissues: an update of the network of cancer genes (ncg) resource. Genome Biol. 23, 35 (2022).
Sondka, Z. et al. The cosmic cancer gene census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696–705 (2018).
Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385.e18 (2018).
Järvelin, K. & Kekäläinen, J. Cumulated gain-based evaluation of ir techniques. ACM Trans. Inf. Syst. (TOIS) 20, 422–446 (2002).
Mendiratta, G. et al. Cancer gene mutation frequencies for the us population. Nat. Commun. 12, 5961 (2021).
Mayakonda, A., Lin, D.-C., Assenov, Y., Plass, C. & Koeffler, H. P. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 28, 1747–1756 (2018).
Martens, M. et al. Wikipathways: connecting communities. Nucleic acids Res. 49, D613–D621 (2021).
Kolberg, L. et al. g:Profiler–interoperable web service for functional enrichment analysis and gene identifier mapping (2023 update). Nucleic acids Res. 51, W207–W212 (2023).
Köhler, S. et al. The human phenotype ontology in 2021. Nucleic acids Res. 49, D1207–D1217 (2021).
Jaccard, P. Nouvelles recherches sur la distribution florale. Bull. Soc. Vaud. Sci. Nat. 44, 223–270 (1908).
Ritchie, M. E. et al. limma powers differential expression analyses for rna-sequencing and microarray studies. Nucleic acids Res. 43, e47–e47 (2015).
Martinelli, E., Morgillo, F., Troiani, T. & Ciardiello, F. Cancer resistance to therapies against the egfr-ras-raf pathway: The role of mek. Cancer Treat. Rev. 53, 61–69 (2017).
Jacobsen, K. et al. Convergent akt activation drives acquired egfr inhibitor resistance in lung cancer. Nat. Commun. 8, 410 (2017).
Farooqi, A. A. et al. Multifunctional roles of betulinic acid in cancer chemoprevention: Spotlight on jak/stat, vegf, egf/egfr, trail/trail-r, akt/mtor and non-coding rnas in the inhibition of carcinogenesis and metastasis. Molecules 28, 67 (2022).
Matthews, H. K., Bertoli, C. & de Bruin, R. A. Cell cycle control in cancer. Nat. Rev. Mol. Cell Biol. 23, 74–88 (2022).
Frattini, M., Molinari, F. & Epistolio, S. The role of piccolo in cancer treatment: relationship with egfr and related therapies, and a marker for new targeted therapies. J. Thorac. Dis. 9, 4240 (2017).
Zhang, W. et al. Piccolo mediates egfr signaling and acts as a prognostic biomarker in esophageal squamous cell carcinoma. Oncogene 36, 3890–3902 (2017).
Bernhardt, M., Behrens, H.-M., Krüger, S. & Röcken, C. Exploration of the tumor biological significance of pclo in gastric cancer-results from a large central European cohort. Pathobiol: J. Immunopathol. Mol. Cell. Biol. 91, 187–195 (2023).
Eckel-Passow, J. E. et al. Anks1b is a smoking-related molecular alteration in clear cell renal cell carcinoma. BMC Urol. 14, 1–9 (2014).
Peña-Llopis, S. et al. Bap1 loss defines a new class of renal cell carcinoma. Nat. Genet. 44, 751–759 (2012).
Fu, H. et al. Low expression of mucin-4 predicts poor prognosis in patients with clear-cell renal cell carcinoma. Medicine 95, e3225 (2016).
Shinagare, A. B. et al. Radiogenomics of clear cell renal cell carcinoma: Preliminary findings of the cancer genome atlas–renal cell carcinoma (tcga–rcc) imaging research group. Abdom. imaging 40, 1684–1692 (2015).
Bi, K. et al. Tumor and immune reprogramming during immunotherapy in advanced renal cell carcinoma. Cancer cell 39, 649–661 (2021).
Saad, E. & Katz, A. Progression-free survival and time to progression as primary end points in advanced breast cancer: often used, sometimes loosely defined. Ann. Oncol. 20, 460–464 (2009).
Marcadis, A. R. et al. Characterizing relative and disease-specific survival in early-stage cancers. JAMA Intern. Med. 180, 461–463 (2020).
Tong, Y. et al. Vhl regulates the sensitivity of clear cell renal cell carcinoma to sirt4-mediated metabolic stress via hif-1α/ho-1 pathway. Cell Death Dis. 12, 621 (2021).
Hu, J. et al. Tumor heterogeneity in vhl drives metastasis in clear cell renal cell carcinoma. Signal Transduct. Target. Ther. 8, 155 (2023).
Liu, X.-D. et al. Pbrm1 loss defines a nonimmunogenic tumor phenotype associated with checkpoint inhibitor resistance in renal carcinoma. Nat. Commun. 11, 2135 (2020).
Aili, A., Wen, J., Xue, L. & Wang, J. Mutational analysis of pbrm1 and significance of pbrm1 mutation in anti-pd-1 immunotherapy of clear cell renal cell carcinoma. Front. Oncol. 11, 712765 (2021).
Xie, Y. et al. Setd2 loss perturbs the kidney cancer epigenetic landscape to promote metastasis and engenders actionable dependencies on histone chaperone complexes. Nat. cancer 3, 188–202 (2022).
Rao, H. et al. Setd2 deficiency accelerates sphingomyelin accumulation and promotes the development of renal cancer. Nat. Commun. 14, 7572 (2023).
Tan, G. et al. The critical role of bap1 mutation in the prognosis and treatment selection of kidney renal clear cell carcinoma. Transl. Androl. Urol. 9, 1725 (2020).
Feng, Z. et al. Identifying bap1 mutations in clear-cell renal cell carcinoma by ct radiomics: preliminary findings. Front. Oncol. 10, 279 (2020).
DiNatale, R. G., Sanchez, A., Hakimi, A. A. & Reznik, E. Metabolomics informs common patterns of molecular dysfunction across histologies of renal cell carcinoma. In Urologic Oncology: Seminars and Original Investigations, 38, 755-762 (Elsevier, 2020).
Roldan-Romero, J. M. et al. Pten expression and mutations in tsc1, tsc2 and mtor are associated with response to rapalogs in patients with renal cell carcinoma. Int. J. Cancer 146, 1435–1444 (2020).
Selvarajah, J., Nathawat, K., Moumen, A., Ashcroft, M. & Carroll, V. Chemotherapy-mediated p53-dependent dna damage response in clear cell renal cell carcinoma: role of the mtorc1/2 and hypoxia-inducible factor pathways. Cell death Dis. 4, e865 (2013).
Shu, X. et al. Germline genetic variants in somatically significantly mutated genes in tumors are associated with renal cell carcinoma risk and outcome. Carcinogenesis 39, 752–757 (2018).
Li, Y.-D. et al. Pan-cancer analysis identifies spen mutation as a predictive biomarker with the efficacy of immunotherapy. BMC cancer 23, 793 (2023).
Bi, M. et al. Genomic characterization of sarcomatoid transformation in clear cell renal cell carcinoma. Proceedings of the National Academy of Sciences of the United States of America, e920–e920 (2016).
Pei, J., Zhang, J. & Cong, Q. Computational analysis of protein–protein interactions of cancer drivers in renal cell carcinoma. FEBS Open Bio 14, 112–126 (2024).
Guerrero-Martínez, J. A. & Reyes, J. C. High expression of smarca4 or smarca2 is frequently associated with an opposite prognosis in cancer. Sci. Rep. 8, 2043 (2018).
Cui, H. et al. The swi/snf chromatin remodeling factor dpf3 regulates metastasis of ccrcc by modulating tgf-β signaling. Nat. Commun. 13, 4680 (2022).
Park, S. E. et al. mir-96-5p targets pten to mediate sunitinib resistance in clear cell renal cell carcinoma. Sci. Rep. 12, 3537 (2022).
Lee, H. J. et al. Prognostic significance of biallelic loss of pten in clear cell renal cell carcinoma. J. Urol. 192, 940–946 (2014).
Li, Q. et al. Pan-cancer analysis of the pde4dip gene with potential prognostic and immunotherapeutic values in multiple cancers including acute myeloid leukemia. Open Med. 18, 20230782 (2023).
Lin, P. et al. Radiomic profiling of clear cell renal cell carcinoma reveals subtypes with distinct prognoses and molecular pathways. Transl. Oncol. 14, 101078 (2021).
Gong, Z. et al. Integrative analysis of trpv family to prognosis and immune infiltration in renal clear cell carcinoma. Channels 16, 84–96 (2022).
Sun, Z. et al. Investigating the prognostic role of lncrnas associated with disulfidptosis-related genes in clear cell renal cell carcinoma. J. Gene Med. 26, e3608 (2024).
Wang, Z. et al. The role of erbb signaling pathway-related genes in kidney renal clear cell carcinoma and establishing a prognostic risk assessment model for patients. Front. Genet. 13, 862210 (2022).
He, Y. et al. Targeting pi3k/akt signal transduction for cancer therapy. Signal Transduct. Target. Ther. 6, 425 (2021).
Braicu, C. et al. A comprehensive review on mapk: a promising therapeutic target in cancer. Cancers 11, 1618 (2019).
Suchitha, G., Balaya, R. D. A., Raju, R., Keshava Prasad, T. & Dagamajalu, S. A network map of cytoskeleton-associated protein 4 (ckap4) mediated signaling pathway in cancer. J. Cell Commun. Signal. 17, 1097–1104 (2023).
Zhou, L. et al. Targeting met and axl overcomes resistance to sunitinib therapy in renal cell carcinoma. Oncogene 35, 2687–2697 (2016).
Ruan, H., Li, S., Bao, L. & Zhang, X. Enhanced yb1/epha2 axis signaling promotes acquired resistance to sunitinib and metastatic potential in renal cell carcinoma. Oncogene 39, 6113–6128 (2020).
Lee, S. J. et al. Von hippel-lindau tumor suppressor gene loss in renal cell carcinoma promotes oncogenic epidermal growth factor receptor signaling via akt-1 and mek-1. Eur. Urol. 54, 845–854 (2008).
Wen, Y.-C. et al. The oncogenic adamts1–vcan–egfr cyclic axis drives anoikis resistance and invasion in renal cell carcinoma. Cell. Mol. Biol. Lett. 29, 126 (2024).
Schubach, M., Maass, T., Nazaretyan, L., Röner, S. & Kircher, M. CADD v1.7: using protein language models, regulatory cnns and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic acids Res. 52, D1143–D1154 (2024).
Cao, M. et al. Going the distance for protein function prediction: a new distance metric for protein interaction networks. PloS one 8, e76339 (2013).
Ten Kathen, M. J., Reina, D. G. & Flores, I. J. A comparison of pso-based informative path planners for detecting pollution peaks of the ypacarai lake with autonomous surface vehicles. In International Conference on Optimization and Learning OLA, vol. 2022 (2022).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770-778 (2016).
Bergstra, J. & Bengio, Y. Random search for hyper-parameter optimization. J. Machine Learning Res. 13, 281–305 (2012).
Cheng, X. et al. csurvival: a web resource for biomarker interactions in cancer outcomes and in cell lines. Brief. Bioinforma. 23, bbac090 (2022).
Mantel, N. et al. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother. Rep. 50, 163–170 (1966).
Zheng, X. et al. Vinculin orchestrates prostate cancer progression by regulating tumor cell invasion, migration, and proliferation. Prostate 81, 347–356 (2021).


Acknowledgements

This study was supported by the National Natural Science Foundation of China (92370132 to J.H., 62072376 to J.P., 82070784 to J.A., and 62106172 to Y.Z.), the National Key R&D Program of China (2023YFC3403200 to J.A.), and the Xiaomi Young Talents Program of Xiaomi Foundation to Y.Z.

Author information

These authors contributed equally: Dazhi Lu, Yan Zheng, Xianyanling Yi.

Authors and Affiliations

AI for Science Interdisciplinary Research Center, School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Dazhi Lu, Xi Zeng, Lu Han & Jiajie Peng
College of Intelligence and Computing, Tianjin University, Tianjin, China
Yan Zheng, Jianye Hao & Zhigang Li
Department of Urology, Institute of Urology, West China Hospital, Sichuan University, Chengdu, China
Xianyanling Yi & Jianzhong Ai
School of Software, Northwestern Polytechnical University, Xi’an, China
Shaoqing Jiao
Tianjin Second People’s Hospital, Tianjin, China
Bei Jiang
Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi’an, China
Jiajie Peng

Contributions

Conceptualization, J.P., J.H. and J.A.; methodology, J.P., J.H., Y.Z., D.L., Z.L. and L.H.; experimentation, D.L., Y.Z., Z.L., L.H., X.Y. and J.A.; writing–original draft, D.L., X.Y. and J.A.; writing–review and editing, J.P., J.H., J.A., D.L., Y.Z., X.Y., X.Z., B.J. and S.J.; supervision, J.P., J.H. and J.A.

Corresponding authors

Correspondence to Jianye Hao, Jianzhong Ai or Jiajie Peng.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Transparent Peer Review file

Reporting Summary

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Lu, D., Zheng, Y., Yi, X. et al. Identifying potential risk genes for clear cell renal cell carcinoma with deep reinforcement learning. Nat Commun 16, 3591 (2025). https://doi.org/10.1038/s41467-025-58439-5

Received: 13 May 2024
Accepted: 18 March 2025
Published: 15 April 2025
DOI: https://doi.org/10.1038/s41467-025-58439-5
