Abstract
The incidence of nasopharyngeal carcinoma (NPC) has remained static in southern provinces of China and poses a serious threat to public health. Biomarkers and prediction models that can accurately predict the survival of patients with NPC are lacking. In this study, the hub genes of NPC were identified using the gene expression datasets GSE61218 and GSE126683. The expression levels of the hub genes were subsequently determined in clinical samples, and the relationships between the expression levels of the hub genes and patient survival were analysed. Finally, a survival prediction model was constructed using clinical data and hub genes as variables, and the performance of the prediction model was evaluated. AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes, all of which were significantly upregulated in tumor tissues. The expression levels of AURKA, BUB1, and CDK1 were significantly upregulated in NPC samples from patients in the Death group. The results of the log-rank test suggested that the overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly reduced. Finally, a survival prediction model was constructed using gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression as variables. The results of the receiver operating characteristic (ROC) curve, area under the ROC curve, calibration plot, net reclassification index, integrated discrimination improvement index, and decision curve analysis revealed that the model had good discriminating ability, predictive ability, and clinical utility. In conclusion, AURKA, BUB1, and CDK1 are potential prognostic biomarkers of NPC, and a prediction model incorporating the expression levels of AURKA and BUB1 has good discriminating ability, predictive ability, and clinical utility.
Introduction
Nasopharyngeal carcinoma (NPC) arises from the nasopharyngeal mucosal lining and is an epithelial carcinoma frequently observed at the pharyngeal recess1. The geographical distribution of NPC is extremely unbalanced, with most new cases of NPC occurring in East and Southeast Asia2. The age-standardized rate of NPC is approximately 3.0 per 100,000 in China, while in populations that are mainly white, the rate is approximately 0.4 per 100,0002. The risk factors for NPC include Epstein–Barr virus (EBV) infection, host genetics, environmental factors, and dietary patterns, all of which contribute to the remarkable geographical distribution of NPC3. It has been reported that the incidence of NPC has gradually declined in some regions, such as North America and Nordic countries4. However, the incidence of NPC has remained static over the past two decades in some southern provinces of mainland China, placing a burden on the medical system5.
The main subtypes of NPC are keratinizing squamous cell carcinoma, nonkeratinizing squamous cell carcinoma, and basaloid squamous cell carcinoma, of which nonkeratinizing squamous cell carcinoma is the most common6. The current hypothesis concerning the pathogenesis of NPC is that nasopharyngeal epithelial cells are infected by EBV and express various viral oncogenes, leading to transformation to an invasive cellular phenotype and NPC progression7. The upregulation of cyclin D1 (CCND1) and/or inactivation of tumor suppressor genes such as transforming growth factor beta receptor 2 (TGFBR2) results in persistent infection with EBV, which promotes unlimited cellular proliferation, resistance to apoptosis, immune dysregulation, inflammation, and genome instability8. A whole-genome sequencing study revealed that the upregulation of EBV-encoded latent membrane protein-1 (LMP-1) activates the nuclear factor kappa B (NF-κB) signalling pathway, which is the key oncogenic driver of NPC9,10. In addition, the overexpression of EBV-encoded BNLF2a causes immune evasion and contributes to the progression of NPC11. Recent studies have also demonstrated that the transforming growth factor β (TGFβ) signalling pathway, phosphatidylinositol-3 kinase (PI3K) signalling pathway, and mitogen-activated protein kinase (MAPK) signalling pathway play important roles in the tumorigenesis of NPC12. Despite these findings, the exact mechanism underlying the tumorigenesis of NPC is not yet clear and needs further exploration.
Owing to the high sensitivity of NPC to ionizing radiation, radiotherapy is the key treatment for NPC13. With the development of technology, radiotherapy has progressed from traditional two-dimensional radiotherapy to three-dimensional conformal radiotherapy and then to intensity-modulated radiotherapy14. Intensity-modulated radiotherapy is currently the most widely used treatment and can reduce the 5-year occurrence rate of patients with NPC15. Moreover, compared with two-dimensional and three-dimensional radiotherapy, intensity-modulated radiotherapy is significantly related to better 5-year locoregional control and overall survival16. Currently, radiotherapy combined with chemotherapy is important for treating locoregionally advanced NPC17.
However, biomarkers that can accurately predict the outcome after treatment and the survival of patients with NPC are lacking3. In this study, the hub genes of NPC were identified using bioinformatics analysis and validated in clinical NPC samples. Next, using hub genes and clinical data, we constructed a prediction model for the survival of patients with NPC who underwent radiotherapy, and the performance of the model was evaluated.
Results
Identification of hub genes of NPC
The GSE61218 and GSE126683 data were normalized and merged after the batch effect was removed. The differentially expressed genes (DEGs) of NPC were screened using the thresholds of p < 0.05 and |log2 fold change| > 1. The results revealed 2080 DEGs between the normal and tumor groups; 736 DEGs were downregulated in the tumor group, whereas 1344 DEGs were upregulated in the tumor group (Fig. 1A). The results of the Gene Ontology (GO) enrichment analysis indicated that the DEGs were significantly associated with deoxyribonucleic acid (DNA) replication, mitotic nuclear division, and nuclear division (Fig. 1B). In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG)18,19 enrichment analysis revealed that the DEGs were significantly related to signalling pathways, including the cell cycle, DNA replication, p53 signalling pathway, and mismatch repair (Fig. 1C). The hub genes were subsequently identified by applying different cytoHubba ranking methods to the protein–protein interaction network of the DEGs (Fig. 1D, E). AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes, all of which were significantly upregulated in the tumor group (Fig. 1F).
Hub genes of NPC. (A) Volcano plot of the DEGs. (B) Results of the GO enrichment analysis of the DEGs. (C) Results of the KEGG (www.kegg.jp/kegg/kegg1.html) enrichment analysis of the DEGs. (D) Flowchart of identification of hub genes. (E) Hub genes were identified by taking the intersection of the top 20 genes derived from different methods of cytoHubba. (F) The expression levels of the hub genes in GSE61218 and GSE126683. ***, p < 0.001 vs. the normal group.
High expression levels of hub genes were significantly associated with a low overall survival rate
To further explore the potential roles of these hub genes in predicting survival, we collected tumor samples from 120 patients with NPC. The baseline data of these patients are displayed in Table 1. The median follow-up time was 2669 days, and 26 patients died during follow-up. The number and size of tumors were significantly greater in the Death group. Similarly, the T, N, and M stages of NPC were significantly greater in the Death group. In addition, there was no significant difference in the pathological type between the two groups of patients. The expression levels of the hub genes were subsequently validated in clinical samples. Immunofluorescence staining revealed that the protein expression levels of AURKA, BUB1, and CDK1 were significantly upregulated in the NPC tissues of patients in the Death group, whereas the expression levels of CCNA2 and CCNB2 did not significantly differ between the NPC tissues from the Control group and the Death group (Fig. 2A–E). Thereafter, the relationships between these hub genes and the overall survival rate were determined. The results of the log-rank test suggested that the overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly lower than that of patients with low expression levels of AURKA, BUB1, or CDK1, while the expression level of CCNA2 or CCNB2 was not significantly correlated with the overall survival rate (Fig. 2F–J). These results indicated that the expression levels of AURKA, BUB1, and CDK1 might be predictive factors for the survival of patients with NPC.
High expression levels of hub genes were significantly associated with a low overall survival rate. (A–E) Representative images of immunofluorescence staining for AURKA, BUB1, CDK1, CCNA2, and CCNB2 in NPC samples. (F–J) Kaplan–Meier curves of patients with different expression levels of AURKA, BUB1, CDK1, CCNA2, and CCNB2. Control: nasopharyngeal carcinoma tissues from patients who survived during follow-up. Death: nasopharyngeal carcinoma tissues from patients who died during follow-up. ns, not significant. **, p < 0.01 vs. Control group. ***, p < 0.001 vs. Control group.
Construction and evaluation of the survival prediction model based on the hub genes
We constructed a survival prediction model based on the hub genes. Age and gender were included in the prediction model as variables. The other variables were screened by univariate Cox regression analysis with a threshold of p < 0.15 (Supplementary Table 1). Variables that were inconsistent with clinical experience were excluded (Supplementary Table 1). Finally, gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression were used to construct the survival prediction model (Model 1; Table 2). The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) of Model 1 were calculated, and the AUC values for predicting survival at 1500, 2000, and 3000 days were 0.832, 0.927, and 0.939, respectively, indicating good predictive ability (Fig. 3A). To further explore whether the inclusion of BUB1 and AURKA improved the performance of the prediction model, we constructed another prediction model (Model 2; Supplementary Table 2) that included gender, age, T stage, N stage, and M stage as variables. The ROC curve and AUC of Model 2 were calculated and are displayed in Fig. 3B, and its AUC values were significantly lower than those of Model 1 (p < 0.001). The calibration plot indicated that Model 1 was better calibrated than Model 2 (Fig. 3C). The net reclassification index (NRI) and integrated discrimination improvement index (IDI) of Model 1 vs. Model 2 were then calculated, and the values of the NRI and IDI were 0.233 and 0.194, respectively, both of which indicated that the inclusion of BUB1 and AURKA significantly improved the performance of the prediction model (Table 3). Finally, the results of decision curve analysis (DCA) revealed that compared with Model 2, Model 1 resulted in greater clinical net benefits (Fig. 3D).
In conclusion, the inclusion of BUB1 and AURKA significantly improved the discriminating ability, predictive ability, and clinical utility of the prediction model.
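The net benefit that underlies a decision curve can be illustrated with a minimal sketch. It assumes a binary outcome at a fixed time horizon with no censoring, which is simpler than the survival setting analysed in this study; the function names are hypothetical, and this is not the implementation of the “rmda” package.

```python
from typing import Sequence


def net_benefit(y: Sequence[int], risk: Sequence[float], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold.

    y is 1 for an event (e.g. death by the chosen horizon) and 0 otherwise.
    Assumption: no censoring, so every outcome is fully observed.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y)
    true_pos = sum(1 for yi, r in zip(y, risk) if r >= threshold and yi == 1)
    false_pos = sum(1 for yi, r in zip(y, risk) if r >= threshold and yi == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y) / len(y)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    outcome = [1, 1, 0, 0]
    predicted = [0.9, 0.6, 0.4, 0.1]
    for pt in (0.2, 0.5, 0.8):
        print(pt, net_benefit(outcome, predicted, pt), net_benefit_treat_all(outcome, pt))
```

A decision curve plots this quantity over a range of threshold probabilities; a model is considered more useful where its curve lies above both the treat-all and the treat-none (net benefit of zero) strategies.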
Evaluation of the prediction model. (A) ROC curve and AUC of the prediction model including hub genes as variables (Model 1). (B) ROC curve and AUC of the prediction model not including hub genes as variables (Model 2). (C) Calibration plots of Model 1 and Model 2. (D) The decision curves of Model 1 and Model 2.
Discussion
NPC is endemic to Southeast Asia and North Africa2. The incidence of NPC has remained static in Southern China and poses a serious threat to people’s health5. Treatments for NPC have been developed in the past two decades7, but there is a lack of biomarkers that can accurately predict the treatment outcome and survival of patients with NPC. In this study, the DEGs of NPC were first analysed using Gene Expression Omnibus (GEO) gene expression datasets. AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were subsequently identified as hub genes using different ranking methods applied to the protein‒protein interaction network. The expression levels of AURKA, BUB1, and CDK1 were significantly increased in samples from patients who died during follow-up. The overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly lower than that of patients with low expression levels of these genes. Finally, a survival prediction model was constructed with baseline clinical data and hub gene expression levels. The performance of the model was evaluated, and the results revealed that the prediction model has good discriminating ability, predictive ability, and clinical utility.
Infection with EBV is the primary pathogenic factor of NPC20. EBV mainly exists as a latent infection in NPC and expresses viral proteins, including EBNA1, LMP1, and LMP2, which play crucial roles in the tumorigenesis and development of NPC20. Among these viral proteins, LMP1 is regarded as one of the most important oncogenic proteins21. LMP1 can mimic the function of tumor necrosis factor receptor (TNFR) and activate the NF-κB, ERK/MAPK, JNK, JAK-STAT, p38/MAPK, and PI3K/Akt pathways, all of which promote tumor cell survival and proliferation21. In addition, LMP1 can regulate the expression of proinflammatory factors such as interleukin-6 (IL-6), IL-8, and macrophage inflammatory protein 1-α22. These proinflammatory factors can recruit T cells and macrophages and significantly affect the tumor microenvironment22. Moreover, proinflammatory factors can also induce the growth, migration and invasion of tumor cells22. In addition, LMP1 participates in the reprogramming of glycolysis to provide enough energy for the proliferation of tumor cells23. Infection with EBV is also related to anoikis resistance and immune evasion11,23. EBNA1 is expressed in most EBV-associated tumors and significantly contributes to the maintenance, replication, and transcription of the viral genome11. In this study, the DEGs of NPC were found to be significantly enriched in biological processes such as DNA replication and the cell cycle. These results further support the hypothesis that EBV infection induces the expression of virus-related oncogenes, which lead to the progression and development of NPC.
Since the DEGs of NPC were significantly related to DNA replication and the cell cycle of tumor cells, the authors speculated that the hub genes of the DEGs might be potential prognostic biomarkers for patients with NPC. AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes, and all the hub genes were upregulated in the tumor group. AURKA and AURKB are members of the serine/threonine kinase family and share a highly conserved catalytic domain containing autophosphorylation sites24. Moreover, both AURKA and AURKB play crucial roles in the cell cycle24,25. According to the Cancer Genome Atlas (TCGA) UALCAN database, AURKA is expressed in many kinds of tumors, such as rectal adenocarcinoma24. AURKA expression is significantly upregulated in bladder urothelial carcinoma, invasive breast carcinoma, cholangiocarcinoma, and colon adenocarcinoma tissues compared with corresponding normal tissues24. AURKB is also significantly upregulated in a variety of tumors24. The function of AURKB is similar to that of AURKA24. Therefore, the authors focused only on AURKA for further exploration in this study. BUB1 and its paralogue BUB1B are members of the spindle assembly checkpoint (SAC) protein family, both of which can prevent premature mitotic chromosome segregation and reduce aneuploidy26. The interaction between BUB1 and BUB1B is mediated by a conserved N-terminal region. This interaction is important for the localization of the mitotic checkpoint kinetochore26. Mutations in the BUB1 and BUB1B genes have been identified in tumors27. The upregulation of BUB1 can induce the proliferation and invasion of gastric tumor cells via the Wnt/β-catenin signalling pathway, whereas the downregulation of BUB1 can lead to S-phase arrest in liver tumor cells28,29. Similarly, the upregulation of BUB1B was related to the proliferation of myeloma cells via the CDC20/CCNB signalling pathway30.
Considering the similarity in the functions of BUB1 and BUB1B, the authors selected BUB1 for further evaluation. CCNA2 is a member of the Cyclin A family and participates in cell cycle regulation31. Studies have suggested that CCNA2 is involved in the occurrence and progression of many types of tumors through the induction of epithelial–mesenchymal transformation and metastasis32. CCNB2 is a member of the cell cycle protein family and primarily controls the G2/M phase transition33. Many studies have shown that CCNB2 is aberrantly expressed in a variety of tumors, including glioblastoma and non-small cell lung cancer34,35. In addition, the upregulation of CCNB2 was associated with an accelerated proliferation rate of tumor cells35. CDK1 is a member of the cyclin-dependent kinase family and is a serine/threonine kinase that forms a complex with cyclin proteins to regulate the cell cycle36. In addition, CDK1 is the only CDK in mammals that is necessary for the cell cycle and induces G2/M and G1/S transitions and G1 progression36. The dysregulation of CDK1 leads to unrestricted cell proliferation, which ultimately results in the occurrence of tumors36.
Since these hub genes are closely related to the progression of tumors, the authors speculated that these hub genes might have potential predictive value for the survival of patients with NPC. Therefore, the authors further validated the expression levels of the hub genes in patients. The authors collected tumor samples from 120 patients with NPC. The median follow-up time was 2669 days, and 26 patients died during follow-up. The results of multiplex immunofluorescence showed that the expression levels of AURKA, BUB1, and CDK1 were significantly upregulated in the Death group of patients with NPC. Afterwards, the authors constructed a survival prediction model based on hub genes and baseline clinical data. Gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression were selected to construct the survival prediction model. As mentioned above, the upregulation of BUB1 contributes to the development of tumors. Zhang et al. reported that BUB1 expression is upregulated in endometrial cancer, is significantly related to the infiltration of T cells in the tumor microenvironment, and is correlated with the prognosis of patients with endometrial cancer37. In addition, Chen et al. reported that BUB1 was significantly correlated with the overall survival rate of patients with breast cancer and might be a prognostic biomarker for patients with breast cancer38. Moreover, a bioinformatics study indicated that BUB1 has potential predictive value for the survival of patients with NPC39. Similarly, studies have indicated that AURKA expression can predict the outcome of patients with breast cancer40. Studies have also suggested that AURKA might be a potential prognostic biomarker for NPC41. In our study, we validated the predictive values of BUB1 and AURKA in a clinical cohort.
Finally, the performance of the final prediction model was evaluated using ROC, AUC, calibration plot, and DCA, and the results revealed that the prediction model has good discriminating ability, predictive ability, and clinical utility.
However, there were several limitations in this study. First, owing to the lack of healthy nasopharyngeal tissue samples, the protein expression levels of the hub genes were not compared between healthy nasopharyngeal tissue and NPC tissue. Second, the sample size of the study population was relatively small, and multicentre studies are needed in future research. In addition, the functions of AURKA and BUB1 in NPC cell lines have not been explored.
Conclusion
In this study, the DEGs of NPC were identified and analysed, and they were found to be significantly enriched in biological processes such as DNA replication. Then, AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes. The overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly reduced. Finally, a survival prediction model was constructed with hub genes and clinical data, which had good discriminating ability, predictive ability, and clinical utility.
Methods
Identifying hub genes
The Gene Expression Omnibus database (GEO database, https://www.ncbi.nlm.nih.gov/gds) was searched for gene expression datasets of NPC according to the following criteria: (1) Search term: Nasopharyngeal carcinoma, (2) Top Organisms: Homo sapiens, (3) Study type: Expression profiling by array, (4) Attribute name: Tissue, (5) Sample count: From 6 to 1000, and (6) datasets containing both NPC tissues and normal healthy nasopharyngeal tissues. GSE61218 and GSE126683 met all the criteria. The gene expression datasets of GSE61218 and GSE126683 were subsequently downloaded from the GEO database. The GSE61218 dataset contains six normal healthy nasopharyngeal tissue samples and ten NPC samples. The GSE126683 dataset contains three normal healthy nasopharyngeal tissue samples and three NPC samples. The raw data of these datasets were normalized using the “lumi” package in R software (R Core Team; R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing; Vienna, Austria; https://www.R-project.org/; version 4.1.2). Afterwards, the data were annotated using the “dplyr” and “limma” packages in R software. The batch effect between the datasets was removed using the “sva” package in R software, and the datasets were merged for further analysis.
The differentially expressed genes (DEGs) were identified using the “limma” package in R software with thresholds of p < 0.05 and |log2 fold change| > 1. A protein‒protein interaction network of the DEGs was constructed using the Search Tool for the Retrieval of Interacting Genes online tool (STRING, https://cn.string-db.org/). The minimum required interaction score applied in STRING was medium confidence (0.400). The full STRING network, which included both functional and physical associations, was used in this study. In addition, the active interaction sources included text mining, experiments, databases, coexpression, neighbourhood, gene fusion, and co-occurrence. The protein‒protein interaction network of the DEGs was then visualized using Cytoscape software (version 3.9.1)42. The score of each node in the network was calculated using cytoHubba (ver. 0.1), which is a plugin of Cytoscape software43. In this study, the top 20 DEGs were identified using the “degree”, “EPC”, “MCC”, and “MNC” methods via cytoHubba. The “degree” method takes the number of directly connected edges of a node as the core indicator to identify key nodes with the most direct connections in the network. The “EPC” method focuses on the anti-interference stability of the network and identifies nodes that can maintain the connectivity of network components. The “MCC” method captures key nodes with both connectivity and local network density advantages by evaluating the central position of nodes in the maximal cliques of the network. The “MNC” method measures the centrality of nodes in their maximum neighbourhood components to explore key nodes that play a leading role in local subnetworks. Finally, the intersection of the four sets of the top 20 genes was taken to obtain the hub genes.
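The thresholding and intersection steps described above can be sketched in a few lines of Python. This is an illustration only, not the R/Cytoscape workflow used in this study: the function names and toy inputs are hypothetical, only the “degree” score is computed here, and the “EPC”, “MCC”, and “MNC” rankings are assumed to be supplied as precomputed ordered lists.

```python
from typing import Dict, Iterable, List, Set, Tuple


def filter_degs(stats: Dict[str, Tuple[float, float]],
                p_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Dict[str, str]:
    """Keep genes with p < p_cutoff and |log2 fold change| > lfc_cutoff.

    stats maps gene -> (log2 fold change, p value); returns gene -> "up"/"down".
    """
    degs = {}
    for gene, (log2fc, p_value) in stats.items():
        if p_value < p_cutoff and abs(log2fc) > lfc_cutoff:
            degs[gene] = "up" if log2fc > 0 else "down"
    return degs


def degree_ranking(edges: Iterable[Tuple[str, str]], top_n: int = 20) -> List[str]:
    """Rank nodes by the number of directly connected edges (degree)."""
    degree: Dict[str, int] = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Ties are broken alphabetically so that the result is deterministic.
    return sorted(degree, key=lambda g: (-degree[g], g))[:top_n]


def hub_genes(rankings: Dict[str, List[str]], top_n: int = 20) -> Set[str]:
    """Intersect the top_n genes of every ranking method."""
    top_sets = [set(genes[:top_n]) for genes in rankings.values()]
    return set.intersection(*top_sets) if top_sets else set()


if __name__ == "__main__":
    # Toy values, invented for illustration only.
    toy_stats = {"G1": (2.1, 0.001), "G2": (0.4, 0.0001), "G3": (-1.6, 0.03)}
    print(filter_degs(toy_stats))
    toy_edges = [("G1", "G3"), ("G1", "G4"), ("G3", "G4"), ("G1", "G5")]
    toy_rankings = {
        "degree": degree_ranking(toy_edges, top_n=2),
        "MCC": ["G3", "G1", "G4"],  # hypothetical precomputed ranking
    }
    print(hub_genes(toy_rankings, top_n=2))
```

Taking the intersection rather than the union keeps only genes that every ranking method places among its top candidates, which is a conservative way to define hub genes.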
Study population
This study was approved by the Ethics Committee of Xiangya Hospital, Central South University, and was performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants. Patients with NPC who underwent radiotherapy at Xiangya Hospital, Central South University, from 2008 to 2013 were included in this study. The exclusion criteria were as follows: age ≤ 18 years or ≥ 80 years, a diagnosis of other tumors, pregnancy, and missing data. NPC samples were obtained by biopsy before treatment. Baseline data and clinical data were recorded. Telephone follow-ups were conducted.
Multiplex immunofluorescence
Formalin-fixed, paraffin-embedded NPC slides were deparaffinized and hydrated. Multiplex immunofluorescence was conducted using an Opal™ 7-Color Manual IHC Kit according to the manufacturer’s recommended procedures. Briefly, the slides were heated in AR buffer using a microwave. After they cooled, the slides were blocked with normal goat serum. The slides were then incubated with the primary antibody overnight. After being rinsed, the slides were incubated with the secondary Polymer HRP Ms + Rb and Opal Fluorophore Working Solution to generate specific Opal signals for each target. After that, the slides were heated in AR buffer using a microwave to strip the primary–secondary–HRP complex, allowing the introduction of the next primary antibody. Then, the same procedure was repeated, starting with the blocking agent and followed by primary antibody incubation and secondary Polymer HRP Ms + Rb and Opal Fluorophore Working Solution incubation, to generate specific Opal signals for all targets. Finally, DAPI Working Solution was applied to the slides, and the slides were then mounted with mounting medium. Images were obtained via Vectra Quantitative Pathology Imaging Systems, and the fluorescence intensity was measured using ImageJ software. The primary antibodies used in this study were as follows: anti-Aurora A (1:200 dilution; #14475; Cell Signaling Technology), anti-BUB1 (1:200 dilution; #94244; Cell Signaling Technology), anti-CDK1 (1:200 dilution; #9116; Cell Signaling Technology), anti-CCNA2 (1:200 dilution; #67955; Cell Signaling Technology), and anti-CCNB2 (1:200 dilution; #12231; Cell Signaling Technology).
Construction and evaluation of prediction model
Age and gender were included in the prediction model as variables. Univariate Cox regression analysis was then used to screen the other variables for the prediction model with a threshold of p < 0.15. Multivariate Cox regression analysis was used to construct a prediction model with gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression. The discriminating ability of the prediction model was first evaluated by the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) using the “ROCR” package in R software (version 4.1.2). In addition, the differences between the AUCs of the different models were compared by the DeLong method using the “pROC” package in R software (version 4.1.2). Afterwards, a calibration plot with the “boot” method using 1000 replications was applied to evaluate the calibration of the models using the “rms” package in R software (version 4.1.2). The net reclassification index (NRI) and integrated discrimination improvement index (IDI) were further utilized to evaluate the additional predictive ability of the model after the inclusion of the hub genes by using the “nricens” and “PredictABEL” packages in R software (version 4.1.2). Finally, decision curve analysis (DCA) was applied to analyse the clinical utility of the models using the “rmda” package in R software (version 4.1.2).
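To make the NRI and IDI concrete, the following minimal sketch computes both indices for a binary outcome from the predicted risks of an old and a new model. Several things are assumptions here: the outcome is treated as fully observed (no censoring), the category-free (continuous) form of the NRI is used because the form applied in this study is not stated, and the function names are hypothetical. The “nricens” and “PredictABEL” packages are not reproduced.

```python
from typing import List, Sequence, Tuple


def _split(y: Sequence[int], p_old: Sequence[float], p_new: Sequence[float]
           ) -> Tuple[List[Tuple[float, float]], List[Tuple[float, float]]]:
    """Split (old risk, new risk) pairs into events (y == 1) and nonevents."""
    events = [(o, n) for yi, o, n in zip(y, p_old, p_new) if yi == 1]
    nonevents = [(o, n) for yi, o, n in zip(y, p_old, p_new) if yi == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one nonevent")
    return events, nonevents


def continuous_nri(y: Sequence[int], p_old: Sequence[float],
                   p_new: Sequence[float]) -> float:
    """Category-free NRI: net upward moves in events plus net downward moves in nonevents."""
    events, nonevents = _split(y, p_old, p_new)
    up_e = sum(1 for o, n in events if n > o) / len(events)
    down_e = sum(1 for o, n in events if n < o) / len(events)
    up_ne = sum(1 for o, n in nonevents if n > o) / len(nonevents)
    down_ne = sum(1 for o, n in nonevents if n < o) / len(nonevents)
    return (up_e - down_e) + (down_ne - up_ne)


def idi(y: Sequence[int], p_old: Sequence[float], p_new: Sequence[float]) -> float:
    """IDI: change in the difference of mean predicted risk (events minus nonevents)."""
    events, nonevents = _split(y, p_old, p_new)

    def mean(values: List[float]) -> float:
        return sum(values) / len(values)

    slope_new = mean([n for _, n in events]) - mean([n for _, n in nonevents])
    slope_old = mean([o for o, _ in events]) - mean([o for o, _ in nonevents])
    return slope_new - slope_old


if __name__ == "__main__":
    # Toy risks, invented for illustration only.
    outcome = [1, 1, 0, 0]
    risk_without_genes = [0.6, 0.4, 0.5, 0.3]
    risk_with_genes = [0.8, 0.5, 0.4, 0.2]
    print(continuous_nri(outcome, risk_without_genes, risk_with_genes))
    print(idi(outcome, risk_without_genes, risk_with_genes))
```

Positive values of either index indicate that the new model moves predicted risks in the right direction more often, or by a larger margin, than the old model.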
Statistical analysis
Statistical analysis was performed using SPSS version 19 (IBM Corporation, Armonk, NY, USA), R software (version 4.1.2), and GraphPad (version 8.0). Continuous data are expressed as means ± standard deviations (SDs). Count data are expressed as frequencies (percentages). Student’s t test was used to compare continuous data with a normal distribution between different groups, and the Mann–Whitney U test was used to compare continuous data with a nonnormal distribution. For count data, the chi-square test was used to compare the difference in frequencies between groups. Cox regression analysis was used to construct the prediction models. The performance of the models was determined using ROC curves, AUC, NRI, IDI, and DCA. Kaplan–Meier curves with the log-rank test were used to compare the survival rates between different groups. A value of p < 0.05 was considered to indicate statistical significance.
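The Kaplan–Meier estimate and the two-group log-rank test mentioned above can be illustrated with a short, dependency-free sketch. It is a textbook formulation under stated assumptions (right censoring only, two groups, chi-square statistic with one degree of freedom) and is not the SPSS, R, or GraphPad implementation used in this study; the function names and toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]
                 ) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]
                 ) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p value)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e, _ in pooled if e == 1}):
        n_a = sum(1 for ti, _, g in pooled if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in pooled if ti >= t and g == 1)
        d_a = sum(1 for ti, e, g in pooled if ti == t and e == 1 and g == 0)
        d = sum(1 for ti, e, _ in pooled if ti == t and e == 1)
        n = n_a + n_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("log-rank variance is zero; the groups cannot be compared")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Toy follow-up times in days, invented for illustration only.
    high = ([300, 650, 900, 1400], [1, 1, 1, 0])
    low = ([1200, 2000, 2600, 2700], [1, 0, 0, 0])
    print(kaplan_meier(*high))
    print(logrank_test(*high, *low))
```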
Data availability
The gene expression datasets GSE61218 and GSE126683 are available in the GEO repository (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE61218 and https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE126683). Requests for data from this study should be addressed to the corresponding author, Yubin Chen.
References
Tang, L. L. et al. Validation of the 8th edition of the UICC/AJCC staging system for nasopharyngeal carcinoma from endemic areas in the intensity-modulated radiotherapy era. J. Natl. Compr. Canc Netw. 15(7), 913–919 (2017).
Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68(6), 394–424 (2018).
Chen, Y. P. et al. Nasopharyngeal carcinoma. Lancet 394(10192), 64–80 (2019).
Tang, L. L. et al. Global trends in incidence and mortality of nasopharyngeal carcinoma. Cancer Lett. 374(1), 22–30 (2016).
Wei, K. R. et al. Epidemiological trends of nasopharyngeal carcinoma in China. Asian Pac. J. Cancer Prev. 11(1), 29–32 (2010).
Luo, W. Nasopharyngeal carcinoma ecology theory: cancer as multidimensional spatiotemporal unity of ecology and evolution pathological ecosystem. Theranostics 13(5), 1607–1631 (2023).
Li, W. et al. Immunotherapeutic approaches in EBV-associated nasopharyngeal carcinoma. Front. Immunol. 13, 1079515 (2022).
Tsang, C. M., Lui, V. W. Y., Bruce, J. P., Pugh, T. J. & Lo, K. W. Translational genomics of nasopharyngeal cancer. Semin Cancer Biol. 61, 84–100 (2020).
Dai, W. et al. Whole-exome sequencing identifies MST1R as a genetic susceptibility gene in nasopharyngeal carcinoma. Proc. Natl. Acad. Sci. U S A. 113(12), 3317–3322 (2016).
Li, Y. Y. et al. Exome and genome sequencing of nasopharynx cancer identifies NF-κB pathway activating mutations. Nat. Commun. 8, 14121 (2017).
Hau, P. M. et al. Targeting Epstein-Barr virus in nasopharyngeal carcinoma. Front. Oncol. 10, 600 (2020).
Wong, K. C. W. et al. Nasopharyngeal carcinoma: an evolving paradigm. Nat. Rev. Clin. Oncol. 18(11), 679–695 (2021).
Peng, G. et al. A prospective, randomized study comparing outcomes and toxicities of intensity-modulated radiotherapy vs. conventional two-dimensional radiotherapy for the treatment of nasopharyngeal carcinoma. Radiother Oncol. 104(3), 286–293 (2012).
Co, J., Mejia, M. B. & Dizon, J. M. Evidence on effectiveness of intensity-modulated radiotherapy versus 2-dimensional radiotherapy in the treatment of nasopharyngeal carcinoma: Meta-analysis and a systematic review of the literature. Head Neck. 38(Suppl 1), E2130–E2142 (2016).
Mao, Y. P. et al. Prognostic factors and failure patterns in non-metastatic nasopharyngeal carcinoma after intensity-modulated radiotherapy. Chin. J. Cancer. 35(1), 103 (2016).
Zhang, B. et al. Intensity-modulated radiation therapy versus 2D-RT or 3D-CRT for the treatment of nasopharyngeal carcinoma: A systematic review and meta-analysis. Oral Oncol. 51(11), 1041–1046 (2015).
Colevas, A. D. et al. NCCN Guidelines® insights: head and neck cancers, version 2.2025. J. Natl. Compr. Canc Netw. 23(2), 2–11 (2025).
Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. 53(D1), D672–D677 (2025).
Kanehisa, M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 28(11), 1947–1951 (2019).
Young, L. S., Yap, L. F. & Murray, P. G. Epstein-Barr virus: more than 50 years old and still providing surprises. Nat. Rev. Cancer. 16(12), 789–802 (2016).
Shair, K. H. Y., Reddy, A. & Cooper, V. S. New insights from elucidating the role of LMP1 in nasopharyngeal carcinoma. Cancers 10(4) (2018).
Yi, M. et al. Rediscovery of NF-κB signaling in nasopharyngeal carcinoma: how genetic defects of NF-κB pathway interplay with EBV in driving oncogenesis? J. Cell. Physiol. 233(8), 5537–5549 (2018).
Lo, A. K., Dawson, C. W., Lung, H. L., Wong, K. L. & Young, L. S. The role of EBV-encoded LMP1 in the NPC tumor microenvironment: from function to therapy. Front. Oncol. 11, 640207 (2021).
Du, R., Huang, C., Liu, K., Li, X. & Dong, Z. Targeting AURKA in cancer: molecular mechanisms and opportunities for cancer therapy. Mol. Cancer. 20(1), 15 (2021).
Otto, T. & Sicinski, P. Cell cycle proteins as promising targets in cancer therapy. Nat. Rev. Cancer. 17(2), 93–115 (2017).
Skowyra, A., Allan, L. A., Saurin, A. T. & Clarke, P. R. USP9X limits mitotic checkpoint complex turnover to strengthen the spindle assembly checkpoint and guard against chromosomal instability. Cell Rep. 23(3), 852–865 (2018).
Kim, T. & Gartner, A. Bub1 kinase in the regulation of mitosis. Anim. Cells Syst. (Seoul). 25(1), 1–10 (2021).
Grabsch, H. et al. Overexpression of the mitotic checkpoint genes BUB1, BUBR1, and BUB3 in gastric cancer–association with tumour cell proliferation. J. Pathol. 200(1), 16–22 (2003).
Qiu, J. et al. BUB1B promotes hepatocellular carcinoma progression via activation of the mTORC1 signaling pathway. Cancer Med. 9(21), 8159–8172 (2020).
Zhou, X. et al. BUB1B (BUB1 mitotic checkpoint serine/threonine kinase B) promotes lung adenocarcinoma by interacting with zinc finger protein ZNF143 and regulating glycolysis. Bioengineered 13(2), 2471–2485 (2022).
He, Q. et al. Smoking-induced CCNA2 expression promotes lung adenocarcinoma tumorigenesis by boosting AT2/AT2-like cell differentiation. Cancer Lett. 592, 216922 (2024).
Ershov, P., Poyarkov, S., Konstantinova, Y., Veselovsky, E. & Makarova, A. Transcriptomic signatures in colorectal cancer progression. Curr. Mol. Med. 23(3), 239–249 (2023).
Hu, M. et al. Knockdown of CCNB2 inhibits the tumorigenesis of gastric cancer by regulation of the PI3K/Akt pathway. Sci. Rep. 15(1), 5703 (2025).
Takashima, S. et al. Strong expression of cyclin B2 mRNA correlates with a poor prognosis in patients with non-small cell lung cancer. Tumour Biol. 35(5), 4257–4265 (2014).
Wang, D. et al. CCNB2 is a novel prognostic factor and a potential therapeutic target in low-grade glioma. Biosci. Rep. 42(1) (2022).
Wang, Q., Bode, A. M. & Zhang, T. Targeting CDK1 in cancer: mechanisms and implications. NPJ Precis. Oncol. 7(1), 58 (2023).
Zhang, H., Li, Y. & Lu, H. Correlation of BUB1 and BUB1B with the development and prognosis of endometrial cancer. Sci. Rep. 14(1), 17084 (2024).
Chen, D. L., Cai, J. H. & Wang, C. C. N. Identification of key prognostic genes of triple negative breast cancer by LASSO-Based machine learning and bioinformatics analysis. Genes. 13(5) (2022).
Liu, K., Kang, M., Zhou, Z., Qin, W. & Wang, R. Bioinformatics analysis identifies hub genes and pathways in nasopharyngeal carcinoma. Oncol. Lett. 18(4), 3637–3645 (2019).
Kahl, I. et al. The cell cycle-related genes RHAMM, AURKA, TPX2, PLK1, and PLK4 are associated with the poor prognosis of breast cancer patients. J. Cell. Biochem. 123(3), 581–600 (2022).
Jiang, D. et al. AURKA, as a potential prognostic biomarker, regulates autophagy and immune infiltration in nasopharyngeal carcinoma. Immunobiology 228(2), 152314 (2023).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13(11), 2498–2504 (2003).
Chin, C. H. et al. CytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8(Suppl 4), S11 (2014).
Acknowledgements
The authors thank Yangjie Zhou from Xiangya Hospital, Central South University, for help and advice during this study.
Funding
This study was funded by the National Natural Science Foundation of China (82370642 and 82501919) and Natural Science Foundation of Hunan (2024JJ5611 and 2025JJ60533).
Author information
Authors and Affiliations
Contributions
JFZ and YZF conducted the experiments, performed the data analysis, prepared the figures, and wrote the manuscript. ZWZ conducted the bioinformatics analysis. YBC and CET designed the study and reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhu, J., Feng, Y., Zhu, Z. et al. Identification of hub genes and construction of a survival prediction model for patients with nasopharyngeal carcinoma. Sci Rep 16, 5299 (2026). https://doi.org/10.1038/s41598-026-36395-4
DOI: https://doi.org/10.1038/s41598-026-36395-4