Identification of hub genes and construction of a survival prediction model for patients with nasopharyngeal carcinoma

Zhu, Jinfei; Feng, Yizhuo; Zhu, Zhanwei; Chen, Yubin; Tang, Can-e

doi:10.1038/s41598-026-36395-4

Article
Open access
Published: 16 January 2026

Scientific Reports volume 16, Article number: 5299 (2026)

Abstract

The incidence of nasopharyngeal carcinoma (NPC) has remained static in southern provinces of China and poses a serious threat to public health. Biomarkers and prediction models that can accurately predict the survival of patients with NPC are lacking. In this study, the hub genes of NPC were identified using the gene expression datasets GSE61218 and GSE126683. The expression levels of the hub genes were subsequently determined in clinical samples, and the relationships between the expression levels of the hub genes and patient survival were analysed. Finally, a survival prediction model was constructed using clinical data and hub genes as variables, and the performance of the prediction model was evaluated. AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes, all of which were significantly upregulated in tumor tissues. The expression levels of AURKA, BUB1, and CDK1 were significantly upregulated in NPC samples from patients in the Death group. The results of the log-rank test suggested that the overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly reduced. Finally, a survival prediction model was constructed using gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression as variables. The results of the receiver operating characteristic (ROC) curve, area under the ROC curve, calibration plot, net reclassification index, integrated discrimination improvement index, and decision curve analysis revealed that the model had good discriminating ability, predictive ability, and clinical utility. In conclusion, AURKA, BUB1, and CDK1 are potential prognostic biomarkers of NPC, and a prediction model incorporating the expression levels of AURKA and BUB1 has good discriminating ability, predictive ability, and clinical utility.

Introduction

Nasopharyngeal carcinoma (NPC) arises from the nasopharyngeal mucosal lining and is an epithelial carcinoma frequently observed at the pharyngeal recess¹. The geographical distribution of NPC is extremely unbalanced, with most new cases of NPC occurring in East and Southeast Asia². The age-standardized rate of NPC is approximately 3.0 per 100,000 in China, while in populations that are mainly white, the rate is approximately 0.4 per 100,000². The risk factors for NPC include Epstein–Barr virus (EBV) infection, host genetics, environmental factors, and dietary patterns, all of which contribute to the remarkable geographical distribution of NPC³. It has been reported that the incidence of NPC has gradually declined in some regions, such as North America and Nordic countries⁴. However, the incidence of NPC has remained static over the past two decades in some southern provinces of mainland China, placing a burden on the medical system⁵.

The main subtypes of NPC include keratinizing squamous, nonkeratinizing squamous, and basaloid squamous cell carcinoma, among which nonkeratinizing squamous cell carcinoma is the most common⁶. The current hypothesis concerning the pathogenesis of NPC is that nasopharyngeal epithelial cells are infected by EBV and express various viral oncogenes, leading to transformation to an invasive cellular phenotype and NPC progression⁷. The upregulation of cyclin D1 (CCND1) and/or inactivation of tumor suppressor genes such as transforming growth factor beta receptor 2 (TGFBR2) results in persistent infection with EBV, which promotes unlimited cellular proliferation, resistance to apoptosis, immune dysregulation, inflammation, and genome instability⁸. A whole-genome sequencing study revealed that the upregulation of EBV-encoded latent membrane protein-1 (LMP-1) activates the nuclear factor kappa B (NF-κB) signalling pathway, which is the key oncogenic driver of NPC^9,10. In addition, the overexpression of EBV-encoded BNLF2a causes immune evasion and contributes to the progression of NPC¹¹. Recent studies have also demonstrated that the transforming growth factor β (TGFβ), phosphatidylinositol-3 kinase (PI3K), and mitogen-activated protein kinase (MAPK) signalling pathways play important roles in the tumorigenesis of NPC¹². Despite these findings, the exact mechanism underlying the tumorigenesis of NPC is not yet clear and needs further exploration.

Owing to the high sensitivity of NPC to ionizing radiation, radiotherapy is the key treatment for NPC¹³. With the development of technology, radiotherapy has progressed from traditional two-dimensional radiotherapy to three-dimensional conformal radiotherapy and then to intensity-modulated radiotherapy¹⁴. Intensity-modulated radiotherapy is currently the most widely used technique and can reduce the 5-year recurrence rate in patients with NPC¹⁵. Moreover, compared with two-dimensional and three-dimensional radiotherapy, intensity-modulated radiotherapy is significantly associated with better 5-year locoregional control and overall survival¹⁶. Currently, radiotherapy combined with chemotherapy is important for treating locoregionally advanced NPC¹⁷.

However, biomarkers that can accurately predict the outcome after treatment and the survival of patients with NPC are lacking³. In this study, the hub genes of NPC were identified using bioinformatics analysis and validated in clinical NPC samples. Next, using hub genes and clinical data, we constructed a prediction model for the survival of patients with NPC who underwent radiotherapy, and the performance of the model was evaluated.

Results

Identification of hub genes of NPC

The GSE61218 and GSE126683 data were normalized and merged after the batch effect was removed. The differentially expressed genes (DEGs) of NPC were screened using the thresholds of p < 0.05 and |log2 fold change| > 1. The results revealed 2080 DEGs between the normal and tumor groups; 736 DEGs were downregulated in the tumor group, whereas 1344 DEGs were upregulated in the tumor group (Fig. 1A). The results of the Gene Ontology (GO) enrichment analysis indicated that the DEGs were significantly associated with deoxyribonucleic acid (DNA) replication, mitotic nuclear division, and nuclear division (Fig. 1B). In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG)^18,19 enrichment analysis revealed that the DEGs were significantly related to signalling pathways, including the cell cycle, DNA replication, p53 signalling pathway, and mismatch repair (Fig. 1C). The hub genes were subsequently identified by applying several node-ranking methods to the protein–protein interaction network (Fig. 1D, E). AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes, all of which were significantly upregulated in the tumor group (Fig. 1F).
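The screening thresholds described above can be illustrated with a minimal Python sketch. This is not the authors' limma pipeline in R; the gene names and values below are hypothetical placeholders, and the sketch assumes that a positive log2 fold change means higher expression in the tumor group.

```python
def classify_deg(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label one gene as 'up', 'down', or 'ns' (not significant).

    Uses the thresholds stated in the text: p < 0.05 and |log2 fold change| > 1.
    Assumption: positive log2_fc means higher expression in tumor tissue.
    """
    if p_value < p_cutoff and abs(log2_fc) > fc_cutoff:
        return "up" if log2_fc > 0 else "down"
    return "ns"


# Hypothetical (log2 fold change, p value) pairs; not data from the study.
toy_results = {
    "GENE_A": (2.3, 0.001),   # passes both thresholds, upregulated
    "GENE_B": (-1.4, 0.020),  # passes both thresholds, downregulated
    "GENE_C": (0.6, 0.0001),  # significant p value but small fold change
    "GENE_D": (1.8, 0.200),   # large fold change but p value too high
}
calls = {gene: classify_deg(fc, p) for gene, (fc, p) in toy_results.items()}
print(calls)
```

A gene must pass both thresholds to count as a DEG, which is why GENE_C and GENE_D are labelled as not significant in this toy example.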

High expression levels of hub genes were significantly associated with a low overall survival rate

To further explore the potential roles of these hub genes in predicting survival, we collected tumor samples from 120 patients with NPC. The baseline data of these patients are displayed in Table 1. The median follow-up time was 2669 days, and 26 patients died during follow-up. The number and size of tumors were significantly greater in the Death group. Similarly, the T, N, and M stages of NPC were significantly greater in the Death group. In addition, there was no significant difference in the pathological type between the two groups of patients. The expression levels of the hub genes were subsequently validated in clinical samples. Immunofluorescence staining revealed that the protein expression levels of AURKA, BUB1, and CDK1 were significantly upregulated in the NPC tissues of patients in the Death group, whereas the expression levels of CCNA2 and CCNB2 did not significantly differ between the NPC tissues from the Control group and the Death group (Fig. 2A–E). Thereafter, the relationships between these hub genes and the overall survival rate were determined. The results of the log-rank test suggested that the overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly lower than that of patients with low expression levels of the same gene, while the expression level of CCNA2 or CCNB2 was not significantly correlated with the overall survival rate (Fig. 2F–J). These results indicated that the expression levels of AURKA, BUB1, and CDK1 might be predictive factors for the survival of patients with NPC.
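For readers unfamiliar with the log-rank test, the following is a minimal Python sketch of the standard two-group version. It is an illustration only, not the software used in the study, and the follow-up times and event indicators are hypothetical. The p value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value).

    times_*  : follow-up times (e.g. days)
    events_* : 1 if the patient died at that time, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected deaths in A under the null
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 df
    return chi2, p_value


# Hypothetical example: the "high expression" group dies early,
# the "low expression" group is censored late.
high_t, high_e = [100, 200, 300, 400], [1, 1, 1, 1]
low_t, low_e = [500, 600, 700, 800], [0, 0, 0, 0]
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
print(round(chi2, 3), round(p_value, 4))
```

When the two groups have identical follow-up data, the observed and expected deaths coincide at every event time and the statistic is zero.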

Table 1 Baseline data of patients.

Construction and evaluation of the survival prediction model based on the hub genes

We constructed a survival prediction model based on the hub genes. Age and gender were included in the prediction model as variables. The other variables were screened by univariate Cox regression analysis with a threshold of p < 0.15 (Supplementary Table 1). Variables that were inconsistent with clinical experience were excluded (Supplementary Table 1). Finally, gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression were used to construct the survival prediction model (Model 1; Table 2). The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) of Model 1 were calculated; the AUC values for predicting survival at 1500, 2000, and 3000 days were 0.832, 0.927, and 0.939, respectively, indicating good predictive ability (Fig. 3A). To further explore whether the inclusion of BUB1 and AURKA improved the performance of the prediction model, we constructed another prediction model (Model 2; Supplementary Table 2) that included only gender, age, T stage, N stage, and M stage as variables. The ROC curve and AUC of Model 2 were calculated and are displayed in Fig. 3B, and its AUC values were significantly lower than those of Model 1 (p < 0.001). The calibration plot indicated that Model 1 was better calibrated than Model 2 (Fig. 3C). The net reclassification index (NRI) and integrated discrimination improvement index (IDI) of Model 1 vs. Model 2 were then calculated, and the values of the NRI and IDI were 0.233 and 0.194, respectively, both of which indicated that the inclusion of BUB1 and AURKA significantly improved the performance of the prediction model (Table 3). Finally, the results of decision curve analysis (DCA) revealed that, compared with Model 2, Model 1 resulted in greater clinical net benefits (Fig. 3D). In conclusion, the inclusion of BUB1 and AURKA significantly improved the discriminating ability, predictive ability, and clinical utility of the prediction model.
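To make the NRI and IDI easier to interpret, the sketch below computes the category-free NRI and the IDI for a binary outcome at a fixed time horizon. It is a simplified illustration in Python and not the censoring-aware calculation performed by the R packages named in the Methods; the predicted risks and outcomes are hypothetical, and censoring is ignored by assumption.

```python
def _mean(values):
    """Arithmetic mean; raises if the group is empty."""
    if not values:
        raise ValueError("each outcome group needs at least one patient")
    return sum(values) / len(values)


def idi(risk_old, risk_new, outcome):
    """Integrated discrimination improvement of the new model over the old.

    IDI = (mean risk gain among events) - (mean risk gain among non-events).
    """
    gain_events = [n - o for o, n, y in zip(risk_old, risk_new, outcome) if y == 1]
    gain_nonevents = [n - o for o, n, y in zip(risk_old, risk_new, outcome) if y == 0]
    return _mean(gain_events) - _mean(gain_nonevents)


def continuous_nri(risk_old, risk_new, outcome):
    """Category-free net reclassification index (ranges from -2 to 2).

    Events should move up in predicted risk, non-events should move down.
    """
    def direction(o, n):
        return (n > o) - (n < o)  # +1 up, -1 down, 0 unchanged

    ev = [direction(o, n) for o, n, y in zip(risk_old, risk_new, outcome) if y == 1]
    ne = [direction(o, n) for o, n, y in zip(risk_old, risk_new, outcome) if y == 0]
    return _mean(ev) - _mean(ne)


# Hypothetical predicted risks of death for four patients (1 = died).
outcome = [1, 1, 0, 0]
risk_model_2 = [0.5, 0.5, 0.5, 0.5]      # clinical variables only (toy values)
risk_model_1 = [0.75, 0.75, 0.25, 0.25]  # with added biomarkers (toy values)

toy_idi = idi(risk_model_2, risk_model_1, outcome)
toy_nri = continuous_nri(risk_model_2, risk_model_1, outcome)
print(toy_idi, toy_nri)
```

Positive values of either index mean that the extended model assigns higher risk to patients who died and lower risk to those who survived, which is the direction reported for Model 1 vs. Model 2.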

Table 2 Cox survival model for patients with NPC.

Table 3 Discriminating and predictive ability of different models.

Discussion

NPC is endemic to Southeast Asia and North Africa². The incidence of NPC has remained static in Southern China and poses a serious threat to people’s health⁵. Treatments for NPC have been developed in the past two decades⁷, but there is a lack of biomarkers that can accurately predict the treatment outcome and survival of patients with NPC. In this study, the DEGs of NPC were first analysed using Gene Expression Omnibus (GEO) gene expression datasets. AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were subsequently identified as hub genes from the protein‒protein interaction network using several ranking methods. The expression levels of AURKA, BUB1, and CDK1 were significantly increased in samples from patients who died during follow-up. The overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly lower than that of patients with low expression levels of these genes. Finally, a survival prediction model was constructed with baseline clinical data and hub gene expression levels. The performance of the model was evaluated, and the results revealed that the prediction model has good discriminating ability, predictive ability, and clinical utility.

Infection with EBV is the primary pathogenic factor of NPC²⁰. EBV mainly exists as a latent infection in NPC and expresses viral proteins, including EBNA1, LMP1, and LMP2, which play crucial roles in the tumorigenesis and development of NPC²⁰. Among these viral proteins, LMP1 is regarded as one of the most important oncogenic proteins²¹. LMP1 can mimic the function of tumor necrosis factor receptor (TNFR) and activate the NF-κB, ERK/MAPK, JNK, JAK-STAT, p38/MAPK, and PI3K/Akt pathways, all of which promote tumor cell survival and proliferation²¹. In addition, LMP1 can regulate the expression of proinflammatory factors such as interleukin-6 (IL-6), IL-8, and macrophage inflammatory protein 1-α²². These proinflammatory factors can recruit T cells and macrophages and significantly affect the tumor microenvironment²². Moreover, proinflammatory factors can also induce the growth, migration, and invasion of tumor cells²². LMP1 also participates in the reprogramming of glycolysis to provide enough energy for the proliferation of tumor cells²³. Infection with EBV is also related to anoikis resistance and immune evasion^11,23. EBNA1 is expressed in most EBV-associated tumors and significantly contributes to the maintenance, replication, and transcription of the viral genome¹¹. In this study, the DEGs of NPC were found to be significantly enriched in biological processes such as DNA replication and the cell cycle. These results further support the hypothesis that EBV infection induces the expression of virus-related oncogenes, which lead to the progression and development of NPC.

Since the DEGs of NPC were significantly related to DNA replication and the cell cycle of tumor cells, the authors speculated that the hub genes of the DEGs might be potential prognostic biomarkers for patients with NPC. AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes, and all the hub genes were upregulated in the tumor group. AURKA and AURKB are members of the serine/threonine kinase family and share a highly conserved catalytic domain containing autophosphorylation sites²⁴. Moreover, both AURKA and AURKB play crucial roles in the cell cycle^24,25. According to the Cancer Genome Atlas (TCGA) UALCAN database, AURKA is expressed in many kinds of tumors, such as rectal adenocarcinoma²⁴. AURKA expression is significantly upregulated in bladder urothelial carcinoma, invasive breast carcinoma, cholangiocarcinoma, and colon adenocarcinoma tissues compared with corresponding normal tissues²⁴. AURKB is also significantly upregulated in a variety of tumors²⁴. The function of AURKB is similar to that of AURKA²⁴. Therefore, the authors focused only on AURKA for further exploration in this study. BUB1 and its paralogue BUB1B are members of the spindle assembly checkpoint (SAC) protein family, both of which can prevent premature mitotic chromosome segregation and reduce aneuploidy²⁶. The interaction between BUB1 and BUB1B is mediated by a conserved N-terminal region. This interaction is important for the localization of the mitotic checkpoint kinetochore²⁶. Mutations in the BUB1 and BUB1B genes have been identified in tumors²⁷. The upregulation of BUB1 can induce the proliferation and invasion of gastric tumor cells via the Wnt/β-catenin signalling pathway, whereas the downregulation of BUB1 can lead to S-phase arrest in liver tumor cells^28,29. Similarly, the upregulation of BUB1B was related to the proliferation of myeloma cells via the CDC20/CCNB signalling pathway³⁰. Considering the similarity in the functions of BUB1 and BUB1B, the authors selected BUB1 for further evaluation. CCNA2 is a member of the Cyclin A family and participates in cell cycle regulation³¹. Studies have suggested that CCNA2 is involved in the occurrence and progression of many types of tumors through the induction of epithelial–mesenchymal transformation and metastasis³². CCNB2 is a member of the cell cycle protein family and primarily controls the G2/M phase transition³³. Many studies have shown that CCNB2 is aberrantly expressed in a variety of tumors, including glioblastoma and non-small cell lung cancer^34,35. In addition, the upregulation of CCNB2 was associated with an accelerated proliferation rate of tumor cells³⁵. CDK1 is a member of the cyclin-dependent kinase family and is a serine/threonine kinase that forms a complex with cyclin proteins to regulate the cell cycle³⁶. In addition, CDK1 is the only CDK in mammals that is necessary for the cell cycle and induces G2/M and G1/S transitions and G1 progression³⁶. The dysregulation of CDK1 leads to unrestricted cell proliferation, which ultimately results in the occurrence of tumors³⁶.

Since these hub genes are closely related to the progression of tumors, the authors speculated that these hub genes might have potential predictive value for the survival of patients with NPC. Therefore, the authors further validated the expression levels of the hub genes in patients. The authors collected tumor samples from 120 patients with NPC. The median follow-up time was 2669 days, and 26 patients died during follow-up. The results of multiplex immunofluorescence showed that the expression levels of AURKA, BUB1, and CDK1 were significantly upregulated in the Death group of patients with NPC. Afterwards, the authors constructed a survival prediction model based on hub genes and baseline clinical data. Gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression were selected to construct the survival prediction model. As mentioned above, the upregulation of BUB1 contributes to the development of tumors. Zhang et al. reported that BUB1 expression is upregulated in endometrial cancer, is significantly related to the infiltration of T cells in the tumor microenvironment, and is correlated with the prognosis of patients with endometrial cancer³⁷. In addition, Chen et al. reported that BUB1 was significantly correlated with the overall survival rate of patients with breast cancer and might be a prognostic biomarker for patients with breast cancer³⁸. Moreover, a bioinformatics study indicated that BUB1 has potential predictive value for the survival of patients with NPC³⁹. Similarly, studies have indicated that AURKA expression can predict the outcome of patients with breast cancer⁴⁰. Studies have also suggested that AURKA might be a potential prognostic biomarker for NPC⁴¹. In our study, we validated the predictive values of BUB1 and AURKA in a clinical cohort. Finally, the performance of the final prediction model was evaluated using ROC curves, AUC, calibration plots, and DCA, and the results revealed that the prediction model has good discriminating ability, predictive ability, and clinical utility.

However, there were several limitations in this study. First, owing to the lack of healthy nasopharyngeal tissue samples, the protein expression levels of the hub genes were not compared between healthy nasopharyngeal tissue and NPC tissue. Second, the sample size of the study population was relatively small, and multicentre studies are needed in future research. In addition, the functions of AURKA and BUB1 in NPC cell lines have not been explored.

Conclusion

In this study, the DEGs of NPC were identified and analysed, and they were found to be significantly enriched in biological processes such as DNA replication. Then, AURKA, AURKB, BUB1, BUB1B, CCNA2, CCNB2, and CDK1 were identified as hub genes. The overall survival rate of patients with high expression levels of AURKA, BUB1, or CDK1 was significantly reduced. Finally, a survival prediction model was constructed with hub genes and clinical data, which had good discriminating ability, predictive ability, and clinical utility.

Methods

Identifying hub genes

The Gene Expression Omnibus database (GEO database, https://www.ncbi.nlm.nih.gov/gds) was searched for gene expression datasets of NPC according to the following criteria: (1) Search term: Nasopharyngeal carcinoma, (2) Top Organisms: Homo sapiens, (3) Study type: Expression profiling by array, (4) Attribute name: Tissue, (5) Sample count: From 6 to 1000, and (6) datasets containing both NPC tissues and normal healthy nasopharyngeal tissues. GSE61218 and GSE126683 met all the criteria. The gene expression datasets of GSE61218 and GSE126683 were subsequently downloaded from the GEO database. The GSE61218 dataset contains six normal healthy nasopharyngeal tissue samples and ten NPC samples. The GSE126683 dataset contains three normal healthy nasopharyngeal tissue samples and three NPC samples. The raw data of these datasets were normalized using the “lumi” package in R software (R Core Team; R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing; Vienna, Austria; https://www.R-project.org/; version 4.1.2). Afterwards, the data were annotated using the “dplyr” and “limma” packages in R software. The batch effect between the datasets was removed using the “sva” package in R software, and the datasets were merged for further analysis.

The differentially expressed genes (DEGs) were identified using the “limma” package in R software with thresholds of p < 0.05 and |log2 fold change| > 1. A protein‒protein interaction network of the DEGs was constructed using the Search Tool for the Retrieval of Interacting Genes online tool (STRING, https://cn.string-db.org/). The minimum required interaction score applied in STRING was medium confidence (0.400). The full STRING network, which included both functional and physical associations, was used in this study. In addition, the active interaction sources included text mining, experiments, databases, coexpression, neighbourhood, gene fusion, and co-occurrence. The protein‒protein interaction network of the DEGs was then visualized using Cytoscape software (version 3.9.1)⁴². The score of each node in the network was calculated using cytoHubba (ver. 0.1), which is a plugin of Cytoscape software⁴³. In this study, the top 20 DEGs were identified using each of the “degree”, “EPC”, “MCC”, and “MNC” methods via cytoHubba. The “degree” method takes the number of directly connected edges of a node as the core indicator to identify key nodes with the most direct connections in the network. The “EPC” method focuses on the anti-interference stability of the network and identifies nodes that can maintain the connectivity of network components. The “MCC” method captures key nodes with both connectivity and local network density advantages by evaluating the central position of nodes in the maximal cliques of the network. The “MNC” method measures the centrality of nodes in their maximum neighbourhood components to explore key nodes that play a leading role in local subnetworks. Finally, the intersection of the four sets of the top 20 genes was taken to obtain the hub genes.
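The final selection step, ranking nodes by several methods and intersecting the top-ranked sets, can be sketched as follows. This Python example is illustrative only: it implements the “degree” ranking directly, while the other rankings are supplied as hypothetical placeholder lists rather than real EPC, MCC, or MNC scores from cytoHubba. Node names, edges, and the cut-off k = 2 (instead of 20) are assumptions chosen to keep the toy network small.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank nodes by number of distinct neighbours (highest first).

    Ties are broken alphabetically so the result is deterministic.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return sorted(neighbours, key=lambda node: (-len(neighbours[node]), node))


def hub_intersection(rankings, k):
    """Nodes that appear in the top k of every ranking."""
    top_sets = [set(ranking[:k]) for ranking in rankings]
    return set.intersection(*top_sets)


# Hypothetical undirected interaction network (not STRING output).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
by_degree = degree_ranking(toy_edges)

# Placeholder rankings standing in for the other scoring methods.
other_ranking_1 = ["B", "A", "C", "D", "E"]
other_ranking_2 = ["A", "C", "B", "D", "E"]

hubs = hub_intersection([by_degree, other_ranking_1, other_ranking_2], k=2)
print(by_degree, hubs)
```

Taking the intersection is a conservative choice: a node is kept only if every ranking method places it near the top, which reduces dependence on any single centrality measure.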

Study population

This study was approved by the Ethics Committee of Xiangya Hospital, Central South University, and was performed in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants. Patients with NPC who underwent radiotherapy at Xiangya Hospital, Central South University, from 2008 to 2013 were included in this study. The exclusion criteria were as follows: age ≤ 18 years or ≥ 80 years, a diagnosis of other tumors, pregnancy, and missing data. NPC samples were obtained by biopsy before treatment. Baseline data and clinical data were recorded. Telephone follow-ups were conducted.

Multiplex immunofluorescence

Formalin-fixed, paraffin-embedded NPC slides were deparaffinized and hydrated. Multiplex immunofluorescence was conducted using an Opal™ 7-Color Manual IHC Kit according to the manufacturer’s recommended procedures. Briefly, the slides were heated in AR buffer using a microwave. After they cooled, the slides were blocked with normal goat serum. The slides were then incubated with the primary antibody overnight. After being rinsed, the slides were incubated with the secondary Polymer HRP Ms + Rb and Opal Fluorophore Working Solution to generate a specific Opal signal for each target. After that, the slides were heated in AR buffer using a microwave to strip the primary–secondary–HRP complex, allowing the introduction of the next primary antibody. Then, the same procedure was repeated, starting with the blocking agent and followed by primary antibody incubation and secondary Polymer HRP Ms + Rb and Opal Fluorophore Working Solution incubation, until specific Opal signals had been generated for all targets. Finally, DAPI Working Solution was applied to the slides, and the slides were then mounted with mounting medium. Images were obtained via Vectra Quantitative Pathology Imaging Systems, and the fluorescence intensity was measured using ImageJ software. The primary antibodies used in this study were as follows: anti-Aurora A (1:200 dilution; #14475; Cell Signaling Technology), anti-BUB1 (1:200 dilution; #94244; Cell Signaling Technology), anti-CDK1 (1:200 dilution; #9116; Cell Signaling Technology), anti-CCNA2 (1:200 dilution; #67955; Cell Signaling Technology), and anti-CCNB2 (1:200 dilution; #12231; Cell Signaling Technology).

Construction and evaluation of prediction model

Age and gender were included in the prediction model as variables. Univariate Cox regression analysis was then used to screen the other variables for the prediction model with a threshold of p < 0.15. Multivariate Cox regression analysis was used to construct a prediction model with gender, age, T stage, N stage, M stage, BUB1 expression, and AURKA expression. The discriminating ability of the prediction model was first evaluated by the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) using the “ROCR” package in R software (version 4.1.2). In addition, the differences between the AUCs of the different models were compared by the DeLong method using the “pROC” package in R software (version 4.1.2). Afterwards, a calibration plot with the “boot” method using 1000 replications was applied to evaluate the calibration of the models using the “rms” package in R software (version 4.1.2). The net reclassification index (NRI) and integrated discrimination improvement index (IDI) were further utilized to evaluate the additional predictive ability of the model after the inclusion of the hub genes by using the “nricens” and “PredictABEL” packages in R software (version 4.1.2). Finally, decision curve analysis (DCA) was applied to analyse the clinical utility of the models using the “rmda” package in R software (version 4.1.2).
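The quantity plotted in a decision curve is the net benefit at a given threshold probability. A minimal Python sketch of the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), is given below for a binary outcome. It is not the “rmda” implementation, censoring is ignored by assumption, and the predicted risks and outcomes are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    predicted_risk : predicted probabilities of the event
    outcome        : 1 if the event occurred, 0 otherwise
    threshold      : threshold probability pt, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    true_pos = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical data for four patients.
outcome = [1, 1, 0, 0]
model_risk = [0.9, 0.8, 0.3, 0.1]   # toy predictions from a model
treat_all_risk = [1.0] * 4          # "treat everyone" reference strategy

nb_model = net_benefit(model_risk, outcome, threshold=0.5)
nb_treat_all = net_benefit(treat_all_risk, outcome, threshold=0.5)
print(nb_model, nb_treat_all)
```

A decision curve repeats this calculation over a range of thresholds and compares each model with the "treat all" and "treat none" (net benefit of zero) strategies.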

Statistical analysis

Statistical analysis was performed using SPSS version 19 (IBM Corporation, Armonk, NY, USA), R software (version 4.1.2), and GraphPad Prism (version 8.0). Continuous data are expressed as means ± standard deviations (SDs). Count data are expressed as frequencies (percentages). Student’s t test was used to compare continuous data with a normal distribution between different groups, and the Mann–Whitney U test was used to compare continuous data with a nonnormal distribution. For count data, the chi-square test was used to compare the difference in frequencies between groups. Cox regression analysis was used to construct the prediction models. The performance of the models was determined using ROC curves, AUC, NRI, IDI, and DCA. Kaplan–Meier curves with the log-rank test were used to compare the survival rates between different groups. A value of p < 0.05 was considered to indicate statistical significance.
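As a brief illustration of the Kaplan–Meier estimator mentioned above, the following Python sketch computes the product-limit survival estimate at each observed death time. It is a textbook version written for clarity, not the software used in the study, and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up times
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    survival = 1.0
    curve = []
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk  # conditional survival past t
        curve.append((t, survival))
    return curve


# Hypothetical follow-up (arbitrary time units): deaths at 1 and 3,
# censored observations at 2 and 4.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0])
print(curve)
```

Censored patients leave the risk set without lowering the curve, which is why the second drop in this toy example is larger than the first.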

Data availability

The gene expression datasets GSE61218 and GSE126683 are available in the GEO repository (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE61218 and https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE126683). To request data from this study, please contact the corresponding author, Yubin Chen.

References

Tang, L. L. et al. Validation of the 8th edition of the UICC/AJCC staging system for nasopharyngeal carcinoma from endemic areas in the intensity-modulated radiotherapy era. J. Natl. Compr. Canc Netw. 15(7), 913–919 (2017).
Article PubMed Google Scholar
Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68(6), 394–424 (2018).
PubMed Google Scholar
Chen, Y. P. et al. Nasopharyngeal carcinoma. Lancet 394(10192), 64–80 (2019).
Article PubMed Google Scholar
Tang, L. L. et al. Global trends in incidence and mortality of nasopharyngeal carcinoma. Cancer Lett. 374(1), 22–30 (2016).
Article ADS CAS PubMed Google Scholar
Wei, K. R. et al. Epidemiological trends of nasopharyngeal carcinoma in China. Asian Pac. J. Cancer Prev. 11(1), 29–32 (2010).
PubMed Google Scholar
Luo, W. Nasopharyngeal carcinoma ecology theory: cancer as multidimensional spatiotemporal unity of ecology and evolution pathological ecosystem. Theranostics 13(5), 1607–1631 (2023).
Article CAS PubMed PubMed Central Google Scholar
Li, W. et al. Immunotherapeutic approaches in EBV-associated nasopharyngeal carcinoma. Front. Immunol. 13, 1079515 (2022).
Article CAS PubMed Google Scholar
Tsang, C. M., Lui, V. W. Y., Bruce, J. P., Pugh, T. J. & Lo, K. W. Translational genomics of nasopharyngeal cancer. Semin Cancer Biol. 61, 84–100 (2020).
Article CAS PubMed Google Scholar
Dai, W. et al. Whole-exome sequencing identifies MST1R as a genetic susceptibility gene in nasopharyngeal carcinoma. Proc. Natl. Acad. Sci. U S A. 113(12), 3317–3322 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Li, Y. Y. et al. Exome and genome sequencing of nasopharynx cancer identifies NF-κB pathway activating mutations. Nat. Commun. 8, 14121 (2017).
Article ADS CAS PubMed PubMed Central Google Scholar
Hau, P. M. et al. Targeting Epstein-Barr virus in nasopharyngeal carcinoma. Front. Oncol. 10, 600 (2020).
Article PubMed PubMed Central Google Scholar
Wong, K. C. W. et al. Nasopharyngeal carcinoma: an evolving paradigm. Nat. Rev. Clin. Oncol. 18(11), 679–695 (2021).
Article CAS PubMed Google Scholar
Peng, G. et al. A prospective, randomized study comparing outcomes and toxicities of intensity-modulated radiotherapy vs. conventional two-dimensional radiotherapy for the treatment of nasopharyngeal carcinoma. Radiother Oncol. 104(3), 286–293 (2012).
Article PubMed Google Scholar
Co, J., Mejia, M. B. & Dizon, J. M. Evidence on effectiveness of intensity-modulated radiotherapy versus 2-dimensional radiotherapy in the treatment of nasopharyngeal carcinoma: Meta-analysis and a systematic review of the literature. Head Neck. 38(Suppl 1), E2130–E2142 (2016).
PubMed Google Scholar
Mao, Y. P. et al. Prognostic factors and failure patterns in non-metastatic nasopharyngeal carcinoma after intensity-modulated radiotherapy. Chin. J. Cancer. 35(1), 103 (2016).
Article PubMed PubMed Central Google Scholar
Zhang, B. et al. Intensity-modulated radiation therapy versus 2D-RT or 3D-CRT for the treatment of nasopharyngeal carcinoma: A systematic review and meta-analysis. Oral Oncol. 51(11), 1041–1046 (2015).
Article PubMed Google Scholar
Colevas, A. D. et al. NCCN Guidelines® insights: head and neck cancers, version 2.2025. J. Natl. Compr. Canc Netw. 23(2), 2–11 (2025).
Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. 53(D1), D672–D677 (2025).
Kanehisa, M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 28(11), 1947–1951 (2019).
Young, L. S., Yap, L. F. & Murray, P. G. Epstein-Barr virus: more than 50 years old and still providing surprises. Nat. Rev. Cancer. 16(12), 789–802 (2016).
Shair, K. H. Y., Reddy, A. & Cooper, V. S. New insights from elucidating the role of LMP1 in nasopharyngeal carcinoma. Cancers. 10(4) (2018).
Yi, M. et al. Rediscovery of NF-κB signaling in nasopharyngeal carcinoma: how genetic defects of NF-κB pathway interplay with EBV in driving oncogenesis? J. Cell. Physiol. 233(8), 5537–5549 (2018).
Lo, A. K., Dawson, C. W., Lung, H. L., Wong, K. L. & Young, L. S. The role of EBV-encoded LMP1 in the NPC tumor microenvironment: from function to therapy. Front. Oncol. 11, 640207 (2021).
Du, R., Huang, C., Liu, K., Li, X. & Dong, Z. Targeting AURKA in cancer: molecular mechanisms and opportunities for cancer therapy. Mol. Cancer. 20(1), 15 (2021).
Otto, T. & Sicinski, P. Cell cycle proteins as promising targets in cancer therapy. Nat. Rev. Cancer. 17(2), 93–115 (2017).
Skowyra, A., Allan, L. A., Saurin, A. T. & Clarke, P. R. USP9X limits mitotic checkpoint complex turnover to strengthen the spindle assembly checkpoint and guard against chromosomal instability. Cell. Rep. 23(3), 852–865 (2018).
Kim, T. & Gartner, A. Bub1 kinase in the regulation of mitosis. Anim. Cells Syst. (Seoul). 25(1), 1–10 (2021).
Grabsch, H. et al. Overexpression of the mitotic checkpoint genes BUB1, BUBR1, and BUB3 in gastric cancer–association with tumour cell proliferation. J. Pathol. 200(1), 16–22 (2003).
Qiu, J. et al. BUB1B promotes hepatocellular carcinoma progression via activation of the mTORC1 signaling pathway. Cancer Med. 9(21), 8159–8172 (2020).
Zhou, X. et al. BUB1B (BUB1 Mitotic Checkpoint Serine/Threonine Kinase B) promotes lung adenocarcinoma by interacting with zinc finger protein ZNF143 and regulating Glycolysis. Bioengineered 13(2), 2471–2485 (2022).
He, Q. et al. Smoking-induced CCNA2 expression promotes lung adenocarcinoma tumorigenesis by boosting AT2/AT2-like cell differentiation. Cancer Lett. 592, 216922 (2024).
Ershov, P., Poyarkov, S., Konstantinova, Y., Veselovsky, E. & Makarova, A. Transcriptomic signatures in colorectal cancer progression. Curr. Mol. Med. 23(3), 239–249 (2023).
Hu, M. et al. Knockdown of CCNB2 inhibits the tumorigenesis of gastric cancer by regulation of the PI3K/Akt pathway. Sci. Rep. 15(1), 5703 (2025).
Takashima, S. et al. Strong expression of cyclin B2 mRNA correlates with a poor prognosis in patients with non-small cell lung cancer. Tumour Biol. 35(5), 4257–4265 (2014).
Wang, D. et al. CCNB2 is a novel prognostic factor and a potential therapeutic target in low-grade glioma. Biosci. Rep. 42(1) (2022).
Wang, Q., Bode, A. M. & Zhang, T. Targeting CDK1 in cancer: mechanisms and implications. NPJ Precis Oncol. 7(1), 58 (2023).
Zhang, H., Li, Y. & Lu, H. Correlation of BUB1 and BUB1B with the development and prognosis of endometrial cancer. Sci. Rep. 14(1), 17084 (2024).
Chen, D. L., Cai, J. H. & Wang, C. C. N. Identification of key prognostic genes of triple negative breast cancer by LASSO-Based machine learning and bioinformatics analysis. Genes. 13(5) (2022).
Liu, K., Kang, M., Zhou, Z., Qin, W. & Wang, R. Bioinformatics analysis identifies hub genes and pathways in nasopharyngeal carcinoma. Oncol. Lett. 18(4), 3637–3645 (2019).
Kahl, I. et al. The cell cycle-related genes RHAMM, AURKA, TPX2, PLK1, and PLK4 are associated with the poor prognosis of breast cancer patients. J. Cell. Biochem. 123(3), 581–600 (2022).
Jiang, D. et al. AURKA, as a potential prognostic biomarker, regulates autophagy and immune infiltration in nasopharyngeal carcinoma. Immunobiology 228(2), 152314 (2023).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13(11), 2498–2504 (2003).
Chin, C. H. et al. CytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8(Suppl 4), S11 (2014).


Acknowledgements

The authors thank Yangjie Zhou from Xiangya Hospital, Central South University, for help and advice during this study.

Funding

This study was funded by the National Natural Science Foundation of China (82370642 and 82501919) and the Natural Science Foundation of Hunan (2024JJ5611 and 2025JJ60533).

Author information

Jinfei Zhu and Yizhuo Feng contributed equally to this work.

Authors and Affiliations

Department of Cardiology, Xiangya Hospital, Central South University, Changsha, Hunan, China
Jinfei Zhu
Department of Endocrinology, Xiangya Hospital, Central South University, Changsha, Hunan, China
Yizhuo Feng & Can-e Tang
The Institute of Medical Science Research, Xiangya Hospital, Central South University, Changsha, Hunan, China
Zhanwei Zhu & Can-e Tang
Department of Cardiovascular Surgery, Xiangya Hospital, Central South University, Changsha, Hunan, China
Yubin Chen
National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
Can-e Tang

Authors

Jinfei Zhu
Yizhuo Feng
Zhanwei Zhu
Yubin Chen
Can-e Tang

Contributions

JFZ and YZF conducted the experiments, performed the data analysis, prepared the figures, and wrote the manuscript. ZWZ conducted the bioinformatics analysis. YBC and CET designed the study and reviewed the manuscript.

Corresponding authors

Correspondence to Yubin Chen or Can-e Tang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1 (download DOCX)

Supplementary Material 2 (download DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Zhu, J., Feng, Y., Zhu, Z. et al. Identification of hub genes and construction of a survival prediction model for patients with nasopharyngeal carcinoma. Sci Rep 16, 5299 (2026). https://doi.org/10.1038/s41598-026-36395-4


Received: 19 November 2025
Accepted: 12 January 2026
Published: 16 January 2026
Version of record: 06 February 2026
DOI: https://doi.org/10.1038/s41598-026-36395-4
