Introduction

Cancer is one of the leading causes of death worldwide, and treatment success rates remain low in both developed and developing countries1. Moreover, the incidence of cancer has been rising, imposing a considerable economic burden and a challenge for health policy2. Despite recent improvements in cancer detection and therapy, cancer continues to pose a severe risk to public health3. Therefore, to improve cancer diagnosis and treatment, it is crucial to elucidate the mechanisms underlying cancer development and to identify biomarkers for targeted therapy. Recent studies have shown that cancer is a complex disease arising from genetic and epigenetic alterations together with environmental factors. These alterations lead to abnormal cellular proliferation, resistance to apoptosis, and invasion of nearby tissues. However, the precise mechanisms underlying these alterations remain unclear, and identifying effective biomarkers for early diagnosis and personalized treatment is challenging.

DNA topoisomerases are enzymes that regulate and modify the topological structure of DNA during transcription4. They are classified into two types, type I and type II, according to whether they create transient single-stranded or double-stranded breaks in DNA; together they form a family of six enzymes4,5,6. DNA topoisomerase IIα (TOP2A) is a vital isoform of type II topoisomerases that plays a crucial role in various cellular activities7. Encoded by the TOP2A gene, TOP2A primarily operates by cleaving and rejoining DNA strands, thereby facilitating mitosis, regulating DNA topology, and controlling DNA supercoiling7,8. Its ability to temporarily break both strands of DNA produces transient double-strand breaks (DSBs), which are critical for eliminating overwinding and relieving torsional stress9. Its functions are not limited to DNA replication but extend to DNA repair, recombination, and transcriptional regulation10. The expression of TOP2A is elevated in proliferating cells to ensure the fidelity of genome replication, particularly during the S phase when DNA replication occurs11. In addition, it releases the superhelical stress that arises from DNA replication or transcription by forming a cleavage complex, unwinding the DNA, and religating the strands. Because it resolves diverse topological problems, TOP2A is a critical guardian of stable genome expression, and its dysregulation has been implicated in a range of developmental and neurological disorders, as well as cancer12,13,14.

Moreover, TOP2A has been shown to modulate the expression of various genes and influence chromatin organization. These diverse functions make TOP2A crucial for maintaining normal cellular processes, and aberrations in its expression or activity can have severe consequences and lead to various diseases10. Emerging evidence highlights a significant interplay between TOP2A and DNA methylation. DNA methylation, an essential epigenetic modification, regulates gene expression through the addition of methyl groups to cytosines, notably at CpG islands in promoter regions. Aberrant DNA methylation patterns, such as hypermethylation or hypomethylation, have been shown to affect TOP2A expression and activity, contributing to cancer progression and poor patient outcomes. Specifically, DNA hypermethylation at regulatory regions of tumor suppressor genes can lead to their silencing, indirectly amplifying the oncogenic role of TOP2A. Conversely, hypomethylation in the promoters of TOP2A-related genes may promote TOP2A overexpression, fostering uncontrolled proliferation and metastasis.

Consequently, TOP2A dysregulation has been implicated in a multitude of cancers. The anti-cancer effects of TOP2A inhibitors, such as doxorubicin, teniposide, idarubicin, and mitoxantrone, are rooted in their ability to target these mechanisms15,16,17,18. In this study, we performed a comprehensive analysis of TOP2A using pan-cancer multi-omics data from The Cancer Genome Atlas (TCGA) and other publicly available datasets, evaluating differential expression patterns and long-term survival outcomes. Our findings revealed that genomic alterations and epigenetic modifications of TOP2A drive its aberrant expression, influencing the prognosis of cancer patients. Additionally, TOP2A expression levels were significantly correlated with molecular subtypes, clinicopathologic stages, immune subtypes, regulation of cancer-related pathways, and prognostic outcomes across various cancers.

Materials and methods

Data sources

To investigate the gene expression of TOP2A in various normal tissues from healthy individuals, we downloaded TPM (Transcripts Per Million)-normalized RNAseq data (data release version 7.0) from the Genotype-Tissue Expression (GTEx) portal (https://commonfund.nih.gov/GTEx/). The dataset consisted of 11,688 samples representing the expression profiles of 56,202 genes across 30 organs (53 tissues) from 714 healthy individuals. For the multi-omics pan-cancer analysis, datasets were obtained from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/), which included mRNA sequencing level 3 data (n = 10,995), clinical data (n = 11,160), Illumina HumanMethylation 450k level 3 data (n = 10,129), single nucleotide variant (SNV) data (n = 10,234), and copy number variation (CNV) data (n = 11,495).

Our study comprised normal tissues from 30 organs in the GTEx dataset and 33 cancer types from the TCGA dataset. The normal tissues included: adipose tissue, adrenal gland, bladder, blood, blood vessel, brain, breast, cervix uteri, colon, esophagus, fallopian tube, heart, kidney, liver, lung, muscle, nerve, ovary, pancreas, pituitary, prostate, salivary gland, skin, small intestine, spleen, stomach, testis, thyroid, uterus, and vagina. The cancer types analyzed were: adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), colon adenocarcinoma (COAD), glioblastoma multiforme (GBM), acute myeloid leukemia (LAML), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), cholangiocarcinoma (CHOL), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), thyroid carcinoma (THCA), prostate adenocarcinoma (PRAD), esophageal carcinoma (ESCA), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thymoma (THYM), uterine carcinosarcoma (UCS), uterine corpus endometrial carcinoma (UCEC), and uveal melanoma (UVM). To make the GTEx and TCGA expression data comparable, we converted them uniformly to TPM (Transcripts Per Million) or RPKM (Reads Per Kilobase of transcript per Million mapped reads) units. Batch effects between the two datasets can be removed using ComBat.
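The unit conversion mentioned above can be illustrated with the standard relationship between the two units: within one sample, TPM is the RPKM value rescaled so that all genes sum to one million. The sketch below is only an illustration of that identity; the function name and the toy values are assumptions, and the exact preprocessing pipeline used for the GTEx and TCGA data is not specified in the text.

```python
def rpkm_to_tpm(rpkm_values):
    """Rescale one sample's RPKM values so that they sum to one million.

    Illustrative helper (not from the original study): TPM_i = RPKM_i / sum(RPKM) * 1e6.
    """
    total = sum(rpkm_values)
    if total <= 0:
        raise ValueError("RPKM values must sum to a positive number")
    return [value / total * 1_000_000 for value in rpkm_values]


# Toy sample with three hypothetical genes.
sample_rpkm = [10.0, 30.0, 60.0]
sample_tpm = rpkm_to_tpm(sample_rpkm)
```

Because the rescaling is done per sample, TPM values are comparable across samples in a way that raw RPKM values are not; batch-effect correction (for example with ComBat) is a separate step applied afterwards.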

Differential expression analysis of TOP2A

The ONCOMINE database (http://www.oncomine.org) is one of the largest oncogene microarray databases and comprehensive data-mining platforms in the world, integrating RNA and DNA-seq data from GEO, TCGA, and published sources. In ONCOMINE, we analyzed TOP2A mRNA expression in different types of cancer using a P-value of 0.001 and a fold change of 1.5 as thresholds19. In addition, for the 33 TCGA cancer types, we compared expression levels between cancer samples and their matched normal samples, as well as with normal-tissue data from GTEx. To assess expression differences, we set the following thresholds: a fold change (FC) > 2 and an adjusted P-value (FDR) < 0.05. Cancer types that did not meet these criteria were excluded from the study.
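As a minimal sketch of the filtering rule described above (FC > 2 and FDR < 0.05), the following code adjusts a list of P-values and applies both thresholds. It assumes the FDR adjustment is the Benjamini-Hochberg procedure, which the text does not state explicitly, and it takes the raw P-values as given; the function names and the toy numbers are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so that monotonicity is enforced.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def passes_thresholds(mean_tumor, mean_normal, fdr, min_fc=2.0, max_fdr=0.05):
    """Apply the two cut-offs literally: fold change > 2 and FDR < 0.05."""
    fold_change = mean_tumor / mean_normal
    return fold_change > min_fc and fdr < max_fdr


# Hypothetical raw P-values for four cancer types.
raw_p = [0.001, 0.02, 0.03, 0.5]
fdr = benjamini_hochberg(raw_p)
keep_first = passes_thresholds(mean_tumor=8.0, mean_normal=2.0, fdr=fdr[0])
keep_weak_fc = passes_thresholds(mean_tumor=3.0, mean_normal=2.0, fdr=fdr[0])
```

Note that the fold change here is computed on linear-scale means; if expression values are log2-transformed, the equivalent cut-off is a difference of more than 1 between group means.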

Survival prognosis of TOP2A

For survival analyses, we employed the Kaplan–Meier method and log-rank test in each cancer type. We used the R packages “survival” and “survminer” to generate survival curves. To investigate the prognostic significance of TOP2A in pan-cancer, we performed univariate Cox regression analysis. We selected overall survival (OS), progression-free survival (PFS), disease-free survival (DFS), and disease-specific survival (DSS) to examine the association between TOP2A expression and prognosis. We considered a P-value ≤ 0.05 in the Kaplan-Meier log-rank test to be statistically significant.
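To make the Kaplan–Meier method concrete, the sketch below computes the product-limit survival estimate for one group of patients in plain Python. It is an illustration of the estimator only, not the authors' code (the analyses in this study used the R packages named above); the function name and the toy follow-up data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators for five patients.
km_curve = kaplan_meier(times=[5, 8, 8, 12, 15], events=[1, 1, 0, 1, 0])
```

In a high-versus-low expression comparison, this estimate would be computed separately for each group and the two curves compared with the log-rank test.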

Methylation analysis

We performed differential methylation analysis on 33 cancer types. The P-value was calculated using a t-test and was adjusted for FDR. We considered genes with an FDR < 0.05 to have significant methylation differences. We performed an integrative analysis of methylation and mRNA expression data using TCGA barcodes to examine the correlation between methylation levels and corresponding mRNA expression. Spearman correlation analysis was applied to assess the relationship between the paired mRNA expression and methylation levels. P-values were adjusted for false discovery rate (FDR), and genes with an FDR < 0.05 were considered significantly influenced by genome methylation.
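The Spearman correlation used for the paired methylation and expression values is simply the Pearson correlation of the ranks. The sketch below shows this for one hypothetical set of paired samples; the helper names and the toy Beta-values and expression values are assumptions, and ties are handled with average ranks.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired samples: methylation Beta-values and log2 expression.
beta_values = [0.82, 0.75, 0.40, 0.31, 0.15]
expression = [1.2, 2.5, 4.8, 6.1, 7.9]
rho = spearman(beta_values, expression)
```

A negative coefficient, as in this toy example, corresponds to the pattern in which higher methylation accompanies lower expression.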

CNV analysis

To perform CNV analysis, we calculated the percentage of samples with TOP2A CNV in each cancer type. CNVs were categorized as heterozygous or homozygous, each comprising amplifications and deletions. Heterozygous CNV indicates a CNV on only one chromosome, whereas homozygous CNV indicates a CNV on both chromosomes. The percentages of CNV subtypes were based on CNV data processed with GISTIC 2.0. We matched mRNA expression data and raw CNV data by TCGA barcode for correlation analysis, and assessed the association between paired mRNA expression and CNV values using Pearson’s product-moment correlation coefficient, with significance evaluated against the t-distribution. P-values were adjusted for FDR.
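The per-cancer CNV percentages can be sketched as a simple tally over thresholded copy-number calls. The mapping below assumes GISTIC-style integer calls (-2, -1, 0, 1, 2) and assumes that calls of ±1 correspond to heterozygous and ±2 to homozygous events; both the mapping and the names are illustrative assumptions rather than a description of the exact processing used here.

```python
from collections import Counter

# Assumed mapping from thresholded copy-number calls to CNV categories.
CNV_LABELS = {
    -2: "homozygous deletion",
    -1: "heterozygous deletion",
    0: "no change",
    1: "heterozygous amplification",
    2: "homozygous amplification",
}


def cnv_percentages(calls):
    """Return the percentage of samples in each CNV category."""
    counts = Counter(calls)
    total = len(calls)
    return {
        label: 100.0 * counts.get(code, 0) / total
        for code, label in CNV_LABELS.items()
    }


# Hypothetical thresholded calls for one gene in ten tumor samples.
toy_calls = [0, 1, 1, -1, 2, 0, 0, 1, 0, 0]
toy_percentages = cnv_percentages(toy_calls)
```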

SNV analysis

The SNV analysis considered seven types of deleterious mutations: missense mutations, nonsense mutations, frameshift insertions, splice-site mutations, frameshift deletions, in-frame deletions, and in-frame insertions. A waterfall plot and a summary plot were generated using the R package maftools. The SNV percentage (frequency of deleterious mutations) of each gene was calculated as the number of mutated samples divided by the number of cancer samples. We integrated SNV data with clinical survival data by matching TCGA sample barcodes.
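The SNV percentage formula (number of mutated samples divided by number of cancer samples) can be sketched as follows. The set of deleterious classes uses MAF-style Variant_Classification labels and is an assumption for illustration, as are the function name, the record layout, and the toy sample identifiers.

```python
# Assumed set of deleterious variant classes (MAF-style labels).
DELETERIOUS_CLASSES = {
    "Missense_Mutation",
    "Nonsense_Mutation",
    "Frame_Shift_Ins",
    "Frame_Shift_Del",
    "Splice_Site",
    "In_Frame_Ins",
    "In_Frame_Del",
}


def snv_percentage(mutations, all_samples, gene):
    """Percentage of samples carrying at least one deleterious SNV in `gene`.

    mutations   : iterable of (sample_id, gene_symbol, variant_class)
    all_samples : every tumor sample of the cancer type, mutated or not
    """
    sample_set = set(all_samples)
    mutated = {
        sample
        for sample, symbol, variant_class in mutations
        if symbol == gene and variant_class in DELETERIOUS_CLASSES
    }
    # A sample with several mutations in the gene is counted once.
    return 100.0 * len(mutated & sample_set) / len(sample_set)


# Hypothetical mutation records for five samples.
toy_samples = ["S1", "S2", "S3", "S4", "S5"]
toy_mutations = [
    ("S1", "TOP2A", "Missense_Mutation"),
    ("S1", "TOP2A", "Nonsense_Mutation"),
    ("S2", "TOP2A", "Silent"),
    ("S3", "TP53", "Missense_Mutation"),
    ("S4", "TOP2A", "Frame_Shift_Del"),
]
toy_frequency = snv_percentage(toy_mutations, toy_samples, "TOP2A")
```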

Correlation analysis of TOP2A and cancer markers

We conducted a correlation analysis between TOP2A expression and cancer markers. Stem cell indices based on the transcriptome of each TCGA HCC sample were downloaded from the UCSC Xena browser and labelled as RNAss. We used the inferHeterogeneity function of the maftools package to calculate the Mutant-Allele Tumor Heterogeneity (MATH) score. Neoantigen, ploidy, homologous recombination deficiency (HRD), and loss of heterozygosity (LOH) data were obtained from a previous study20. Microsatellite instability (MSI) is a form of genomic instability that arises from defects in the DNA mismatch repair (MMR) system and has been implicated in the development and progression of various types of cancer. We integrated these data with TOP2A expression in pan-cancer.
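For orientation, the MATH score is commonly defined as 100 times the median absolute deviation (MAD) of the variant allele frequencies (VAFs) divided by their median, with the MAD scaled by the constant 1.4826. The sketch below implements that definition directly; it is not the maftools implementation, which may apply additional filtering or clustering, and the function name and toy VAFs are assumptions.

```python
from statistics import median


def math_score(vafs):
    """MATH = 100 * MAD / median, with MAD scaled by 1.4826."""
    center = median(vafs)
    deviations = [abs(v - center) for v in vafs]
    mad = 1.4826 * median(deviations)
    return 100.0 * mad / center


# Hypothetical variant allele frequencies from one tumor.
toy_vafs = [0.10, 0.20, 0.30, 0.40, 0.50]
toy_math = math_score(toy_vafs)
```

A tumor whose mutations all share nearly the same VAF yields a score near zero, whereas a wide spread of VAFs, suggesting multiple subclones, yields a higher score.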

Immune subtype analysis and pathway activity analysis

Immune subtype data were obtained from the UCSC Xena Browser (http://xena.ucsc.edu/). Based on previous studies, we divided TCGA patients into 6 immune groups21. To perform pathway activity analysis, we obtained relative protein levels from TCPA RPPA data, which were median-centered and normalized by standard deviation across all samples for each component. The pathway score was then calculated as the sum of the relative protein levels of all positive regulatory components minus that of negative regulatory components in a particular pathway. We calculated the pathway activity score (PAS) for 10 well-known cancer-related pathways (RAS/MAPK, Hormone ER, PI3K/AKT, Hormone AR, EMT, TSC/mTOR, RTK, DNA Damage Response, Cell Cycle, and Apoptosis pathways). Tumor samples were divided into two groups (high and low) based on median mRNA expression, and we determined the PAS difference between the two groups using the Student t-test. If PAS (gene X high group) > (gene X low group), gene X was considered to activate the pathway; otherwise, it would inhibit the pathway. A PAS with FDR < 0.05 indicated a significant effect on the pathway.
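The pathway score and the high-versus-low comparison described above can be sketched in a few lines. The protein names, the choice to place samples at the median in the low group, and the function names are all illustrative assumptions; the significance test (Student's t-test with FDR adjustment) is omitted here.

```python
from statistics import mean, median


def pathway_score(protein_levels, positive, negative):
    """Sum of positive regulators minus sum of negative regulators."""
    return (
        sum(protein_levels[p] for p in positive)
        - sum(protein_levels[p] for p in negative)
    )


def pas_difference(expression, scores):
    """Mean pathway score of the high-expression group minus the low group.

    expression : {sample_id: mRNA expression of the gene of interest}
    scores     : {sample_id: pathway activity score}
    Samples are split at the median expression (ties go to the low group).
    """
    cutoff = median(expression.values())
    high = [scores[s] for s, e in expression.items() if e > cutoff]
    low = [scores[s] for s, e in expression.items() if e <= cutoff]
    return mean(high) - mean(low)


# Hypothetical normalized protein levels for one sample.
toy_levels = {"protein_A": 1.0, "protein_B": 0.5, "protein_C": -0.25}
toy_score = pathway_score(toy_levels, ["protein_A", "protein_B"], ["protein_C"])

# Hypothetical expression values and pathway scores for four samples.
toy_expression = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0}
toy_scores = {"s1": 0.1, "s2": 0.3, "s3": 0.9, "s4": 1.1}
toy_diff = pas_difference(toy_expression, toy_scores)
```

A positive difference corresponds to the "activating" case in the rule above, and a negative difference to the "inhibiting" case.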

Plasmid transfection and flow cytometric analysis

HepG2 cells were purchased from ATCC (American Type Culture Collection). In this study, the lentiviral vectors pGCSIL-shRNA-TOP2A and pGC-FU-TOP2A cDNA were used to knock down TOP2A gene expression. The shRNA sequence used for knockdown was 5ʹ-ATCCTGCAGGAATGGCATT-3ʹ, and a non-silencing shRNA sequence was used as a negative control (5ʹ-TTCTCCGAACGTGTCACGT-3ʹ). After transfection, cells were plated in 12-well plates and cultured for 24 h. The cells were then trypsinized and collected by centrifugation. The permeabilization and DNA staining solutions were added to the cells according to the kit instructions, and the stained cells were analyzed by FACS using excitation and emission wavelengths of 535 and 615 nm, respectively. The percentages of cells in each cell cycle phase were analyzed using ModFit LT™ 3.1 software. The experiment was performed in triplicate.

Statistical analysis

The gene expression data were all normalized by log2 transformation. Normal tissue and cancer tissue were compared by t-test; P < 0.05 indicated statistical significance. Throughout this study, Kaplan-Meier curves, log-rank tests, and Cox proportional hazards regression models were used. In the correlation analysis between the two variables, Spearman’s or Pearson’s tests were used; significant values were defined as P < 0.05. R (Version 4.0.2) was used for all statistical analyses.

Results

Pan-cancer expression landscape of TOP2A

To investigate the role of TOP2A across various cancers, we conducted an analysis of TOP2A expression. Initially, we used ONCOMINE to examine TOP2A mRNA expression levels across different cancer types. Our analysis revealed that, compared to normal tissues, TOP2A expression was significantly elevated in several cancer types, including bladder, brain and CNS, cervical, colorectal, esophageal, gastric, head and neck, kidney, liver, lung, ovarian, prostate, sarcoma, leukemia, and lymphoma. Conversely, lower TOP2A expression was observed in leukemia, melanoma, myeloma, pancreatic, and breast cancer datasets (as depicted in Fig. 1A). In addition, using the TCGA dataset, we found that TOP2A was upregulated in several cancer types, such as BLCA, BRCA, CHOL, COAD, ESCA, HNSC, KIRC, KIRP, LIHC, LUAD, LUSC, PRAD, READ, STAD, THCA, and UCEC, when compared to adjacent normal tissues (as shown in Fig. 1B). Furthermore, in the GTEx and TCGA datasets, TOP2A mRNA expression was upregulated in many cancer types, including ACC, BLCA, BRCA, CESC, CHOL, COAD, DLBC, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LAML, LGG, LIHC, LUAD, LUSC, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, THCA, THYM, UCEC, and UCS (as represented in Fig. 1C).

Fig. 1

TOP2A expression levels in cancers. (A) The expression of TOP2A was analyzed in various cancer tissues using ONCOMINE and compared with its expression in normal tissues. The number in each cell represents the number of datasets meeting the thresholds. (B) A TCGA dataset was used to compare the levels of TOP2A in paired tumor and normal tissues across different cancers. (C) Comparison of TOP2A levels between tumor and normal tissues across cancer types in the combined TCGA and GTEx datasets.

Impact of TOP2A on pan-cancer prognosis

We aimed to evaluate the impact of TOP2A expression on cancer prognosis across multiple cancer types. Our analysis indicated that increased TOP2A expression was significantly associated with poor overall survival (OS), progression-free interval (PFI), disease-specific survival (DSS), and disease-free interval (DFI) in all cancers combined (P < 0.001), as presented in Fig. 2A–D. In the univariate Cox regression analyses, high TOP2A expression was a risk factor (HR > 1) for OS in several cancer types, namely UVM, PCPG, PAAD, MESO, LUAD, LIHC, LGG, KIRP, KIRC, and KICH, whereas it was a favorable factor (HR < 1) in THYM and CESC, as illustrated in the forest plot (Fig. 2E). Similarly, high TOP2A expression was a risk factor for PFI in UVM, THCA, SARC, PRAD, PCPG, PAAD, MESO, LUAD, LIHC, LGG, KIRP, KIRC, and KICH (Fig. 2F), and for DSS in UVM, PCPG, PAAD, MESO, LUAD, LIHC, LGG, KIRP, KIRC, and KICH (Fig. 2G). Finally, high TOP2A expression was a risk factor for DFI in THCA, SARC, PRAD, PAAD, LIHC, and KIRP (Fig. 2H).

Fig. 2

Impact of TOP2A on pan-cancer prognosis. (A–D) In all cancers combined, increased expression of TOP2A was significantly associated with poor overall survival (OS), progression-free interval (PFI), disease-specific survival (DSS), and disease-free interval (DFI) (P < 0.001). (E–H) Forest plots of the univariate Cox regression analyses of TOP2A expression in pan-cancer for OS (E), PFI (F), DSS (G), and DFI (H). A hazard ratio (HR) greater than 1 indicated a risk factor, whereas an HR less than 1 indicated a favorable factor.

Correlation of TOP2A expression with DNA methylation

DNA methylation is a critical epigenetic mechanism that regulates gene expression by adding a methyl group to DNA, thereby influencing gene accessibility and activity and, in turn, cell function and development. To further explore the role of DNA methylation in regulating TOP2A across various cancers, we performed a detailed analysis. Our findings indicate that in BLCA, BRCA, LIHC, LUAD, PRAD, and UCEC, the Beta-value of TOP2A was significantly lower in tumors than in the normal group (P < 0.05). Conversely, in HNSC and THCA, the Beta-value of TOP2A was significantly higher than in the normal group (P < 0.05) (illustrated in Fig. 3A). Additionally, we observed negative correlations between TOP2A expression and promoter methylation levels in almost all cancers, with the five strongest correlations found in OV, GBM, UCS, DLBC, and STAD (as shown in Fig. 3B). These results suggest that hypermethylation of the TOP2A promoter region may suppress its expression in normal tissues, while hypomethylation in cancer cells could contribute to the overexpression of TOP2A, which has previously been linked to cancer progression and poor prognosis in various cancers. This dysregulation of methylation patterns could lead to uncontrolled cell division and genome instability, both of which are hallmark features of cancer.

Fig. 3

Correlation of TOP2A Expression With DNA Methylation. (A) Comparison of TOP2A DNA methylation levels in tumor and normal groups in different cancer types. (B) Correlation between DNA methylation and TOP2A in different cancer types.

Correlation of TOP2A expression with CNV

Gene copy number variation (CNV) is a type of genomic structural variation in which the number of copies of a gene or genes differs from that found in the normal population. CNV can affect gene expression and function, which in turn impacts cell growth, differentiation, and development. Fig. 4A shows the distribution of TOP2A CNV types across various cancer types. For example, in KICH, all samples with CNV showed heterozygous deletion, while in KIRP, all samples with CNV showed heterozygous amplification. To investigate the association between TOP2A CNV and mRNA expression in pan-cancer, a Spearman correlation analysis was conducted. The results showed a significant positive correlation between TOP2A CNV and mRNA expression in SKCM, HNSC, LIHC, PCPG, GBM, BLCA, CESC, UCEC, BRCA, OV, COAD, TGCT, LUAD, STAD, ESCA, CHOL, LUSC, and UCS, as presented in Fig. 4B.

Fig. 4

Correlation of TOP2A expression with CNV. (A) Distribution of TOP2A CNV types across various cancer types. (B) Correlation between CNV and TOP2A expression in different cancer types.

SNV, TMB and neoantigen analysis of TOP2A in pan-cancer

A single nucleotide variation (SNV) is a change of a single nucleotide in the genome, often caused by environmental, genetic or other unknown factors. Tumor mutation burden (TMB) is the number of mutations of all types present in cancer cells, including SNVs, small insertions and small deletions, and is often considered an indicator of response to cancer immunotherapy. Neoantigens are peptides resulting from SNVs and other mutation types that are recognized as foreign antigens and elicit a response from the immune system. To investigate the SNVs of TOP2A, we used lollipop diagrams to show TOP2A Missense_Mutation, In_Frame_Ins, Frame_Shift_Ins, Frame_Shift_Del, Nonsense_Mutation, Splice_Site, Translation_Start_Site and In_Frame_Del in different cancer types (Fig. 5A). Subsequently, we investigated whether TOP2A expression levels were correlated with TMB and neoantigen load, both of which are closely connected with sensitivity to immune checkpoint inhibitors. In ACC, KICH, LGG, LAML, KIRC, CHOL, SKCM, STAD, BRCA, SARC, PAAD, BLCA, UCEC, PRAD, LUSC and LUAD, TOP2A expression was correlated with TMB (P < 0.05) (Fig. 5B). In LGG, HNSC, STAD, UCEC, BRCA, LUSC, OV and LUAD, TOP2A expression was correlated with neoantigen load (P < 0.05) (Fig. 5C).

Fig. 5

Single nucleotide variation (SNV) analysis of TOP2A in pan-cancer. (A) Lollipop diagrams of TOP2A Missense_Mutation, In_Frame_Ins, Frame_Shift_Ins, Frame_Shift_Del, Nonsense_Mutation, Splice_Site, Translation_Start_Site and In_Frame_Del in different cancer types. (B) Radar chart of the correlation between TOP2A expression and TMB. (C) Radar chart of the correlation between TOP2A expression and neoantigen load.

Correlation between TOP2A and HRD, LOH, MATH, MSI, RNAss and ploidy in pan-cancer

HRD (Homologous Recombination Deficiency) indicates a defect that causes cells to fail to properly repair DNA double-strand breaks. HRD is commonly associated with tumorigenesis and progression, especially for cancer types with familial inherited risk such as breast and ovarian cancer. In clinical practice, HRD testing can provide important information for individualized cancer treatment, such as selecting appropriate PARP inhibitor therapy. LOH (Loss of Heterozygosity) indicates the loss of one of the two alleles of a gene and is usually associated with cancer; LOH testing can provide information about the loss of chromosomal regions that may contain cancer-related genes. MATH (Mutant-Allele Tumor Heterogeneity) is a measure of the heterogeneity of mutations in a tumor. Higher MATH scores indicate greater mutational heterogeneity and are usually associated with poor prognosis. MSI (Microsatellite Instability) indicates instability of microsatellite regions in the genome, usually caused by defects in DNA mismatch repair. MSI is common in certain cancer types, such as colorectal cancer, and is often associated with response to immunotherapy. RNAss is a stemness index derived from the transcriptome of each tumor sample; it quantifies how closely the tumor's expression profile resembles that of stem cells. Ploidy indicates the number of chromosome sets in a cell or tissue and is usually determined from chromosome number and gene copy number. Ploidy assays can provide information about chromosomal abnormalities and gene copy number variants that are important for individualized cancer treatment decisions. We therefore investigated the relationship between TOP2A and these indicators in pan-cancer. The correlations were positive in some cancer types and negative in others (Fig. 6A–F).

Fig. 6

Correlation between TOP2A and HRD, LOH, MATH, MSI, RNAss and ploidy in pan-cancer. Correlation between TOP2A and HRD (A), LOH (B), MATH (C), MSI (D), RNAss (E) and ploidy (F) in pan-cancer.

Immune analysis of TOP2A in pan-cancer

The immune system is of great importance in cancer therapy. The immune system can monitor and inhibit the proliferation and spread of cancer cells, but cancer cells can also evade immune surveillance through immune escape mechanisms. Immunotherapy activates the immune system to enhance the attack on and clearance of cancer cells, thus inhibiting the growth and spread of cancer. It is therefore important to explore the relationship between TOP2A and immunity. TOP2A was significantly differentially expressed among the 6 immune subgroups (Fig. 7A). ImmuneScore, StromaScore and ESTIMATE Score are three computed scores for assessing cancer immune status and the cancer microenvironment. We evaluated the relationship of TOP2A with these three scores in pan-cancer and found that it showed different trends in different cancer types (Fig. 7B). In addition, we analyzed the relationship between TOP2A and immune cells in pan-cancer. We found that T helper cells, and T helper 2 cells in particular, were positively correlated with TOP2A in almost all cancers, whereas Mast cells, Macrophages, Lymph vessels, iDC, Eosinophils, DC, Cytotoxic cells, CD8 T cells and B cells were negatively correlated with TOP2A in almost all cancers (Fig. 7C) (P < 0.05). Furthermore, using XCELL, we found that CD4+ Th2 cells were positively correlated with TOP2A in almost all cancers, whereas stroma score, microenvironment score and immune score were negatively correlated with TOP2A in almost all cancers (Fig. 7D) (P < 0.05).

Fig. 7

Immune subtype of TOP2A in pan-cancer. (A) Expression differences of TOP2A between six pan-cancer immune subtypes. (B) Relationship of TOP2A with ImmuneScore, StromaScore and ESTIMATE Score in pan-cancer. (C) Relationship of TOP2A with Immune cells in pan-cancer. *P ≤ 0.05. (D) Relationship of TOP2A with Immune cells in pan-cancer using XCELL. *P ≤ 0.05.

Immune checkpoint and TIDE score analysis of TOP2A in pan-cancer

Immune checkpoints are biological mechanisms that regulate the activity of the immune system and help the body distinguish its own cells from foreign cells or pathogens. Certain cancer cells exploit immune checkpoint pathways to evade immune surveillance. Therefore, we analyzed the correlations between TOP2A and various immune checkpoints in pan-cancer (Fig. 8A). Tumor Immune Dysfunction and Exclusion (TIDE) is a scoring system for assessing tumor immune escape in terms of T-cell dysfunction and T-cell exclusion. TIDE scores are computed from tumor gene expression data and are used to predict response to immunotherapy. We therefore analyzed the relationship between TOP2A and TIDE score in pan-cancer and found a strong correlation in many cancer types (Fig. 8B). We also compared the ability of TOP2A and associated molecules to predict the response to immunotherapy across different cancers. Our findings revealed that TOP2A exhibits enhanced effects in predicting immunotherapy response in certain cancer types (Fig. 8C).

Fig. 8

Immune checkpoint and TIDE score analysis of TOP2A in pan-cancer. (A) Correlations of TOP2A with various immune checkpoints in pan-cancer. (B) Correlation between TOP2A and TIDE Score in pan-cancer. (C) Response of star molecules and TOP2A to immunotherapy of different cancers.

Pathway activity analysis and cell cycle analysis

A pathway is a series of biochemical reactions or signal transduction processes that collectively contribute to the physiological or pathological functions of a cell or tissue. These pathways encompass various mechanisms, including gene expression regulation, signal transduction, and metabolic processes. Investigating the pathways influenced by TOP2A is crucial to understanding its role in cellular processes and cancer progression. In the pathway relation network, TOP2A-related genes were involved in apoptosis, cell cycle, DNA damage response, epithelial-mesenchymal transition (EMT), the hormone receptors AR and ER, and the PI3K/AKT, RAS/MAPK, RTK and TSC/mTOR signaling pathways (Fig. 9A). In addition, we examined cell cycle changes in HepG2 cells after TOP2A knockdown by flow cytometry. Our findings demonstrated that TOP2A knockdown led to G0/G1 phase arrest in HepG2 cells (Fig. 9B,C).

Fig. 9

Pathway activity analysis and cell cycle analysis. (A) Pathway relation network of TOP2A in pan-cancer. (B,C) Flow cytometric analysis after knockdown of TOP2A in HepG2 cells.

Discussion

DNA topoisomerase II alpha (TOP2A) is a nuclear enzyme encoded by the TOP2A gene and is responsible for maintaining DNA topology through reversible DNA strand breakage and rejoining during cellular processes such as cell division, transcription and DNA repair6,10. The primary role of TOP2A is to facilitate stable genome inheritance, allowing the efficient transmission of genetic material through strand breakage and rejoining6. Because of its involvement in genome regulation, aberrant expression or activity of TOP2A can lead to various disorders in multiple cell types, and in particular to cancer initiation, progression and metastasis. TOP2A exhibits pan-cancer characteristics, and its inhibitors have been extensively used in cancer therapy22,23,24,25. Despite this, the precise role of TOP2A in cancer pathogenesis and immune infiltration remains unclear.

In this pan-cancer analysis, we used public databases to validate the association between TOP2A expression and most types of cancer. By analyzing differential expression between malignant and adjacent tissues, we found aberrant expression of TOP2A in cancers. TOP2A was highly expressed in many carcinomas, with the exception of leukemia, melanoma, myeloma, pancreatic and breast cancer, which is in line with previous studies26,27,28. To examine the relationship with long-term survival, we then compared TOP2A expression level with survival time using Kaplan–Meier analysis. Higher expression was associated with a poorer prognosis in multiple cancers, as shown by the analysis of databases and clinical information. This finding is consistent with reports that TOP2A has become a risk factor for predicting prognosis in cancers29. For example, TOP2A has been considered an aging-related predictive risk gene for prostate cancer30, as well as a prognostic indicator for liver cancer31. In renal clear cell carcinoma, TOP2A is likewise considered a prognostic signature gene, particularly in the high-risk group with hyperactivated cell cycles. Furthermore, recruitment of TOP2A to nuclear stress bodies has been shown to protect against DNA damage after treatment, rendering cancer cells more resistant to etoposide under stress conditions15. Conversely, downregulation of TOP2A expression has been linked to the progression of epithelial-mesenchymal transition via negative modulation of the β-catenin pathway, which affected the proliferation and metastasis of cancer cells32. Targeted trapping of TOP2A has been found to obstruct double-strand recombination, resulting in cell cycle arrest and a substantial decline in DNA damage repair when cells are treated with chemotherapeutic drugs25. Upon reentry into the cell cycle, TOP2A expression once again showed a clear upward trend.
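The Kaplan–Meier comparison described above can be illustrated with a minimal sketch. It assumes a median split of TOP2A expression into high and low groups; the cutoff actually used in the analysis is not stated in this section, and the function names `kaplan_meier` and `split_by_median` are hypothetical.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


def split_by_median(expression, times, events):
    """Split patients into high and low groups at the median (an assumed cutoff)."""
    cutoff = median(expression)
    rows = list(zip(expression, times, events))
    high = [(t, e) for x, t, e in rows if x > cutoff]
    low = [(t, e) for x, t, e in rows if x <= cutoff]
    return high, low
```

Each group returned by `split_by_median` can be unpacked into times and events and passed to `kaplan_meier`; a curve for the high-expression group that lies below the low-expression curve would correspond to the poorer prognosis reported here. A formal comparison would additionally need a log-rank test, which this sketch omits.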

Cell cycle-related pathways were identified as the most significant for TOP2A, reflecting its critical role in mitosis. TOP2A alleviates torsional stress during chromosome preparation by forming transient DNA-protein crosslinks via tyrosine nucleophilic residues and covalent 5’-end bonding. It subsequently reseals cleaved DNA and restores duplex structure to resolve topological challenges. TOP2A is recruited to kinetochores by scaffolding proteins, such as RanGAP1, which protect it from degradation, ensuring proper chromosome segregation and accurate cell division.

TOP2A expression peaks during the S phase, correlating with replicative activity, while its levels are low in non-cycling cells. Flow cytometry confirmed that knockdown of TOP2A reduced the S-phase cell population, emphasizing its essential role in mitosis.

Apoptosis-related pathways were also significant for TOP2A. Telomere shortening triggers apoptosis upon reaching the critical Hayflick limit. However, telomere maintenance mechanisms, such as telomerase activity, allow cancer cells to evade apoptosis. TOP2A trapping induces alternative lengthening of telomeres, as observed in ALT-positive cell lines, further underscoring its involvement in cancer cell survival and proliferation.

TOP2A inhibition is therefore the mechanism underlying first-line chemotherapeutic drugs used in a variety of cancers, such as doxorubicin, teniposide, idarubicin, mitoxantrone, etoposide and epirubicin18,33. These chemotherapeutic agents exhibit antimitotic action and function as inhibitors of DNA transcription, replication, and damage repair, with the response being determined by the level of TOP2A expression34.

Overactivation of cell cycle pathways, particularly of genes involved in the transition from G1 to S phase, is a hallmark of many cancers, including hepatocellular carcinoma. TOP2A expression correlates positively with the activation of cell cycle-related pathways, which underscores its critical role in promoting cell proliferation and cancer growth. The DNA damage response (DDR) pathway is critical for maintaining genomic stability, and its activation is often a response to DNA damage caused by various stressors, including chemotherapy. In our analysis, TOP2A expression was closely correlated with DDR pathway activity, suggesting that DNA breaks induced by TOP2A during replication may activate these repair pathways, thereby promoting cancer cell survival and chemotherapy tolerance. These findings support the development of personalized treatment strategies based on TOP2A status. For example, in cancers with high TOP2A expression or gene amplification, patients may benefit from a regimen that includes a TOP2A-targeted agent, which may be used in combination with other agents that inhibit DNA repair pathways or enhance the efficacy of chemotherapy. Conversely, if a TOP2A mutation or resistance mechanism is identified, targeted therapies or inhibitors of the relevant repair pathways may be used to restore sensitivity to conventional chemotherapeutic agents.

In cancer cells, the expression level and activity of TOP2A are co-regulated by multiple factors, including epigenetic modifications, mutations and signaling pathways35,36. These alterations can significantly affect crucial cellular processes such as the cell cycle, DNA repair, and apoptosis, ultimately influencing cancer cell survival and growth. DNA methylation, an important form of epigenetic modification, is a dynamically regulated process in which hypermethylation of gene promoters has been shown to be a critical driver of cancer progression. Our analysis revealed a significant association between TOP2A expression levels and DNA methylation in a substantial number of carcinomas, most clearly in OV. To further clarify potential mechanisms, we analyzed mutation counts, copy number variation (CNV), tumor mutation burden (TMB), and neoantigens. Deletions and amplifications were the most common alterations. The rate of deletions reached about 80% in KICH, while the frequency of amplifications was about 70% in KIRP. In addition to renal cancers, both deletions and amplifications were relatively common in UCS. Further investigation revealed a significant correlation between CNV and TOP2A expression in up to 18 different cancer types. Previous research identified two TOP2A mutations in breast cancer patients who received immunotherapy (e.g., anti-programmed cell death 1) in the FUTURE trial37. Although no mutation was detected in patients with an objective response to immunotherapy, the TOP2A mutations arose in cases of non-partial response, where structural alterations might have occurred and left DNA unable to bind in the functional regions37. Taken together, these findings suggest that mRNA vaccines encoding TOP2A have great potential for cancer treatment38.
Additionally, immunotherapy response is closely linked to the tumor immune microenvironment (TME). The TME, whose key elements include cancer cells, immune cells, immune checkpoints, stromal cells, and extracellular matrix, offers cancer cells a possible opportunity to evade immune surveillance39. Its significance has been increasingly emphasized in current research, as it affects treatment outcomes and disease prognosis. Our study shows the correlation of TOP2A expression with immune cells in different carcinomas using statistical methods. Th2 cells are well known to secrete IL-4, IL-5, IL-10 and IL-13, among other cytokines, which can inhibit the proliferation of Th1 cells and assist the activation of B cells in humoral immunity. These processes may partially explain the association between TOP2A and poor prognosis from an immunological perspective. Analysis of the TIDE score indicates that immune checkpoint blockade therapy targeting TOP2A would be highly effective in treating large B-cell lymphoma, thyroid cancer, and other cancers.
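The correlations discussed above (TOP2A expression against methylation, copy number, or immune-cell infiltration scores) can be sketched as a rank correlation. The statistic used in the analysis is not named in this section, so Spearman's rank correlation is an assumption here, and the helper names `rank`, `pearson` and `spearman` are hypothetical.

```python
def rank(values):
    """Return 1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))
```

Calling `spearman(expression, infiltration_score)` per cancer type, with both lists ordered by sample, would yield one coefficient per cancer type. A rank-based measure is used in this sketch because expression and infiltration estimates are rarely normally distributed; significance testing and multiple-testing correction are left out.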

According to our pathway analysis, cell cycle-related pathways were the most prevalent and significant pathways for TOP2A, which reflects the effects of TOP2A during mitosis. The supercoiling generated as chromosomes are prepared before mitosis requires TOP2A to relieve torsional stress. TOP2A cleaves nucleic acids by forming a DNA-protein crosslink, in which the nucleophilic tyrosine residue of the topoisomerase is linked to the 5’ end of the DNA via a covalent bond10. After efficiently resealing the cleaved DNA ends, TOP2A restores an intact DNA duplex and thereby resolves the topological problem at stake10. After this preparation, TOP2A is recruited to the kinetochores by scaffolding proteins, and certain molecules, such as RanGAP1, protect TOP2A from degradation40. The presence of TOP2A safeguards proper chromosome segregation and accurate entry into daughter cells at the later stages. The introduction of transient double-strand breaks resolves DNA catenations via intercalation into the DNA helix, safeguarding the smooth progression of mitosis. It has been shown that the abundance of TOP2A is highest in the S phase, with its expression following changes in replicative activity. In contrast, TOP2A expression is low in non-cycling cells. Our flow cytometric analysis corroborated these findings: the proportion of cells in the S phase decreased notably following knockdown of TOP2A expression. This again substantiates the crucial role of TOP2A in mitosis. Additionally, apoptosis-related pathways were among the most significant pathways for TOP2A. Telomere shortening occurs because the terminal sequence of chromosomal DNA in the telomere region cannot be fully replicated, ultimately leading to the initiation of apoptosis when the critical Hayflick limit is reached.
However, telomere maintenance mechanisms allow cancer cells to replicate without limit, with the ribonucleoprotein telomerase adding telomeric repeats to the chromosome ends. TOP2A protein trapping prevents shortening by inducing the alternative lengthening of telomeres, which could be detected in excess at telomeres in alternative lengthening of telomeres-positive cell lines41. We have demonstrated in vitro that HepG2 cells are blocked in the G0/G1 phase after knockdown of TOP2A. Further TOP2A knockdown experiments in HepG2 cells could confirm the specific cause of the G0/G1 block, namely whether it is regulated through inhibition of Cyclin-CDK complex activity or through effects on other genes associated with the G1 phase (e.g., p21, p27). Such experiments would help clarify whether TOP2A can be used as a therapeutic target in LIHC.

In summary, TOP2A was found to be highly expressed in multiple cancers, and high expression was strongly associated with poor prognosis, particularly in OV. Aberrant expression was related to HRD, LOH, MATH, MSI, RNAss, ploidy and immune cell infiltration. TOP2A-based immunotherapy is therefore expected to receive considerable attention, as database analyses have already indicated its significant influence.

Our study also has limitations. Specifically, we acknowledge that combining data from multiple sources, such as TCGA and GTEx, can introduce batch effects due to differences in experimental platforms, tissue preparation protocols, and sample handling. To mitigate this, we applied ComBat (a batch effect correction method) to adjust for batch-related variation between datasets. Additionally, we have addressed potential differences in data quality between TCGA and GTEx, noting that while both datasets are curated and widely used, variations in sample quality and processing could affect the accuracy of our analyses.
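To make the idea of batch adjustment concrete, the sketch below aligns the mean and spread of one gene's values across batches. It is a simplified location-scale adjustment and not ComBat itself: ComBat additionally shrinks the per-batch estimates with an empirical Bayes step and can preserve biological covariates, neither of which is shown. The function name `center_scale_by_batch` is hypothetical.

```python
from statistics import mean, pstdev


def center_scale_by_batch(values, batches):
    """Adjust one gene's values so every batch shares the pooled mean and spread.

    values  : expression values for a single gene, one per sample
    batches : batch label for each sample (for example "TCGA" or "GTEx")

    Simplified illustration only; this is NOT the ComBat algorithm.
    """
    pooled_mean = mean(values)
    pooled_sd = pstdev(values)
    adjusted = [0.0] * len(values)
    for label in set(batches):
        index = [i for i, b in enumerate(batches) if b == label]
        batch_values = [values[i] for i in index]
        batch_mean = mean(batch_values)
        # Guard against zero spread (constant or single-sample batch).
        batch_sd = pstdev(batch_values) or 1.0
        for i in index:
            standardized = (values[i] - batch_mean) / batch_sd
            adjusted[i] = standardized * pooled_sd + pooled_mean
    return adjusted
```

After this adjustment, every batch has the same mean for the gene. When batch coincides with the biological group, as can happen if all normal samples come from one source and all tumor samples from another, this kind of centering also removes genuine differences, which is one reason covariate-aware methods are preferred in practice.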