Abstract
Background
Genomic instability is crucial in tumorigenesis, with Tumour Mutation Burden (TMB) being a biomarker to indicate therapeutic effectiveness, particularly in immunotherapy. However, TMB is not always a reliable predictor and displays heterogeneity. Non-B DNA, susceptible to mutations, play a significant role in cancer development, indicating their potential merit when combined with mutation for enhanced markers in cancer.
Methods
We assessed mutations and non-B DNA interplay as biomarkers. Our methodology quantifies tumour mutations and their co-localization with non-B DNA, using survival and drug sensitivity assessments for clinical relevance.
Results
We introduce two novel markers, ‘nbTMB’ (non-B-informed tumour mutation burden) and ‘mlTNB’ (mutation-localised tumour non-B burden). In case studies: (1) nbTMB informs on survival heterogeneity among TMB-high patients undergoing immunotherapy whereas TMB is unable to further differentiate; (2) nbTMB informs on altered cisplatin sensitivity among ovarian cancer cell lines whereas TMB is unable to differentiate; and (3) mlTNB informs on survival heterogeneity among early-stage pancreatic cancer progressors in whom other markers of genomic instability fail to differentiate.
Conclusions
These novel markers offer a nuanced approach to enhance our understanding of treatment responses and outcomes in cancer, underscoring the need for a comprehensive exploration of the interplay between non-B and B-DNA features.
Similar content being viewed by others
Introduction
Genomic instability, marked by numerous genetic changes due to impaired DNA repair processes, stands as a hallmark of cancer [1]. It plays an important role in contributing to the cancer heterogeneity, therapeutic resistance and association with poor prognosis [2, 3]. Tumour Mutation Burden (TMB) stands as a crucial metric in oncology, quantifying the total mutations within tumour. It serves as a biomarker for the effectiveness of immunotherapies, particularly immune checkpoint inhibitors [4]. A higher TMB often indicates a more favourable response to immunotherapies, providing a vital tool for clinicians in customising therapeutic strategies and anticipating treatment outcomes [5]. However, the use of TMB and its threshold is not uniform across different cancer types, indicating the need for more nuanced exploration.
Within the genomic landscape, non-B DNA structures emerge as notable entities [6,7,8]. These structures deviate from the conventional B-DNA double helix to form alternative structures [6]. They disrupt the standard processes of DNA replication and transcription, thereby laying the foundation for genetic instability. The propensity of these structures to induce mutations underscores their critical role in cancer initiation and progression [9,10,11,12,13]. Their increased susceptibility to change gives rise to an abundance of population variants linked to non-B DNA motifs and an amplified frequency of somatic mutations at these sites, notably in cancer contexts. Even though numerous variants tied to non-B DNA motifs may not have a profound impact, these motifs play a pivotal role in the genetic diversity of the human genome [14, 15]. As a result, they stand out as primary areas of interest for disease development and genetic discrepancies [15]. It is essential to factor in the importance of non-B DNA motifs for predicting mutation frequencies and evaluating potential disease risks, while developing new biomarkers in the context of cancer.
In this study, we aim to introduce two novel biomarkers: nbTMB (non-B-informed TMB) and mlTNB (mutation-localised tumour non-B DNA burden). These biomarkers are designed to quantify the multifaceted nature of genomic instability by accounting for both tumour mutations and non-B DNA at the sample level. Our goal is to evaluate the nbTMB (a mutation burden) in terms of its ability to inform on the survival heterogeneity among patients with high TMB undergoing immunotherapy, and to investigate whether it could serve as a differentiating marker for drug sensitivity, particularly in the context of cisplatin treatment in ovarian cancer. Additionally, we aim to assess the potential of mlTNB (a non-B burden) as a marker for survival and outcome risk stratification in early-stage pancreatic cancer.
These aims are based on the hypothesis that a more nuanced approach, incorporating non-B DNA information, can enhance the predictive power of traditional biomarkers and potentially guide the development of targeted therapeutic strategies. Through our investigative lens, we seek to elucidate whether these novel biomarkers can provide a more detailed understanding of tumour behaviour and treatment responses, ultimately contributing to the advancement of precision oncology.
Results
The design of nbTMB and mlTNB, based on non-B and mutation co-localisation
Our investigation unveils two novel biomarkers: nbTMB (non-B-informed TMB) and mlTNB (mutation-localised tumour non-B DNA burden), aiming to quantify the multi-dimensions of genomic instability using both tumour mutations and non-B DNA at a sample level.
First, nbTMB quantifies mutations co-localised within non-B forming regions (Fig. 1a) as a non-B informed tumour mutation burden TMB. The mutation signatures are first extracted from each tumour profile and then mutation sites overlapped with non-B DNA motifs are further quantified at genomic-wide. We also calculate nbTMB percentage (nbTMBp) to describe the proportion of tumour mutations co-localised with non-B DNA motifs, relative to total TMB.
a Schematic representation of non-B-informed mutation quantification, showing the calculation of nbTMBp and its role as an indicator of TMB composition, bTMB (traditional Tumour Mutation Burden), and nbTMB (non-B-informed Tumour Mutation Burden), emphasising the fraction of mutations associated with non-B DNA structures. b Kaplan-Meier survival analysis contrasting overall survival (OS) in patients for Pan-Can patients (N = 867) with high vs. low TMB undergoing immunotherapy, highlighting the prognostic significance of TMB levels. The high TMB is define defined as the top 20% within each cancer type. c Stratification of pan-cancer patient survival outcomes by TMB levels, with a secondary differentiation based on OS status. Among pan-can patients categorised by TMB levels (high or low), a further distinction into ‘alive’ and ‘deceased’ based on overall survival reveals that the deceased cohort consistently exhibits a higher nbTMBp percentage across both TMB categories. d Venn diagram depicting the gene overlap between the MSKCC-Panel-468, genes with non-B DNA mutations and genes linked to immune checkpoint inhibitor (ICI) response, suggesting a potential connection between non-B DNA mutations and immunotherapy effectiveness. e, f Comparison of survival rates within the high TMB patient subset, showing that higher nbTMBp is associated with decreased survival, thereby offering additional stratification within this group. g, h Analysis within the high TMB category, indicating that further dissecting patients on TMB exclusively does not significantly stratify patients’ survival outcomes. i, j Clustering of ovarian cell lines by nbTMBp to identify distinct groups with varying levels of nbTMBp, potentially reflecting differential drug sensitivity. k nbTMBp shows a linear trend of increasing drug resistance of Cisplatin. This is evident in both IC50 metrics (where a lower value indicates increased sensitivity) and dose-response AUC (where a higher value indicates increased sensitivity). It demonstrates that increased nbTMBp correlates with heightened resistance to Cisplatin in ovarian cancer cell lines, suggesting the utility of nbTMBp in predicting drug response. Such a correlation is absent in the case of another platinum-based compound Carboplatin. l Heatmap and ridge plot illustrating the lack of a clear relationship between TMB alone and drug sensitivity of Cisplatin and Carboplatin in ovarian cancer cell lines, indicating the potential advantage of incorporating nbTMBp for more refined predictions. m Distribution of ovarian cancer cell lines across TMB-based clusters, with the colour gradient representing the range of TMB values. n Box plots showing drug sensitivity metrics across TMB-defined clusters for Cisplatin and Carboplatin, indicating the absence of a consistent pattern between TMB and drug response, thus underscoring the potential value of nbTMBp in therapeutic decision-making.
Second, mlTNB refers to mutation-localised tumour non-B DNA burden as a quantification of non-B DNAs. Different from nbTMB, the marker, mlTNB, focuses on the counts of non-B motifs that contain mutation sites (Fig. 2a). Due the various non-B types, the mlTNB is further calculated by non-B motif types. For the comparison across non-B types and across samples, the burden value will also be normalised by both the number of mutations and the motif library size.
a Diagram delineating the categorisation of tumour non-B DNA burdens: total non-B burden (TNB), mutation-localised tumour non-B burden (mlTNB) and mutation-free tumour non-B burden (mfTNB). b Clustering analysis of early-stage pancreatic cancer patients with progression, resulting in seven distinct patient sample clusters. Each group is characterised by unique mlTNB loads, which are differentiated by the type of non-B DNA motifs present, such as direct repeats (DR), G-quadruplexes (G4), inverted repeats (IR), mirror repeats (MR), short tandem repeats (STR) and Z-DNA (ZDNA). c Kaplan-Meier plots illustrating progression-free survival for the seven patient clusters identified, providing insight into the prognostic value of mlTNB categorisation. d Summary of pathway enrichment analysis, which underscores the predominant gene mutation signatures associated with each of the mlTNB clusters, potentially linking molecular pathways with patient prognosis. e Comparative overview of the distribution of fractional genome alteration (FGA), TMB and microsatellite instability (as measured by MANTIS score) across the clusters, indicating a lack of significant variation among the groups, which all show low TMB (median TMB < 2), low FGA (median < 30%) and MSS (median MANTIS score < 0.3), underscoring mlTNB as a driving factor stratifying the patients. f Comparison of PFS between the cluster with the most extended median PFS (mlTNB-DR, cluster 3) and the cluster with the shortest median PFS (mlTNB-ZDNA, cluster 5), emphasising the differential impact of non-B DNA motif types on patient outcomes. g Visual representation of the chromosomal distribution of non-B DNA and mutation co-localisations contributing to the mlTNB burden across different chromosomes. The x-axis represents samples, and the y-axis represents chromosomes. Each column corresponds to a sample, and each coloured grid indicates a mutation/non-B DNA co-localisation within a specific chromosome for that sample.
The two markers are calculated for each tumour profile at sample level. The mutation signatures are extracted from each tumour profile. The genomic-wide non-B forming region are further overlapped with the mutated regions for each tumour profile. Using the overlapped region by counting separately the number of mutation and non-B motifs involved, we are able to derive the two metrics, nbTMB and mlTNB, as the new biomarker to reflect the interplay between mutation and non-B DNA. The metric was further refined by optional normalisations to predict patient prognosis and treatment responses.
In this study, nbTMBp was applied to immunotherapy treatment in ovarian cancer due to the high heterogeneity observed in patient responses within the high-TMB group. nbTMBp refines TMB by quantifying the percentage of mutations co-localised with non-B DNA, providing additional stratification. Conversely, mlTNB was developed for pancreatic cancer, an ‘immune-cold’ cancer with typically low TMB. mlTNB focuses on non-B DNA burden, accounting for the impact of non-B DNA structures on genomic instability.
nbTMB differentiates survival among TMB-High patient post-immunotherapy
We first describe a Pan-Can immunotherapy analyses in which nbTMBp appears linked with prognosis. High TMB has been reported to be associated with improved immunotherapy response [16,17,18]. However, within TMB-high patients, outcomes remain heterogenous. Herein, we explore the heterogeneity with TMB high/low groups using nbTMBp to investigate its role as a biomarker associated with post-immunotherapy survival.
We analysed the mutation data from patients who underwent immunotherapy based on the MSK-IMPACT study from 11 different cancer types [17]. Within each caner type, using the 80th percentile of TMB, we assigned patients into TMB -high and -low groups and compared their overall survival (OS) (Fig. 1b). We further stratified patients within each of TMB-High and -Low groups based on their status—alive/censored or deceased. We defined nbTMB for each patient sample by quantifying the numbers of mutations co-localised with non-B forming regions [19]. When comparing median nbTMBp across groups, the TMB-high group had a lower nbTMBp overall, relative to the TMB-low group (Fig. 1c). Within each TMB classified group, median nbTMBp was significantly higher in deceased patients, irrespective of their high/low status (Fig. 1c). A gene-level analysis of immune response signatures [20] revealed an 86% overlap between mutations co-localised with non-B motifs and immune checkpoint inhibitor-outcome-linked genes (n = 98) (Fig. 1d).
Next, we performed clustering on nbTMBp within the TMB-High patients, which revealed two patient subgroups (Fig. 1e), one with median nbTMPp of around 10% that was associated with significantly (p < 0.01) shorter OS as compared to TMB-High patients with less than 10% nbTMBp (Fig. 1f). For comparison, the same analysis was applied to TMB in which no significant OS difference was observed (Fig. 1g–h). Altogether, our findings lend support for the further study of nbTMBp as a potential marker of differential survival within TMB-high patients on immunotherapy. These results may reflect the potential contribution from non-B DNA genomic instability in some patients that have poor survival, despite having high TMB.
Increasing nbTMB is associated with decreased cisplatin sensitivity in ovarian cancer
We next explore the use of nbTMBp as a marker of altered cisplatin drug sensitivity in ovarian cancer. Cisplatin resistance is a major hurdle in effectively treating ovarian cancer [21]. Although cisplatin is commonly used for ovarian cancer treatment, drug resistance often arises due to a faulty apoptotic process, reducing treatment effectiveness [22,23,24,25,26]. Among the 57 ovarian cell lines with mutation profiles [27], ~40% have TMB greater than ten and a median fraction of genome altered (FGA) of ~50%, which supports the potential role of genomic instability in treatment resistance. Investigating how cells signal in response to chemotherapy from the non-B DNA perspective of genomic instability may shed light on treatment outcomes.
We defined an ovarian cell line specific mutation signature and corresponding nbTMBp for use in a cluster analysis that identified three cell line groups of varying (low to high) nbTMBp (Fig. 1i, j). Median nbTMBp significantly differed among the three clustered cell line groups. Tests of association between TMB, FGA and tumour grade with nbTMBp-derived cell line clusters lacked significance, as did a correlation between TMB and nbTMBp among the ovarian cell lines. We examined the effect of clusters on cisplatin drug sensitivity in which increasing nbTMBp was significantly associated with decreasing cisplatin sensitivity. This finding was consistent for dose-response AUC with cisplatin (Fig. 1k). For comparison, we performed the same analyses on carboplatin sensitivity which did not show the same result, suggesting a cisplatin specific nbTMBp effect. Additionally, the same cluster analysis on TMB failed to show a similar result (Fig. 1l, m). Altogether, our findings show support for the further exploration of nbTMBp as a potential marker of cisplatin sensitivity that may help to explain resistance when all other markers indicate otherwise.
mlTNB differentiates survival among early-stage pancreatic patient progressors
In contrast to nbTMBp, we next explore the use of mlTNB to quantify non-B burden and its association with survival in pancreatic adenocarcinoma (PAAD). PAAD is a highly aggressive cancer with poor outcome. Existing genomic instability measures have not proven informative in differentiating survival into clinically translatable patient groups for risk stratification [28]. As opposed to focusing on mutation numbers, mlTNB quantifies non-B DNA regions that contain mutation sites which can be classified according to non-B structure types to provide a more nuanced perspective [7] (Fig. 2a).
Using the mutation profiles of 76 TCGA early-stage pancreatic patients who progressed, we quantified mlTNB for each sample and used it in a cluster analysis resulting in seven patient groups with differentiated non-B structure types (Fig. 2b) that significantly differed in progression-free survival (PFS) (Fig. 2c). Patients with high mlTNB characterised mainly by direct repeat regions (high mlTNB-DR burden) was associated with the longest PFS (n = 7, median = 25 months), while patients with high mlTNB in Z-DNA regions had the shortest (n = 10, median = 5 months). PFS among other groups were similar: the patient group with high mlTNB from short tandem repeat regions (STR) (n = 13, median = 11 months); the sample group with mlTNB from mirror repeats (MR) without inverted repeats (IR) (n = 7, median = 12 months); and the group with MR with IR (n = 9, median = 15 months).
According to the pathway analysis on gene signatures of sample clusters (see Methods), the high mlTNB-DR burden patients had mutation signatures enriched in MAPK and Notch signalling pathways, as compared to the other clusters enriched with double-stranded break and mismatch repair (cluster 1-IR), hedgehog and WNT signalling (cluster 2-STR) and interleukin-4 signalling (cluster 6-MR) pathways (Fig. 2d). There was a lack of significant association between mlTNB clusters with age, race, sex, PAAD subtypes [29, 30], KRAS mutation status and tumour purity. Additionally, there was no significant association between the non-B DNA clusters with markers of genomic instability, TMB, FGA, and MSI-score (measured by MANIS score) [31], whose distributions were similar across clusters and on average, were low in value (Fig. 2e). Specifically, the median TMB was 1.43, the median FGA, 0.13 and the median MSI-score was 0.28 across all groups. In the shortest PFS (high mlTNB-ZDNA, cluster 5-ZDNA) (Fig. 2f), chromosome 7 had the highest prevalence of non-B mutation co-localisation, while the longest PFS (high mlTNB-DR, cluster 3-DR) patient group non-B mutation co-localisation resided mainly on chromosome 5 (Fig. 2g). Our results lend support to the further study of mlTNB as a differentiating marker of survival in PAAD patients that may further inform on their heterogeneous response to treatment.
Discussion
We explored the integration of non-B DNA information within mutation burden in cancer by the introduction of two makers: nbTMB and mlTNB. Our analysis demonstrates the distinct applications of nbTMBp and mlTNB in different cancer contexts. While nbTMBp enhances the predictive power of TMB in immunotherapy response, mlTNB provides valuable insights into survival outcomes in pancreatic cancer by emphasising non-B DNA burden. The results show the added value from inclusion of non-B DNA data, lending support to the multi-dimension roles of genomic instability arising from non-B structures and their potential impact on cancer prognosis and treatment efficacy.
Innovation
A highlight of this study revolves around the predictive potential of these biomarkers. For instance, the differentiating capacity of nbTMBp in determining immunotherapy response underscores the importance of decomposing mutation burden into multiple features, such as non-B DNA. Similarly, the insights derived from the ovarian cancer case, where an increasing nbTMBp revealed heightened cisplatin sensitivity, highlights the potential treatment associated response. On the other hand, non-B-specific structures associated with mlTNB presents prediction capability in patient outcome in early-stage pancreatic cancer. Within our discussion, we have underscored the promising potential of integrating non-B DNA information with genomic variant assessments. This integration represents a significant step towards a more sophisticated understanding of tumour genomics and may offer a valuable complement to existing TMB refinements, such as analyses focused on mutations exclusively. Our approach rests on the hypothesis that the genomic landscape, which is intricately shaped by mutation load coupled with the presence of non-B DNA structures, can provide profound insights into tumour behaviour and responsiveness to treatment. The data presented herein suggests that our biomarkers, nbTMB and mlTNB, by capturing the complexity of the tumour genome—beyond simple mutation counts—have the potential to convey unique prognostic information. Such information may remain concealed if the assessment is limited to mutation loads alone. While our study did not perform a direct comparison with every variant of TMB enhancement, the initial findings advocate for the value of our biomarkers and set the stage for subsequent comparative analyses. These future studies will be crucial in delineating the relative advantages of different TMB assessment strategies, furthering our collective goal of refining precision oncology tools.
Mechanism research
While we demonstrated the patient grouping with clustering analyses, optimising the thresholds for biomarkers with continuous value in clinical application remains a challenge. While the correlation between nbTMB/mlTNB and treatment responses is compelling, the mechanism remains unexplored. Future prospective studies are essential to elucidate the mechanisms underlying the observed associations as well as to validate the clinical applicability of these biomarkers. The potential hypothesis includes that the frequency and specific localisation of mutagenesis at non-B DNA motifs could influence genomic stability and, consequently, tumour behaviour [10]. Additionally, the interplay between tandem repeats, microsatellite instability and DNA repair processes could elucidate the propensity for cancer development and progression [32]. Future research, building on our findings, should aim to dissect these complex interactions. For instance, studies could focus on characterising the mutational landscape in the vicinity of non-B DNA structures and investigating the biological consequences of such mutations. Understanding the impact of non-B DNA on the functionality of DNA repair pathways could unveil new therapeutic targets. We explored the potential of nbTMBp as a marker among patients with high-TMB who received immunotherapy across various cancer types. Trends were observed in specific cancer types, such as Glioma and Urothelial cancer, where a higher proportion of patients with elevated nbTMBp were found to have a deceased status. But future studies with larger sample sizes will be needed to validate these preliminary observations. Furthermore, longitudinal studies tracking the evolution of these genomic features in response to treatment could shed light on their role in treatment resistance and disease recurrence. While our study focused on the co-localisation between non-B DNA motifs and mutations for understanding the interactions that influence cancer progression and treatment response, the inclusion of non-B DNA motifs without mutations as a genomic control is conceptually important. Future work will explore genomic regions containing non-B DNA structures without mutations to provide a broader context and validate our findings further. It is our hope that this initial exploration serves as a catalyst for more detailed investigations into the biological underpinnings of non-B DNA-related genomic instability. Such work is essential not only for the advancement of fundamental cancer biology but also for the translation of these genomic insights into practical, clinical interventions that can significantly improve patient outcomes.
Clinical utility
While our initial findings are promising, we recognise the need for a comprehensive evaluation in a more diverse array of cancers and demographic backgrounds to solidify the biomarkers’ universal relevance. Future studies should aim to encompass a broader dataset, particularly those inclusive of varied immunotherapy responses, to determine the robustness of our biomarkers across different genomic landscapes and treatment regimens. Such expansion is crucial not only for validating the efficacy of nbTMB and mlTNB but also for exploring their prognostic capabilities across a wider patient demographic.
For the clinical workflow integration, transitioning from research to real-world clinical application, the integration of nbTMB into clinical workflows raises important considerations. The detection of non-B regions requires genome-wide sequencing capabilities, which are becoming increasingly available in clinical settings. However, the implementation of such comprehensive testing necessitates the development of streamlined protocols and cost-effective methodologies. One potential avenue to explore is the creation of panel-wise nbTMB, which would parallel the current clinical approach to TMB, allowing for a more targeted and economical application while retaining the enriched information provided by non-B DNA contexts. The logistics of incorporating these biomarkers into standard care must be addressed to ensure their practicality and accessibility for clinical use.
In summary, we present two cancer biomarkers of non-B DNA associated burden that, with the role of non-B DNA in genomic instability, offers a novel expansion over existing markers for potentially explaining heterogenous outcomes and treatment responses in cancer.
Methods
The calculation of nbTMB
nbTMB (non-B co-localisation tumour mutation burden)
nbTMB focuses on mutations, particularly those co-localised with non-B regions. The steps to calculate this metric include: (1) identify mutation signature within the dataset for each sample; (2) examine each mutation sites to determine whether it is co-localised with non-B motifs; (3) count all mutations that fall within these regions to derive the nbTMB value. As a derived metric, nbTMB Percentage (nbTMBp) provides a normalised perspective of nbTMB in relation to the total TMB to quantify the information complexity of the TMB. It is calculated using the formula:
Where:
nbTMB = Number of mutations co-localised with non-B regions.
TMB = Total tumour mutation burden for a given sample.
The calculation of mlTNB
mlTNB (mutation co-localised non-B Burden)
mlTNB is a metric that quantifies the burden of mutations co-localised with non-B motifs. To calculate mlTNB (mutation-localised tumour non-B Burden), we followed these steps: (1) Identify the mutation signature for each sample, including SNVs and indels. (2) Examine the genomic positions of non-B DNA motifs and determine if they overlap with mutation sites. (3) Count all non-B DNA motifs containing at least one mutation to derive the mlTNB value.
We chose a cutoff of one mutation to be inclusive and retain more signals, given the average length of non-B DNA (~25 bases). This decision ensures comprehensive capture of significant mutations within non-B DNA motifs.
For extension, considering there are multiple type non-B type such as G4, H-DNA, Z-DNA and so on, mlTNB can be derived for each of the non-B type specifically. Additionally, to enable comparison between samples and non-B types, the mlTNB is further normalised by the mutation size factor (divide the counts by total mutation length in each sample) and the non-B library factor (divide the counts by the total size of each non-B type). The calculation is described below:
Where:
mlTNB = mutation-localised tumour non-B burden
The mutation co-localised non-B Burden is calculated using the ‘burden in batch’ function from the web server NBBC (Non-B DNA Burden in Cancer) [7].
Mutation signatures for CCLE and TCGA
The mutation data was extracted from two major repositories: the Cancer Cell Line Encyclopaedia (CCLE) and The Cancer Genome Atlas (TCGA) [33]. Both sources offer a comprehensive view of mutational landscapes. The mutation data for patient samples and cell lines are separately downloaded from UCSC Xena [34]. The specific dataset used for our analysis was identified as CCLE_DepMap_18Q2_maf_20180502. It contains somatic mutations include Single Nucleotide Polymorphisms and small INDELs (insertions/deletions). Mutational calls have been merged, focusing on the coding region and filtering out germline mutations. For patient mutation data, the somatic mutation dataset from TCGA is labelled as ‘mc3.v0.2.8.PUBLIC.maf.gz’ [35]. The genome assembly for both datasets is hg19 build. Early-stage pancreatic cancer TCGA patients were defined by Stage I or II pancreatic adenocarcinoma with a predicted tumour purity [36] of at least 25% and no history of neuroendocrine diagnoses, prior treatments, or non-pancreatic origins [37].
Non-B forming motifs data source
The data for non-B motifs was originally sourced from the Non-B DB 2.0 database [38] and downloaded the NBBC (non-B burden in cancer) web server [7]. The dataset comprises seven non-B structures: A-phased repeats (n = 2386), G-quadruplex motifs (n = 361,232), Z-DNA motifs (n = 404,192), inverted repeats (n = 5,771,570), mirror repeats (n = 1,378,864), direct repeats (n = 1,113,354) and short tandem repeats (n = 2,826,360).
Genomic and survival data for Immunotherapy Patients
The Pan-Can immunotherapy data was sourced from processed mutation and clinical data using the Memorial Sloan Kettering Cancer Center (MSKCC) project available through cBioPortal [39]. This dataset includes genomic and survival information from 1,661 tumour-normal pairs, covering a diverse range of cancer types. All samples within this collection were sequenced using the MSK-IMPACT assay.
Clustering analysis
The K-means clustering was performed using the baseR kmeans function. The visualisation of clustering results in heatmaps are further conduct by the ComplexHeatmap package [40]. The K-means algorithm is sensitive to the magnitude of the data, and variables with larger scales can dominate the clustering results. Before the application of the K-means algorithm, the dataset was standardised to ensure that each feature contributed equally to the clustering process. The optimal number of clusters is determined be consensus clustering and evaluation methods including the elbow method and silhouette scores [41].
Pathway analysis on gene signatures of sample clusters
While calculating the mlTNB for each tumour sample, we identified mutations that colocalize with each type of non-B DNA motif. Based on these mutations, we extracted the genes where these mutations are located and formed a signature for each non-B DNA type in each sample. For each patient cluster containing several samples, we created a union of the genes from each sample’s signature to form a gene set for each cluster. This gene set was then used to perform pathway analysis for each cluster.
Drug sensitivity and survival comparison
The drug sensitivity data was obtained from the CREAMMIST database [42]. The survival data for TCGA was downloaded from cBioportal under TGCA PanCan Altlas Studies [39, 43]. The boxplot visualisations are generated with ggplots [44] and ggpubr [45] package. The survival analysis is conduct by survival and ggsurvfit R packages.
The definition of high-TMB status and the potential use of nbTMB in high-TMB status
The determination of high tumour mutation burden (high-TMB) status is adapted to be cancer type-specific rather than a fixed universal threshold. Recognising the variability of TMB benchmarks across different cancer types, high-TMB is defined as the top 20th percentile within each cancer type’s TMB distribution. This percentile-based approach ensures that the high-TMB classification is relevant and reflective of the mutational landscape inherent to each cancer type.
The potential use of non-B-informed tumour mutation burden (nbTMB) in high-TMB status serves as a novel and refined stratification method. Within the cohort of patients classified as high-TMB by this relative definition, nbTMB provides an additional layer of distinction by quantifying the proportion of mutations co-localised with non-B DNA motifs. This biomarker is designed to identify subgroups of high-TMB patients who may have different clinical outcomes due to the influence of non-B DNA regions on genomic stability and treatment response. Through this methodology, we aim to explore the prognostic implications of nbTMB in high-TMB patients across various cancer types.
Data availability
The immunogenomic dataset derived from Samstein et al.’s cohort are available at https://www.cbioportal.org/study/summary?id=tmb_mskcc_2018. The scripts for analyses in this study is available under a GPL-3.0 license in the Kowalski Lab GitHub repository (https://github.com/kmlabdms/MLNB).
References
Andor N, Maley CC, Ji HP. Genomic instability in cancer: teetering on the limit of tolerance. Cancer Res. 2017;77:2179–85.
Negrini S, Gorgoulis VG, Halazonetis TD. Genomic instability—an evolving hallmark of cancer. Nat Rev Mol Cell Biol. 2010;11:220–8.
Shen Z. Genomic instability and cancer: an introduction. J Mol Cell Biol. 2011;3:1–3.
Valero C, Lee M, Hoen D, Wang J, Nadeem Z, Patel N, et al. The association between tumor mutational burden and prognosis is dependent on treatment context. Nat Genet. 2021;53:11–15.
Karamitopoulou E, Andreou A, Wenning AS, Gloor B, Perren A. High tumor mutational burden (TMB) identifies a microsatellite stable pancreatic cancer subset with prolonged survival and strong anti-tumor immunity. Eur J cancer. 2022;169:64–73.
Wang G, Vasquez KM. Dynamic alternative DNA structures in biology and disease. Nat Rev Genet. 2023;24:211–34.
Xu Q, Kowalski J. NBBC: a non-B DNA burden explorer in cancer. Nucleic Acids Res. 2023;51:W357–64.
Xu Q, Kowalski-Muegge J. Mutation-site localized non-B DNA burden and survival heterogeneity in early-stage pancreatic cancer. J Clin Oncol. 2023;41:4166.
Georgakopoulos-Soares I, Morganella S, Jain N, Hemberg M, Nik-Zainal S. Noncanonical secondary structures arising from non-B DNA motifs are determinants of mutagenesis. Genome Res. 2018;28:1264–71.
McGinty RJ, Sunyaev SR. Revisiting mutagenesis at non-B DNA motifs in the human genome. Nat Struct Mol Biol. 2023. https://doi.org/10.1038/s41594-023-00936-6.
Guiblet WM, Cremona MA, Harris RS, Chen D, Eckert KA, Chiaromonte F, et al. Non-B DNA: a major contributor to small- and large-scale variation in nucleotide substitution frequencies across the genome. Nucleic Acids Res. 2021;49:1497–516.
Bacolla A, Ye Z, Ahmed Z, Tainer JA. Cancer mutational burden is shaped by G4 DNA, replication stress and mitochondrial dysfunction. Prog Biophys Mol Biol. 2019;147:47–61.
Xu Q, del Mundo IMA, Zewail-Foote M, Luke BT, Vasquez KM, Kowalski J. MoCoLo: a testing framework for motif co-localization. Brief Bioinforma. 2024;25:bbae019.
Makova KD, Weissensteiner MH. Noncanonical DNA structures are drivers of genome evolution. Trends Genet. 2023;39:109–24.
Bansal A, Kaushik S, Kukreti S. Non-canonical DNA structures: diversity and disease association. Front Genet. 2022;13:959258.
Fusco MJ, West HJ, Walko CM. Tumor mutation burden and cancer treatment. JAMA Oncol. 2021;7:316.
Samstein RM, Lee C-H, Shoushtari AN, Hellmann MD, Shen R, Janjigian YY, et al. Tumor mutational load predicts survival after immunotherapy across multiple cancer types. Nat Genet. 2019;51:202–6.
Rousseau B, Foote MB, Maron SB, Diplas BH, Lu S, Argilés G, et al. The spectrum of benefit from checkpoint blockade in hypermutated tumors. N Engl J Med. 2021;384:1168–70.
Cer RZ, Donohue DE, Mudunuri US, Temiz NA, Loss MA, Starner NJ, et al. Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools. Nucleic Acids Res. 2013;41:D94–D100.
Long J, Wang D, Wang A, Chen P, Lin Y, Bian J, et al. A mutation-based gene set predicts survival benefit after immunotherapy across multiple cancers and reveals the immune response landscape. Genome Med. 2022;14:20.
Mansouri A, Zhang Q, Ridgway LD, Tian L, Claret FX. Cisplatin resistance in an ovarian carcinoma is associated with a defect in programmed cell death control through XIAP regulation. Oncol Res. 2003;13:399–404.
Song M, Cui M, Liu K. Therapeutic strategies to overcome cisplatin resistance in ovarian cancer. Eur J Med Chem. 2022;232:114205.
Herr I, Debatin KM. Cellular stress response and apoptosis in cancer therapy. Blood. 2001;98:2603–14.
Makin G, Dive C. Apoptosis and cancer chemotherapy. Trends Cell Biol. 2001;11:S22–26.
Parker RJ, Eastman A, Bostick-Bruton F, Reed E. Acquired cisplatin resistance in human ovarian cancer cells is associated with enhanced repair of cisplatin-DNA lesions and reduced drug accumulation. J Clin Investig. 1991;87:772–7.
Havasi A, Cainap SS, Havasi AT, Cainap C. Ovarian cancer-insights into platinum resistance and overcoming it. Medicina. 2023;59:544.
Ghandi M, Huang FW, Jané-Valbuena J, Kryukov GV, Lo CC, McDonald III, et al. Next-generation characterization of the cancer cell line encyclopedia. Nature. 2019;569:503–8.
Raphael BJ, Hruban RH, Aguirre AJ, Moffitt RA, Yeh JJ, Stewart C, et al. Integrated genomic characterization of pancreatic ductal adenocarcinoma. Cancer Cell. 2017;32:185–203.e113.
Bailey P, Chang DK, Nones K, Johns AL, Patch AM, Gingras MC, et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 2016;531:47–52.
Moffitt RA, Marayati R, Flate EL, Volmar KE, Loeza SGH, Hoadley KA, et al. Virtual microdissection identifies distinct tumor-and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nat Genet. 2015;47:1168–78.
Kautto EA, Bonneville R, Miya J, Yu L, Krook MA, Reeser JW, et al. Performance evaluation for rapid detection of pan-cancer microsatellite instability with MANTIS. Oncotarget. 2017;8:7452.
Polleys EJ, Freudenreich CH. Homologous recombination within repetitive DNA. Curr Opin Genet Dev. 2021;71:143–53.
Ghandi M, Huang FW, Jane-Valbuena J, Kryukov GV, Lo CC, McDonald ER, et al. Next-generation characterization of the cancer cell line encyclopedia. Nature. 2019;569:503–8.
Goldman M, Craft B, Zhu J, Haussler D. The UCSC Xena system for cancer genomics data visualization and interpretation. Cancer Res. 2017;77:2584–2584.
Ellrott K, Bailey MH, Saksena G, Covington KR, Kandoth C, Stewart C, et al. Scalable open science approach for mutation calling of tumor exomes using multiple genomic pipelines. Cell Syst. 2018;6:271–281.e277.
Qin Y, Feng H, Chen M, Wu H, Zheng X. InfiniumPurify: an R package for estimating and accounting for tumor purity in cancer methylation research. Genes Dis. 2018;5:43–45.
Raphael BJ, Hruban RH, Aguirre AJ, Moffitt RA, Yeh JJ, Stewart C, et al. Integrated genomic characterization of pancreatic ductal adenocarcinoma. Cancer Cell. 2017;32:185–203.e113.
Cer RZ, Donohue DE, Mudunuri US, Temiz NA, Loss MA, Starner NJ, et al. Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools. Nucleic Acids Res. 2012;41:D94–D100.
Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6:pl1.
Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–9.
Rupji M, Dwivedi B, Kowalski J. NOJAH: NOt Just Another Heatmap for genome-wide cluster analysis. PloS ONE. 2019;14:e0204542.
Yingtaweesittikul H, Wu J, Mongia A, Peres R, Ko K, Nagarajan N, et al. CREAMMIST: an integrative probabilistic database for cancer drug response prediction. Nucleic Acids Res. 2022;51:D1242–48.
Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2:401–4.
Wickham H. ggplot2. Wiley Interdiscip Rev Comput Stat. 2011;3:180–5.
Kassambara A, Kassambara MA. Package ‘ggpubr’. R package version 0.1. 6 (2020).
Acknowledgements
This work was supported by grant [RR160093 to support JK], research funds from the Department of Oncology, Dell Medical School [to JK].
Author information
Authors and Affiliations
Contributions
JK conceived of the idea, directed the analyses plan, and edited the manuscript. QX performed the analyses, drafted the figures and initial manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Xu, Q., Kowalski, J. Non-B DNA-informed mutation burden as a marker of treatment response and outcome in cancer. Br J Cancer 131, 1825–1832 (2024). https://doi.org/10.1038/s41416-024-02873-7
Received:
Revised:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1038/s41416-024-02873-7