Introduction

Acute lymphoblastic leukemia (ALL) is hallmarked by a large variety of subtypes with different prognoses and treatment responses. Typically, the primary, subtype-defining driver is a gene fusion or aneuploidy [1], and, unlike most cancers, ALL at initial diagnosis has a distinctly low mutational load [2]. However, a subset of ALL cases shows elevated mutation levels due to the activity of several mutational processes. The activity of these mutational processes can be revealed by identifying associated mutational signatures of single base substitutions (SBSs), double base substitutions (DBSs) and insertion/deletion mutations (indels) [3]. Some of these mutational processes appear to be enriched in specific ALL subtypes. For example, about a third of ETV6::RUNX1-positive cases are affected by APOBEC mutagenesis, and 50–70% of intrachromosomal amplification of chromosome 21 (iAMP21) cases and ~15% of hyperdiploid cases are affected by UV-like DNA damage [4, 5]. APOBEC-associated mutagenesis is proposed to result from deregulation of APOBEC cytosine deaminases that under normal circumstances target single-stranded viral DNA and RNA as part of the innate immune system [6]. Damage by reactive oxygen species (ROS) has been found sporadically in multiple ALL subtypes and affects ~40% of PAX5alt cases and ~60% of TALLMO cases, but so far only a potential association with 20q and 9p deletions has been suggested [5].

We and others have recently observed an elevated mutational load in relapsed ALL [7, 8, 9, 10]. This increase was especially apparent in second relapses and could partly be explained by therapy-related effects [7]. Thiopurine treatment was found to induce specific patterns of mutations in relapsed ALL, depending on whether the ALL is mismatch repair-proficient or -deficient [8, 11]. Currently, our understanding of the type and timing of mutational mechanisms in relapsed ALL is limited, and it is unclear to what extent mutational processes contribute to disease progression.

At initial diagnosis, ALL is primarily localized to the bone marrow, with extramedullary localizations becoming more frequent in relapses [12, 13]. While overall survival and cumulative incidence of relapse have steadily improved in upfront treatment protocols [14], recent relapse treatment protocols have lower overall survival and high relapse rates in both standard risk arms [15, 16] and high risk arms [17, 18]. These numbers suggest that, although fewer patients relapse, second relapses are still a common event within the relapsed ALL group. Treatment of second relapses often represents an even greater challenge, especially if stem cell transplantation and chimeric antigen receptor T-cell therapy have already been performed following initial diagnosis or first relapse [19, 20]. Because they develop multiple subsequent tumors, multiply relapsed patients offer a unique opportunity to investigate the prevalence, impact and timing of mutational processes in relapsed ALL, specifically with regard to treatment-related mutagenicity, as these patients have been treated extensively and repeatedly.

We collected a cohort of 29 Dutch ALL patients across multiple treatment protocols who had multiple relapses. Using whole genome sequencing of the sequential tumor samples of each patient we were able to distinguish the mutational processes active in relapsed ALL and track the activity of mutational processes over time. This allowed us to investigate the timing and duration of mutational processes during disease progression as well as their impact on relapse development.

Materials and methods

Cohort

Twenty-nine multiply relapsed patients, with ≥2 bone marrow relapses, were included. A cohort of in-house produced whole genome sequencing of 114 ALL initial diagnosis samples and four ALL relapse samples was used as reference.

Sequencing

Whole genome sequencing was performed on an Illumina NovaSeq 6000 (Illumina, San Diego, CA) using S4 flow cells to generate 150 base-pair paired-end reads. Sequencing quality was consistent throughout the cohort with insert sizes of ~500 bp and duplicate rates of 8–15%. Average coverage was 15x (50 Gbase) for remission samples, 30x (100 Gbase) for tumor samples and 60x (200 Gbase) for tumor samples with a lower blast percentage. The reference cohort was entirely sequenced by the Princess Maxima Center diagnostic laboratory with 30x coverage for remission samples and 90x coverage for tumor samples. Targeted deep resequencing was performed for 32 039 somatic SBSs found in 23 tumors belonging to six patients with a custom Roche panel KAPA HyperChoice MAX 20 Mb (Roche, Basel, Switzerland). Sequencing was performed on a NovaSeq 6000 using an S1 flow cell, with >1000x sequencing depth. Following GATK’s best practices [21], the Burrows-Wheeler aligner (BWA) [22] was used to map the reads of each sample to the GRCh38 human reference genome. Subsequently, duplicate reads were identified and marked using Picard v2.20.1 [23] and base quality score recalibration was performed with GATK v4.0.1.2. Mutect2 from GATK v4.1.1.0 was used for mutation calling of somatic SBSs, DBSs and indels using all available tumor and normal samples per patient. Mutations were annotated using Variant Effect Predictor (VEP) v105 [24], filtered using a compiled list of ALL drivers [8, 9, 25, 26, 27] (Table S1) and classified as pathogenic if VEP v105 scored a high or moderate impact or if the CADD score [28] was ≥20.
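The pathogenicity rule at the end of this paragraph can be written as a small predicate. The following is a minimal sketch, not the pipeline used in this study: it assumes that the rule reads as "(high or moderate VEP impact) or (CADD ≥ 20)" and that it is applied only to genes on the driver list. The function name, argument names and example genes are hypothetical.

```python
from typing import Optional, Set

# VEP reports IMPACT as HIGH, MODERATE, LOW or MODIFIER.
PATHOGENIC_IMPACTS = {"HIGH", "MODERATE"}
CADD_THRESHOLD = 20.0  # threshold stated in the text


def is_predicted_pathogenic(gene: str, vep_impact: str,
                            cadd: Optional[float],
                            driver_genes: Set[str]) -> bool:
    """Apply the driver-list filter, then the impact-or-CADD rule.

    Assumption: a variant without a CADD score can still qualify
    through its VEP impact alone.
    """
    if gene not in driver_genes:
        return False
    high_impact = vep_impact.upper() in PATHOGENIC_IMPACTS
    high_cadd = cadd is not None and cadd >= CADD_THRESHOLD
    return high_impact or high_cadd


# Hypothetical driver list for illustration (the real list is in Table S1).
drivers = {"KRAS", "NT5C2"}
print(is_predicted_pathogenic("KRAS", "MODERATE", None, drivers))
```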

Clustering somatic SBS

SBSs were clustered using a custom binary grouping algorithm. This method simplified the data by transforming the mutant allele frequency (MAF) of all tumors for each mutation into binary values. Present mutations, with a MAF above 0, were assigned a value of 1, whereas absent mutations with a MAF of 0 were assigned a value of 0. Mutations with the same sequence of binary values were assigned to the same cluster. Next, the MAF distribution was assessed for each time point of all mutation clusters, to determine if multiple (sub)clones were present in a single mutation cluster. If the MAF showed a multimodal distribution, the cluster was split at the minimum between modalities.
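The binary grouping step described above can be sketched in a few lines. This is an illustrative reconstruction from the description, not the custom code used in this study. The function name and the example MAF values are hypothetical, and the subsequent splitting of clusters with a multimodal MAF distribution is not shown.

```python
from collections import defaultdict


def binary_grouping(maf_by_mutation):
    """Group mutations by their presence/absence pattern across tumors.

    maf_by_mutation maps a mutation ID to a list of MAFs, one per tumor
    sample in chronological order. Returns {pattern: [mutation IDs]}.
    """
    clusters = defaultdict(list)
    for mutation_id, mafs in maf_by_mutation.items():
        # MAF above 0 -> present (1); MAF of 0 -> absent (0).
        pattern = tuple(1 if maf > 0 else 0 for maf in mafs)
        clusters[pattern].append(mutation_id)
    return dict(clusters)


# Hypothetical MAFs at initial diagnosis, relapse 1 and relapse 2.
example = {
    "mut_a": [0.45, 0.50, 0.48],  # preserved at all time points
    "mut_b": [0.40, 0.47, 0.51],  # preserved at all time points
    "mut_c": [0.00, 0.30, 0.49],  # acquired at relapse 1
    "mut_d": [0.00, 0.00, 0.22],  # acquired at relapse 2
    "mut_e": [0.12, 0.00, 0.00],  # lost after initial diagnosis
}
clusters = binary_grouping(example)
for pattern, members in clusters.items():
    print(pattern, members)
```

In this toy example the pattern (1, 1, 1) corresponds to the cluster of mutations preserved from initial diagnosis through all relapses.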

Mutational profile analyses

R v4.1.2 was used to generate 96-trinucleotide count matrices and mutational profiles for each cluster consisting of at least 75 mutations using the R package MutationalPatterns v3.4.1 [29]. Next, de novo mutational signature extraction was performed on the combined mutation matrix of 133 clusters from the multiply relapsed cohort, 113 ALL initial diagnosis samples of the reference cohort and four ALL relapse samples sequenced by the Princess Maxima Center diagnostic laboratory, using R v4.1.2 packages MutationalPatterns v3.4.1 and non-negative matrix factorization (NMF) v0.24.0.
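To make the 96-trinucleotide representation concrete, the sketch below maps a substitution and its flanking bases to one of the 96 channels and counts them per cluster. The analysis itself was done with the R package MutationalPatterns; this Python version only illustrates the standard convention of reporting each substitution relative to the pyrimidine (C or T) of the base pair, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# 6 substitution types x 4 five-prime bases x 4 three-prime bases = 96.
CHANNELS = [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS
            for five, three in product("ACGT", repeat=2)]


def channel(trinucleotide, alt):
    """Map a substitution at the middle base of a trinucleotide to a channel."""
    five, ref, three = trinucleotide.upper()
    alt = alt.upper()
    if ref in "AG":
        # Purine reference: report the event on the opposite strand,
        # i.e. reverse-complement the context and complement the alt base.
        five, ref, three = COMPLEMENT[three], COMPLEMENT[ref], COMPLEMENT[five]
        alt = COMPLEMENT[alt]
    return f"{five}[{ref}>{alt}]{three}"


def count_vector(mutations):
    """Return 96 counts, ordered as CHANNELS, for (trinucleotide, alt) pairs."""
    counts = Counter(channel(tri, alt) for tri, alt in mutations)
    return [counts[c] for c in CHANNELS]


# A G>A change in a TGA context is the same event as C>T in TCA.
print(channel("TGA", "A"))
```

Stacking one such vector per mutation cluster gives the count matrix on which signature extraction operates.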

More detailed method description is given in the supplementary data.

Results

Mutational load in multiply relapsed ALL

We identified 63 cases of multiply relapsed ALL diagnosed between 1992 and 2018 (Table S2), and selected patients by availability of material and high blast percentage (Tables S3 and S4). We collected and whole genome sequenced 97 tumor samples and 44 remission samples from 29 ALL patients (28 precursor B-ALL and 1 T-ALL) with two (n = 18) or more (n = 11) relapses (Fig. 1, Table S3). At time of diagnosis, the 29 ALL patients had a median mutational load of 1133 (196–6988) somatic SBSs and a median indel load of 110 (12–236) (Fig. 2A, B). In relapse, mutational load was increased compared to initial diagnosis in 28 out of 29 patients and, in 24 of them, further increased in subsequent relapses (Fig. 2). Fourteen out of 29 patients reached mutational loads above the hypermutation threshold of ~1.3 mutations per Mb [7] (Fig. 2D). We did not find a significant correlation between mutational load and time to relapse (linear regression analysis for relapse 1: R = 0.31, p = 0.11; relapse 2: R = 0.13, p = 0.53).
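For orientation, genome-wide SBS counts can be converted to mutations per Mb and compared with the hypermutation threshold. The sketch below assumes a genome size of roughly 3100 Mb; this denominator is an assumption for illustration and is not stated in the text, so the resulting values are approximate.

```python
GENOME_SIZE_MB = 3100.0        # assumption: ~3.1 Gb, not a value from the text
HYPERMUTATION_THRESHOLD = 1.3  # mutations per Mb, as cited in the text


def mutations_per_mb(sbs_count, genome_mb=GENOME_SIZE_MB):
    """Convert a genome-wide SBS count to mutations per Mb."""
    return sbs_count / genome_mb


def is_hypermutated(sbs_count, genome_mb=GENOME_SIZE_MB):
    """True if the load lies above the hypermutation threshold."""
    return mutations_per_mb(sbs_count, genome_mb) > HYPERMUTATION_THRESHOLD


# Median and maximum SBS load at initial diagnosis, as reported above.
for load in (1133, 6988):
    print(load, round(mutations_per_mb(load), 2), is_hypermutated(load))
```

Under this assumed genome size, the threshold corresponds to roughly 4000 SBSs per genome.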

Fig. 1: Timeline of disease progression in multiply relapsed ALL.

The y-axis shows patient IDs and color-coded subtypes. The x-axis shows age in years. Dots represent whole genome sequenced samples with initial diagnosis in orange and relapses in blue. Grey crosses represent relapses that were not included for sequencing due to unavailability or low blast percentage. Vertical black bars represent bone marrow stem cell transplantations (SCT).

Fig. 2: Mutational load in 29 multiply relapsed ALL patients.

A Box plot of SBS load in multiply relapsed ALL at each tumor time point. B Box plot of indel load in multiply relapsed ALL at each tumor time point. C Bar plot of the percentage of patients affected by a mutational signature, other than SBS1 or SBSA, at each tumor time point. The multiply relapsed ALL cohort is compared to a reference cohort of initial diagnosis ALL samples labeled as “Ref Dx”. D Overview of the number of SBSs at each tumor time point grouped per patient. Time points indicate the initial diagnosis (0) or sequential relapses (1, 2, etc.). Mutational load was based on the mutation clusters present at each tumor time point. Colors in the bars indicate the contribution of mutational signatures identified at each tumor time point (see legend panel). Below each group of samples is the color-coded subtype (see legend panel) and patient ID. The mutational profiles of patients P0122 and P0608 have been published before [41]. Dx initial diagnosis, R relapse.

Clonal dynamics of somatic mutations

To enable a spatial and temporal analysis of mutational processes during disease progression we clustered the somatic SBSs of each patient based on their mutant allele frequency (MAF) at each time point. As these clusters of mutations represent isolated mutational events we can infer clonal dynamics in every patient. In all cases, we defined a cluster of mutations that was present at initial diagnosis and preserved in all subsequent relapse samples, thus representing the mutations in the (pre)malignant founding clone. First relapses were derived from the major (n = 13) or minor (n = 6) clone at initial diagnosis, or were derived from a preleukemic ancestor or a subclone which was not detected at initial diagnosis (n = 8; Fig. S1, Table S5). One first relapse (P0180) contained multiple clones derived from initial diagnosis. Second or third relapses developed from a major clone (n = 16 and n = 5), minor clone (n = 6 and n = 5) or multiple clones (n = 3 and n = 0) of the previous time point (Fig. S1, Table S5). Remarkably, case P0098 shared no mutations between the first and second relapse, other than those already present at initial diagnosis, suggesting that both relapses developed independently from the major clone at initial diagnosis. Clusters of acquired mutations arose in most subsequent relapses and, in 22 cases, in each subsequent relapse included in the study (Fig. S1, Table S5). These findings indicate that a linear progression model applies to the majority of the multiple relapse cases, in which mutation accumulation preceded the outgrowth of new relapses.

To validate the accuracy of our mutation calling and clustering procedure with whole genome sequencing data, we selected six patients on which we performed targeted deep resequencing in all collected samples. Of the successfully targeted somatic SBSs, we validated 99.89% (Table S6). The median depth of over 1000x revealed more accurate MAF estimates with a decreased variability of data points at each time point in each cluster (Fig. S2). Overall, cluster assignment was correct for 97.3% of the SBSs (Table S7). As expected, the accuracy of MAF estimates in these six cases was lower for mutations that appeared subclonal (range 56–90%) or absent (range 86–98%) at a time point prior to clonal expansion (Fig. S3, Table S7). Nevertheless, correction of cluster assignment based on the deep sequencing validations had very limited impact on the overall size and mutational profiles (Fig. S3, Table S8). Therefore, we consider our clustering approach on whole genome sequencing-based MAF estimates valid.

Mutational processes in ALL at diagnosis

To further unravel the processes that were active during disease progression and relapse development in the multiply relapsed ALL patients, we performed mutational signature analysis. Using de novo signature extraction on 133 SBS clusters extracted from the 29 multiply relapsed patients, we identified eight de novo mutational signatures, of which five strongly resembled known SBS signatures found in the curated mutational signatures database by Catalogue Of Somatic Mutations In Cancer (COSMIC) (cosine similarity >0.9) [3]. Two could be explained by a combination of two COSMIC signatures each (SBS2/SBS13 and SBS17a/SBS17b) (Fig. S4) and one de novo signature, SBSA, did not resemble a known COSMIC signature (Table S9). To determine the contribution of mutational processes in our samples, we subsequently applied a strict refit of the nine COSMIC signatures and the de novo signature SBSA to the 133 SBS clusters (Figs. 2D, S1 and S5, Table S10) and an in-house reference cohort of 114 ALL initial diagnosis samples (Table S11).
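The comparison of de novo signatures with COSMIC signatures relies on cosine similarity between two 96-channel profiles. A minimal sketch of that comparison is given below. The helper names and the toy four-channel profiles are hypothetical, and the 0.9 cut-off is the one stated above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-negative profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("profiles must not be all zeros")
    return dot / (norm_a * norm_b)


def best_match(de_novo, reference_signatures, threshold=0.9):
    """Return (name, similarity) of the closest reference signature,
    or None if no reference exceeds the threshold."""
    name, profile = max(reference_signatures.items(),
                        key=lambda item: cosine_similarity(de_novo, item[1]))
    similarity = cosine_similarity(de_novo, profile)
    return (name, similarity) if similarity > threshold else None


# Toy 4-channel profiles; real profiles have 96 channels.
references = {"sig_x": [0.7, 0.1, 0.1, 0.1], "sig_y": [0.1, 0.1, 0.1, 0.7]}
print(best_match([0.68, 0.12, 0.10, 0.10], references))
print(best_match([0.25, 0.25, 0.25, 0.25], references))
```

A flat profile, as in the second call, matches neither toy reference above the threshold. This mirrors how a broad de novo signature can fail to resemble any single reference signature.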

Six of the ten SBS signatures were detected at time of initial diagnosis in the multiply relapsed cohort and the reference cohort, of which SBS1 (90 and 93%, respectively) and SBSA (86 and 86%, respectively) were present in the majority of cases (Tables S10 and S11). SBS18, associated with ROS DNA damage, was detected across various ALL subtypes in 34 and 27% of the diagnosis samples in the multiply relapsed and reference cohorts, respectively. In line with previous data, UV-like DNA damage (SBS7a) was common in iAMP21-positive and hyperdiploid subtypes, and APOBEC mutagenesis (SBS2 and SBS13) was found to be enriched in the ETV6::RUNX1-rearranged subtype (Tables S10 and S11). In our multiply relapsed ALL cohort, UV-like damage was also encountered in two B-other cases, and high loads of APOBEC-associated mutations were detected in a BCR::ABL1-like ALL patient with Down syndrome (P0122). Overall, we conclude that the same mutational processes affect patients in the reference cohort (36%) and in the cohort of multiply relapsed patients (48%) at similar frequencies (Fig. 2).

Recurrence of mutational processes at relapse

We then investigated the timing and continuity of mutational processes in the multiple sequential samples of each patient (Fig. 3A, B). Similar to initial diagnosis, SBS1 and SBSA most often contributed to the increase in mutational load in relapses (49 and 59% of relapse samples, respectively; Table S10). In three of the ten cases with SBS18-associated mutagenesis at initial diagnosis, ROS-associated mutagenesis continued at relapse (Fig. S6). Furthermore, case P0180 acquired SBS18-associated mutations only at second relapse (Fig. S6). Overall, SBS18 contributed with small relative numbers of mutations at both initial diagnosis and relapse and, therefore, only had a minor impact on the elevated mutational load (Table S10).

Fig. 3: Prevalence and timing of mutational processes.

A Overview of clusters of mutations detected in case P0121 as a representative example. On the left are plots of the change in mutant allele frequency over time for all individual mutations in each cluster. Mutant allele frequency and the name of each cluster are displayed on the y-axis and the sample time points are on the x-axis. Mutations in a region with a copy number gain (blue) or loss (red) were colored. The middle plots represent the 96-trinucleotide mutational profiles corresponding to each cluster. On the right are bar charts displaying the absolute number of mutations (x-axis) attributed to color-coded mutational signatures in each cluster. Clusters with fewer than 200 mutations were not assigned mutational signatures and were named unassigned. Furthermore, mutations attributed to mutational signatures that were assigned in fewer than 75% of bootstraps (see methods) were also classified as unassigned. B Fish plot of clonal development in case P0121. Colors correspond to the mutation clusters shown in A. Relative contribution of APOBEC-associated mutations to mutations acquired at each time point is depicted below the fish plot. C Relative contribution of SBS7a-associated mutations to acquired mutations per time point, in all four SBS7a-positive cases in the cohort. Color intensity represents the relative contribution. Broken lines indicate missing samples. D Relative contribution of mutations associated with SBS2 and SBS13 in six cases, depicted as in C.

The UV-associated signature SBS7a, observed in B-other case P0608 and iAMP21 case P0609 at initial diagnosis, was also present in newly arising clones at subsequent relapses in both patients. Additionally, B-other case P0614 and hyperdiploid case P0617 had no UV-like damage at initial diagnosis but acquired UV-like DNA damage in two separate relapses (Figs. 3C and S1). Only in case P0617 did we notice a potential link to UV light, as the third relapse localized to the eye of the patient. These data suggest that iAMP21, hyperdiploid and some B-other ALL cases are prone to develop UV-like DNA damage at different time points in disease development.

In three cases with APOBEC-associated mutations at initial diagnosis, APOBEC mutagenesis appeared to be absent from (P0066 and P0122) or very subtly present in (P0180) the cluster of preserved full-clonal mutations, suggesting that in these cases this process initiated at a later stage of leukemia development. Two ETV6::RUNX1-rearranged cases (P0121 and P0140) presented with the first acquired APOBEC-associated mutations only at time of first relapse (Fig. 3). All five cases showed evidence of continuing APOBEC mutagenesis in later relapses. B-other case P0616 acquired APOBEC-associated mutations only in an extramedullary second relapse localized to the chest (Figs. 3D and S1). Together, these findings show that the mutational processes recurrently detected at initial diagnosis have the potential to fade out, continue or newly arise at relapse.

Treatment-associated mutational processes

We identified three relapse-specific mutational signatures that were previously reported to be induced by treatment-related mutational processes (Fig. 4) [8]. Mutational signature SBS87, associated with thiopurine-induced DNA damage, strongly contributed to newly acquired mutations in relapse, affecting in total 15 of the 29 patients. In line with the extensive use of thiopurines in treatment of initial diagnosis and first relapse, SBS87 presented most often at first (n = 10) and second (n = 9) relapse, but was also observed in three later relapses (Fig. 4C). Strikingly, only seven patients were affected by SBS87 at multiple relapses, despite the ubiquitous use of thiopurines during treatment (Fig. 4C). We found no significant correlation between time in complete remission and the amount of acquired SBS87 mutations (p = 0.8, Fig. 4D).

Fig. 4: Prevalence and timing of treatment-associated mutational processes.

A Overview of clusters of mutations detected in case P0619 as a representative example, similar to Fig. 3A. B Fish plot of clonal development in case P0619. Colors correspond to the mutation clusters shown in A. Relative contribution of SBS86, SBS87 and SBS17a/b to the mutations acquired at that time point is shown below the plot. C Relative contribution of SBS87-associated mutations to acquired mutations per time point for 15 cases. Color intensity represents the relative contribution. Broken lines indicate missing samples. D Comparison of the number of acquired SBS87 mutations (y-axis) and remission time (in months, x-axis). Each dot represents a relapse sample with new SBS87 mutations. A regression line and corresponding R-value, p-value and formula are shown. E Relative contribution of mutations associated with SBS86 in six cases, depicted as in C.

In six patients we found SBS86-associated mutations. Thus far, SBS86 has only been identified in relapsed ALL samples and is therefore likely induced by a component of ALL treatment [8], but its etiology is currently unknown. In three cases, SBS86-associated mutations were encountered in multiple relapses, and in five cases SBS86 was present in combination with SBS87 (Fig. 4E).

In case P0619, with a PAX5alt driving event and mutational signatures SBS1, SBSA and SBS18 at initial diagnosis, we identified three relapse-associated signatures (Fig. 4A). In addition to SBS86 and SBS87, this patient presented with subclonal and clonal SBS17-associated mutations at first and second relapse, respectively (Figs. 4A, B and S1). SBS17 and SBS18 have both been associated with ROS, via 8-Oxo-dGTP and 8-Oxo-dG, respectively [30]. SBS17 has previously been associated with various exposures [30, 31], which are likely all sources of 8-Oxo-dGTP rather than the direct cause of DNA damage, making it difficult to pinpoint which process caused an accumulation of 8-Oxo-dGTP in patient P0619. Finally, relapses following stem cell transplantation did not present with structurally different mutational processes than earlier time points (Table S12).

A basal mutational process in ALL

The de novo signature SBSA (Fig. S7A) was found to contribute to the mutational profile of most multiply relapsed cases and initial diagnosis samples (Figs. 2D and S1), and had high cosine similarity with five COSMIC signatures (Fig. S7B) of which SBS5 and the blood-specific SBSBlood have been associated with ALL and hematopoietic cells, respectively [5, 32]. Eleven cases presented with SBSA as the dominant mutational mechanism that remained active at subsequent relapses (Figs. 2D and S1). The hyperdiploid case P0144, with five relapses, was the most striking example with SBSA at initial diagnosis and ~450–1200 newly acquired SBSA mutations in each relapse (Figs. S7C-D). We suspected that SBSA presented with higher mutational loads in older patients. Indeed, SBSA behaved similarly to the known clock-like SBS1 [33], with the mutational load of SBSA positively correlating with age, both in the multiply relapsed patients and the reference cohort (Fig. S7E–G). Moreover, the mutational load of SBSA in relapse samples positively correlated with length of remission, similar to the known clock-like SBS1 (Fig. S7H). These findings indicate that SBSA represents a clock-like mutational process introducing background mutations in ALL samples, similar to SBS5 and SBSBlood in other studies [5, 32].
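The clock-like behavior described above amounts to a positive linear relationship between SBSA load and age or remission length. The sketch below computes a Pearson correlation coefficient and a least-squares line. The ages and mutation counts are hypothetical and chosen only to illustrate the calculation; they are not cohort data, and no p-value is computed here.

```python
import math


def _sums(xs, ys):
    """Return means and centered sums of squares/products for two series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def slope_intercept(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    if sxx == 0:
        raise ValueError("slope is undefined for constant xs")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical ages (years) and SBSA-attributed mutation counts.
ages = [2.0, 5.0, 8.0, 11.0]
sbsa_loads = [300.0, 600.0, 900.0, 1200.0]
print(pearson_r(ages, sbsa_loads), slope_intercept(ages, sbsa_loads))
```

In such a fit, the slope can be read as the number of signature-attributed mutations gained per year, which is the quantity of interest for a clock-like process.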

Driver gene analysis

Next, we investigated whether the pronounced activity of mutational processes in ALL with multiple relapses contributed to disease development by introducing relapse-driving mutations. We identified a total of 61 (predicted) pathogenic mutations affecting known ALL driver genes in 26 out of 29 patients (Table S13). Most common were known hotspot RAS pathway mutations in KRAS (n = 5), NRAS (n = 4) and PTPN11 (n = 4; Table S13) of which eleven were acquired or became clonal at relapse. Additionally, we found mutations in various genes known to drive relapse by therapy evasion, including CREBBP (n = 4), TP53 (n = 3), NR3C1 (n = 2) and NT5C2 (n = 1) (Table S13). Seven of these relapse driving mutations were acquired at relapse, while three CREBBP mutations were present already at initial diagnosis.

We then calculated the a posteriori probability that each potential driver mutation was caused by a mutational signature. We found that 30 of the 61 driver mutations were a consequence of a mutational process other than SBS1 or SBSA (Table S13). Interestingly, of eleven mutations attributable to APOBEC mutagenesis or UV-like DNA damage, processes which can be active both at diagnosis and relapse, eight were acquired at relapse. Of these eleven mutations, only a KRAS mutation in case P0122, likely caused by SBS13, was found at a known mutational hotspot position (Fig. S8). Nevertheless, the other ten mutations may still have contributed to disease progression. For example, P0608 acquired two previously undescribed NR3C1 mutations, likely caused by SBS7a, of which one introduced a stop codon, which may have caused glucocorticoid resistance (Fig. S8, Table S13). Four mutations were attributed to SBS86, of which a PTPN11 mutation both matched the trinucleotide context and was located in a known mutational hotspot (Fig. 5A). Twelve mutations were attributed to SBS87, including two TP53 mutations, two NRAS mutations and one NT5C2 mutation in known mutational hotspots (Fig. 5B). Consequently, the activity of intrinsic and treatment-related mutational processes caused (likely) pathogenic driver mutations in multiply relapsed ALL patients that may have contributed to relapse development.
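One common way to obtain such an a posteriori probability is to weight, for the trinucleotide channel of the mutation, each signature's channel probability by the number of mutations attributed to that signature in the same cluster, and then normalize. The sketch below follows that formulation as an assumption; the exact procedure used here is described in the supplementary data, and the exposures and channel probabilities shown are hypothetical.

```python
def signature_posteriors(exposures, channel_probabilities):
    """Posterior probability per signature for one mutation.

    exposures: {signature: number of mutations attributed to it in the
        cluster that contains the mutation}.
    channel_probabilities: {signature: probability of the mutation's
        96-channel context under that signature}.
    """
    weights = {sig: exposures[sig] * channel_probabilities[sig]
               for sig in exposures}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no signature can explain this channel")
    return {sig: weight / total for sig, weight in weights.items()}


# Hypothetical cluster: 100 mutations from SBS1 and 300 from SBS87, with
# the mutated channel three times more likely under SBS87 than under SBS1.
posteriors = signature_posteriors(
    exposures={"SBS1": 100, "SBS87": 300},
    channel_probabilities={"SBS1": 0.01, "SBS87": 0.03},
)
print(posteriors)
```

A mutation is then assigned to the signature with the highest posterior. Both the activity of a process in the clone and the fit of the trinucleotide context contribute to that assignment.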

Fig. 5: Potentially pathogenic mutations caused by SBS86 and SBS87.

96-trinucleotide contexts of COSMIC mutational signatures SBS86 (A) and SBS87 (B) with potentially pathogenic mutations, which were likely caused by that mutational signature according to a posteriori probability analysis. For each mutation, the patient, gene and mutated amino acid are indicated. Bar plot on the right depicts the probability score for each mutational signature for the identified mutations. Known hotspot mutations (black) and predicted pathogenic mutations (grey) that match the most prominent trinucleotide contexts of the signature (3% contribution or more) are depicted.

Discussion

In this study, we comprehensively investigated the mutational processes acting during clonal evolution in multiply relapsed pediatric ALL patients. A total of ten mutational processes were identified, which were recurrently and temporarily active, including both intrinsically driven and treatment-associated processes. The analysis of clone-specific mutation clusters provided in-depth insights into the clonal origin and spatiotemporal activity of these mutational processes. In this way, we were able to detect mutational activity in subclones that would have remained unnoticed in a sample-based analysis. For example, in case P0609 we identified a small cluster of SBS87-associated mutations unique to the second relapse, in addition to an overwhelming load of SBS7a-associated mutations. Furthermore, the mutation clusters enabled a more accurate timing of mutational events, as in case P0619, where a subclonal population with a high contribution of SBS17a and SBS17b was identified at first relapse, which would otherwise have been attributed solely to the second relapse. Thus, the analysis of mutational processes in clusters of mutations with similar dynamics in allelic burden across multiple samples per patient facilitates the detection and timing of these processes with greater accuracy.

All mutational processes detected in this study showed signs of continued activity over longer time periods during tumor evolution or, at least, in different clones. This continuity is to be expected for clock-like signatures like SBSA and SBS1, which by definition accumulate over time, but is less obvious for the other mutational processes. SBS7a, which is widely reported to be a result of UV-induced DNA damage [3, 34], affected four patients in our study, all at multiple time points. We could only identify a possibly UV-exposed lesion in patient P0617, who suffered an eye relapse. The site of origin of SBS7a-associated damage in the other patients may be the skin, but remains uncertain. Our data point to APOBEC mutagenesis as a prominent intrinsic secondary driver with recurrent activity in the ETV6::RUNX1 subtype. Nevertheless, APOBEC mutagenesis did not occur at every time point in affected patients and also did not affect all ETV6::RUNX1 patients (Fig. 2) [4, 5, 8], suggesting that additional triggers are involved in initiating APOBEC mutagenesis. Finally, the common but subtle contribution of ROS-associated SBS18 is in line with the prevalence of SBS18 and deregulation of ROS in cancer in general [3, 35]. Additionally, a number of drugs used in ALL treatment, like cytarabine and daunorubicin, harm cells by upregulation of ROS [36]. This repeated exposure to high levels of ROS could be another source of SBS18 and, perhaps, SBS17a and SBS17b in relapsed ALL samples. We conclude that mutational processes in ALL relapses are highly continuous and frequently show subtype specificity, likely depending on additional, as yet unknown, intrinsic and/or external triggers.

We identified SBSA, a clock-like mutational signature characterized by a very broad spectrum of mutations, which represents a common mutational process in ALL patients. SBSA shows similarity to several known mutational signatures, including the clock-like signatures SBS5 and SBSBlood, but appears to be different. Since it has been shown that mutational signatures, including SBS5, can vary in presentation depending on tissue of origin [37], it is possible that SBSA should be seen as a leukemic or hematopoietic presentation of SBS5. SBS5 has often been attributed to leukemic samples [4, 5, 7, 8], but our approach of de novo mutational signature extraction using (sub)clone-specific clusters of mutations and the high load of SBSA in several mutation clusters may have enabled a more accurate, tissue-specific definition of this rather indistinctive SBS signature in leukemia. As such, we consider SBSA to represent a blood- or leukemia-specific presentation of SBS5.

We noted a high prevalence of treatment-related mutagenesis across the different ALL subtypes within our cohort, with treatment-related mutational signatures SBS86 and/or SBS87 affecting over half of the studied patients in one or more relapses, and potentially treatment-related mutational signatures SBS17a and SBS17b affecting one patient in both relapses. We found no subtype specificity or other shared genetic alterations that explained the accumulation of SBS86 and SBS87. The presence of SBS87 illustrates that thiopurines entered leukemic cells and caused DNA damage, but were unable to kill them. This could be the result of intrinsic resistance to thiopurines, e.g. by acquired mutations in NT5C2 or PRPS1 [38, 39]. Indeed, SBS87 may have induced an NT5C2 hotspot mutation in one of the patients in our cohort (P0058), as well as several other relapse-associated mutations. However, considering the wide prevalence of SBS87 in our and other studies [8, 11], additional factors likely play a role, including resistance to other components of the therapy regimen, local escape mechanisms, temporary senescence or insufficient exposure to thiopurine because of patient-specific variability in thiopurine metabolism. The latter scenario could be investigated in future studies by associating SBS87 prevalence with the concentration of DNA-incorporated thioguanine nucleotides in follow-up blood samples, which has been reported to correlate with incidence of relapse [40].

In conclusion, we comprehensively characterized mutational processes in a cohort of Dutch pediatric multiply relapsed ALL patients and observed increased activity of a variety of intrinsic and treatment-associated mutational mechanisms. These mutational processes can affect ALL patients from the time of initial diagnosis, but can also arise, fade or re-appear in (sub)clones during disease progression. Furthermore, treatment-related mutational processes caused relapse-driving mutations in multiple patients, thereby likely supporting relapse development. The relevance of identifying and monitoring the activity of mutational processes in ALL requires further studies, but minimizing treatment-related DNA damage promises to be beneficial for these patients.