Introduction

Uveal melanoma (UM) is the most common primary intraocular tumor in adults, affecting ~2–8 individuals per million per year1. Metastatic disease eventually develops in approximately 30 to 50% of patients, mainly in the liver, with a median progression-free survival (PFS) and overall survival (OS) of ~3–4 months and ~10–16 months, respectively2. For patients with liver metastases, no systemic therapy had demonstrated an OS benefit until the recent approval of tebentafusp. Tebentafusp is a bispecific protein consisting of an affinity-enhanced, HLA-A*02:01-restricted T-cell receptor specific for a gp100-derived peptide, fused to an anti-CD3 single-chain variable fragment that can recruit and activate polyclonal T cells. Although tebentafusp showed an OS benefit in the randomized phase 3 trial IMCgp100-202 (NCT03070392)3, its efficacy in the real-world setting remains unclear. Importantly, tumor responses according to RECIST criteria4 were observed in only a fraction of patients (~10%), and discrepancies between radiographic response and OS were reported3,5. Therefore, easily accessible, sensitive and specific methods that allow better monitoring of individuals with metastatic UM (MUM) treated with tebentafusp in real life are needed.

Circulating tumor DNA (ctDNA), a fraction of total circulating cell-free DNA (ccfDNA), has become a pivotal biomarker for diagnosing, predicting outcomes, and monitoring treatment efficacy in various cancer types6,7. UM provides an interesting model for testing ctDNA detection because of the almost constant presence of clonal hotspot mutations, predominantly involving GNAQ, GNA11, SF3B1 or PLCB4 and CYSLTR2 to a lesser extent1. Prior evidence from our group and others demonstrated the detectability of ctDNA in individuals affected with MUM using either droplet digital PCR (ddPCR) or targeted next-generation sequencing (NGS)8,9,10,11,12. Both approaches have high sensitivity and specificity, each with its own advantages and limitations in detecting mutations. ddPCR requires prior knowledge of the tumor mutational status so that primers specific to the targeted mutations can be used, but it is cheaper, has a faster turnaround time and does not depend on bioinformatics analyses. In contrast, targeted NGS is agnostic and covers a larger panel of detectable mutations, possibly with a global copy number profile. Therefore, the choice between these approaches will depend on the objectives of the study, the availability of the respective technologies, and the expected mutations in the tumor.

Exploratory analyses from the phase 2 IMCgp100-102 and phase 3 IMCgp100-202 tebentafusp clinical trials3,5 suggest that ctDNA detection with a multiplex PCR-based targeted NGS panel holds promise as an early predictive indicator of tebentafusp clinical efficacy. Notably, ctDNA dynamics appeared particularly relevant in patients with progressive disease (PD) at first assessment, as patients with PD and a paradoxical ctDNA decrease derived a clinical benefit from tebentafusp compared to those receiving the standard treatment13.

In this work, we report a prospective evaluation of ctDNA using ddPCR, an alternative cost-effective approach, in MUM patients treated with tebentafusp, showing the prognostic value of ctDNA at baseline, and the association between early ctDNA reduction and OS.

Results

Clinical characteristics of the patients

Sixty-nine individuals diagnosed with MUM and eligible for tebentafusp treatment were enrolled in this study from January 2019 to December 2022. Median age was 59 years (range [51–66]) and 58% were women. Among the participants, 64% (44 patients) had not received any prior lines of therapy, and 86% (59 patients) presented solely with liver metastases at the initiation of tebentafusp therapy. Of 69 patients, 29 (42%) had a largest metastasis diameter greater than three centimeters (i.e. M1b/M1c according to the TNM classification)14. Thirty-one patients out of 68 (46%) had elevated baseline serum lactate dehydrogenase (LDH). Genomic profiling revealed monosomy 3 in 83% of the patients with available data (57/69). Patient characteristics are detailed in Table 1 and Supplementary Table 1. ctDNA positivity was associated with number of previous lines of therapy, baseline serum LDH levels, diameter of the largest metastasis and best tumor response to tebentafusp based on imaging. Interestingly, all patients with partial response according to modified RECIST criteria had undetectable ctDNA at baseline (Table 1).

Table 1 Clinical characteristics of the patients according to ctDNA positivity at baseline

Patients’ outcome

With a median follow-up of 20.0 months (interquartile range [IQR]: 15.2–23.2 months), disease progression was observed in 52 patients (75%) and death in 32 patients (46%). Metastatic UM was the cause of all deaths. The median PFS and OS achieved with tebentafusp were 2.8 months (95%CI [2.6–10.5]) and 21.8 months (95%CI [18.5-Not reached; NR]), respectively (Fig. 1A). The objective response rate (ORR) was 10% (n = 7/68), all being partial responses (PR); one patient was not evaluable because lesions were too small. Notably, 34% of patients (n = 23) achieved stable disease (SD) as best overall response, while 56% (n = 38) experienced progressive disease (PD).

Fig. 1: Progression-free survival (PFS) and overall survival (OS) in the whole cohort (A) and workflow (B).

Patients in the PFS curve are censored at the date of last news if not in progression, while patients in the OS curve are censored at the date of last news if alive. NR: not reached; UM: uveal melanoma; 3w: 3 weeks; 12w: 12 weeks. Source data are provided as a Source Data file.

Identification of trackable somatic mutations

Of the 69 patients, archival tumor tissue from either the primary tumor or resected liver metastases was available for 67 individuals (97%). For the remaining two patients with no tumor tissue available for analysis, plasma ccfDNA at baseline served as the source for identifying somatic mutations. Ultimately, 97% of the cohort (67 out of 69 patients) harbored at least one trackable somatic mutation, in GNAQ (n = 37), GNA11 (n = 29) or SF3B1 (n = 1). Subsequently, these patients underwent assessment for ctDNA levels at baseline before tebentafusp, at 3 weeks (3w), 12 weeks (12w) and at progression, targeting the previously identified mutations.

Prognostic value of ctDNA detection at baseline, 3 weeks, 12 weeks

Among the 67 patients with a trackable mutation, plasma samples were available for 65 at baseline and 64 were evaluable (one patient had fewer than 300 amplifiable genomes with no ctDNA detected; Fig. 1B). ctDNA was detectable in 39 out of 64 patients (61%) with a median of 31 copies/mL of plasma (IQR = 9–557 copies/mL of plasma; Fig. 2A). Consistent with the influence of tumor burden on ctDNA levels15, ctDNA levels were positively correlated with serum LDH levels (Fig. 2B) and diameter of the largest metastasis (Fig. 2C). Notably, four patients with M1a disease (largest metastasis of less than three centimeters) paradoxically presented with high levels of ctDNA (>100 copies/mL of plasma). Three of these four patients presented with innumerable, small liver metastases, including one individual who also had countless bone metastases scattered throughout the spine (ctDNA: 9443 copies/mL of plasma), demonstrating that TNM does not perfectly reflect tumor burden. Conversely, the fourth patient (ctDNA: 112 copies/mL of plasma) exhibited only one visible liver metastasis, which was not extensively necrotic, indicating the potential presence of undetectable metastases elsewhere. In contrast, two patients with M1b disease were ctDNA negative at baseline. The remaining 23 ctDNA negative cases were M1a, as expected. All M1c cases were ctDNA positive. Interestingly, all patients with tumor response to tebentafusp had no ctDNA detected at baseline (7/25; 28% of ctDNA negative patients; Table 1).

Fig. 2: ctDNA detection before and on treatment.

A levels of ctDNA detection at baseline in patients with ctDNA detected (n = 39, median = 31.0, IQR [9.0-557.0]). Error bars are presented as median +/- IQR. B, C correlation between baseline ctDNA levels with serum LDH level (Pearson coefficient; (B); n = 63, exact p-value: 3 × 10⁻⁸) and with diameter of the largest metastasis according to TNM (Wilcoxon test; (C); M1a: n = 36; M1b: n = 22 and M1c: n = 6). Error bars are presented as median +/- IQR. M1a: median = 0.1, IQR [0.1-9.0]; M1b: median = 18.0, IQR [8.5-499.3]; M1c: median = 431.0, IQR [146.8-1570]. Two-sided statistical tests were used. Exact p-value for comparison between M1a and M1b is 9.6 × 10⁻⁵. D Sankey plot showing the dynamic changes of ctDNA detection in patients under tebentafusp. E swimmer plot showing the dynamic changes of ctDNA levels in each patient. Top panel E presents the patients with ctDNA detected at baseline while bottom panel presents patients without ctDNA detected at baseline or patients with no available ctDNA analysis at baseline. Symbols in black and white represent the samples with positive and negative ctDNA detection. Red, yellow and green bars represent the duration of treatment (with colors representing the best response in each patient) while gray bars represent the follow-up after tebentafusp discontinuation. ND: not detected; LDH: lactate dehydrogenase; MAF: mutant allele frequency; NA: not available; TNM: tumor node metastasis classification. Source data are provided as a Source Data file.

By three weeks, 31 out of 64 patients (48%) tested positive for ctDNA (Fig. 2D). Of the 61 patients with paired plasma samples at baseline and 3w, 22 (36%) showed no ctDNA detection at either time point. Among the 36 patients with baseline ctDNA presence, 24 (67%) displayed reduced ctDNA levels at 3 weeks, with eight achieving complete ctDNA clearance. At the 1st tumor assessment (12w), 21 out of 56 patients (38%) had detectable ctDNA (median timing for ctDNA sampling = 12.0 weeks [IQR: 11.9-12.6]). Among the 54 paired samples (baseline and 12w), 22 patients (41%) showed no detectable ctDNA at both time points, including 19 who had also no detectable ctDNA at 3w (Fig. 2E). For patients initially exhibiting ctDNA, 22/31 (71%) experienced decreased ctDNA levels at 12 weeks, including 11 achieving complete ctDNA clearance. Seventeen patients had available plasma samples at disease progression, with 14 testing positive for ctDNA (Fig. 2E).

Patients with detectable ctDNA before treatment displayed significantly shorter PFS (median 2.5 months versus 10.8 months; HR = 2.8, 95% CI [1.5-5.2], p = 0.001) and OS (median 12.9 months versus 40.5 months; HR = 10.1, 95% CI [3.0-33.8]; p < 0.001; Fig. 3A). The observed difference in OS between the two groups was not attributed to treatments administered after the interruption of tebentafusp, as the distribution of post-tebentafusp therapies, including PD1 inhibitors, was balanced across groups regardless of ctDNA detection at baseline (Supplementary Table 2). Furthermore, ctDNA detection at 3w and at 12w was also associated with poorer prognosis in terms of both PFS (p < 0.001 for both 3w and 12w) and OS (p < 0.001 for both; Supplementary Fig. 1). Finally, baseline LDH and tumor burden according to TNM were also associated with prognosis (Fig. 3B and C).

Fig. 3: Prognostic values of ctDNA, LDH and TNM classification.

Prognostic values of markers on PFS (left panels) and OS (right panels) at baseline. A Prognostic value of ctDNA detection. Exact p-value for OS: 7 × 10⁻⁶. B Prognostic value of LDH. Exact p-value for OS: 10⁻⁵. C Prognostic value of TNM. Exact p-value for OS: 7 × 10⁻⁴. N: normal range; H: higher than normal range; PFS: Progression Free Survival; OS: Overall Survival. Kaplan–Meier plots were used for survival analysis, with significance assessed using the two-sided log-rank test. Source data are provided as a Source Data file.

ctDNA dynamics as a pharmacodynamic marker

Among the 19 patients with no ctDNA detection at baseline, 3w and 12w, 11 (58%) had SD and eight (42%) had PD at the first tumor assessment. Fourteen of the 22 patients with ctDNA decrease between baseline and 12w had PD at first assessment (64%; Supplementary Fig. 2A). More precisely, among the 11 patients achieving complete ctDNA clearance at 12w, five (45%) had SD and six had PD (55%). A smaller proportion of patients had already cleared their ctDNA at three weeks (three (n = 3/8, 37.5%) with SD, five (n = 5/8, 62.5%) with PD; Supplementary Fig. 2B). Using less stringent thresholds to define ctDNA decrease at 12w (90% and 50% decrease) revealed a higher number of patients classified as PD despite a reduction in ctDNA (n = 7/13, 54% for 90% decrease and n = 9/16, 56% for 50% decrease). Notably, among the 22 patients with PD and ctDNA variation in paired samples, nine displayed a minimum of 50% reduction in ctDNA levels, with six achieving complete ctDNA clearance.
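
As a minimal sketch of how the clearance, 90% and 50% thresholds described above could be applied to paired baseline and 12w measurements, the following Python snippet computes the relative reduction in ctDNA (copies/mL of plasma). The function names, the inclusive cut-offs and the encoding of an undetected sample as 0 are illustrative assumptions, not the analysis code used in this study.

```python
def ctdna_reduction(baseline: float, followup: float) -> float:
    """Relative ctDNA reduction between baseline and follow-up.

    Both values are ctDNA levels in copies/mL of plasma. An undetected
    follow-up sample is encoded as 0 (assumption). Returns 1.0 for complete
    clearance and a negative value if ctDNA increased.
    """
    if baseline <= 0:
        raise ValueError("reduction is only defined for ctDNA-positive baselines")
    return (baseline - followup) / baseline


def summarize_dynamics(baseline: float, followup: float) -> dict:
    """Flag each threshold discussed in the text for one patient."""
    reduction = ctdna_reduction(baseline, followup)
    return {
        "any_decrease": reduction > 0,
        "decrease_50": reduction >= 0.5,   # inclusive cut-off is an assumption
        "decrease_90": reduction >= 0.9,   # inclusive cut-off is an assumption
        "clearance": followup == 0,        # no detectable ctDNA at follow-up
    }


if __name__ == "__main__":
    # Hypothetical patient: 31 copies/mL at baseline, 2 copies/mL at 12w.
    print(summarize_dynamics(31.0, 2.0))
```

Because the thresholds are nested, a patient with clearance also satisfies the 90% and 50% definitions, which is consistent with the increasing group sizes reported for the less stringent thresholds.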

Regarding survival outcomes, ctDNA-positive patients at baseline who experienced complete clearance at 12 weeks (n = 11) had a significantly longer median OS (12.7 months versus 34.6 months; HR = 7.1, 95%CI [1.6-31.4], p = 0.003) and PFS (2.6 months versus 17.1 months; HR = 2.9, 95%CI [1.1-8.0], p = 0.03) compared to those without clearance (n = 21; Fig. 4A and Supplementary Fig. 3A). Similar trends were observed with different thresholds. Patients achieving a 90% ctDNA decrease at 12w (n = 13) versus those who did not (n = 19) had a longer median OS (12.9 months versus 21.2 months; HR = 3.4, 95%CI [1.1-10.4], p = 0.02) and longer median PFS (2.6 months versus 10.6 months; HR = 2.6, 95%CI [1.1-6.5], p = 0.03). When a 50% threshold at 12w was applied, a numerically longer OS (median 12.9 months for the 16 patients with 50% decrease versus 21.2 months for the 16 others, p = 0.07) and PFS (2.7 months versus 6.7 months; p = 0.09) were observed though not statistically significant. No significant result was observed between baseline and 3w (Fig. 4B and Supplementary Fig. 3B). Importantly, among patients with PD, clearance of ctDNA at 12w (n = 6) demonstrated a longer OS compared to patients without ctDNA clearance (n = 16; median OS of 12.5 versus 21.2 months; HR = 6.3, 95%CI [1.3-27.8], p = 0.009), which was comparable to those who tested negative for ctDNA at baseline (Fig. 4C). It is worth noting that only a limited number of patients was available for this analysis.

Fig. 4: Predictive value of ctDNA variations on tebentafusp.

According to variation at 12 weeks (A) 3 weeks (B) and 12 weeks in patients with progressive disease at first assessment (C). OS: overall survival. Clearance of ctDNA is defined as the absence of detectable ctDNA at 3 weeks or 12 weeks after treatment, following an initial detection of ctDNA at baseline. Kaplan–Meier plots were used for survival analysis, with significance assessed using the two-sided log-rank test. Source data are provided as a Source Data file.

Univariate and multivariate analyses

Univariate and multivariate proportional hazards models were fitted in the cases with matched samples at baseline and 12w (n = 54) to assess the prognostic significance of clinicopathological factors, in combination with ctDNA detection at different time points and ctDNA variation. As shown in Supplementary Table 3, ctDNA detection at baseline, 3w or 12w, and ctDNA clearance between baseline and 12w were significantly associated with PFS and OS in univariate analyses, while prior number of lines of therapy, elevated baseline serum LDH and diameter of the largest metastasis according to TNM were only associated with OS.

In multivariate analyses, two models were constructed for PFS and OS. In Model 1 (which integrates a “detection of ctDNA at baseline” variable), baseline ctDNA positivity emerged as an independent prognostic factor for OS and PFS (OS: HR = 9.3, 95%CI [1.9–45.2], p = 0.006; PFS: HR = 2.5, 95%CI [1.1-5.9], p = 0.04), while elevated LDH level was only associated with OS (HR = 4.4, 95%CI [1.5-13.4], p = 0.008), as detailed in Table 2. In Model 2, patients with ctDNA positivity at baseline who either cleared ctDNA by 12 weeks or did not were compared against patients who were ctDNA-negative both at baseline and at 12 weeks (reference group). This comparison aimed to evaluate whether achieving ctDNA clearance by 12 weeks translated to similar PFS and OS outcomes as being ctDNA-negative from the start, and how these outcomes compared to those of patients with persistent ctDNA positivity. No statistically significant difference in PFS or OS was found between patients with ctDNA clearance at 12 weeks and the reference group (HR = 1.7, 95% CI [0.6-4.8], p = 0.3 and HR = 6.8, 95% CI [0.7-69.2], p = 0.1, respectively). In contrast, patients with persistent ctDNA positivity at 12 weeks had significantly worse PFS and OS compared to the reference group (HR = 5.8, 95% CI [2.1-16.5], p = 0.0009 and HR = 49.6, 95% CI [4.5-544.5], p = 0.001, respectively).

Table 2 Multivariate analysis for PFS and OS

Additionally, when the reference group in Model 2 was changed so that patients who were ctDNA-positive at baseline were compared with each other, those who did not achieve ctDNA clearance at 12 weeks had significantly worse PFS and OS than those who achieved ctDNA clearance (HR = 3.5, 95%CI [1.3-9.1], p = 0.01 and HR = 7.3, 95%CI [1.4-37.1], p = 0.02, respectively).

Discussion

Tebentafusp was the first systemic treatment to demonstrate a meaningful OS benefit in metastatic UM patients. With an ORR of 10% and a median OS of 22 months, our study’s outcomes closely mirror those reported in the IMCgp100-202 phase 3 trial3, which demonstrated an ORR of 11% and an OS of 22 months. These consistent findings across more diverse patient populations and different treatment lines underscore the drug’s relevance and effectiveness in clinical practice.

Our main goal was the evaluation of ctDNA detection using ddPCR in MUM patients treated with tebentafusp. In this series, 61% of MUM patients exhibited detectable ctDNA in plasma prior to tebentafusp treatment initiation. Notably, this rate of detectable ctDNA was the same as in the IMCgp100-202 trial, in which ctDNA was assessed with a dedicated multiplex PCR-based NGS panel. However, an essential distinction lies in the patient selection criteria between our series and the clinical trial setting. In our study, we treated patients with different tumor burdens, often including those with the lowest tumor burdens in our daily practice, in contrast to the trial’s inclusion criteria requiring RECIST-measurable lesions. Nevertheless, the tumor burden in our series resembled that of the IMCgp100-202 trial, with 42% of patients showing a largest metastatic lesion exceeding 3 cm (compared to 42% in the trial) and 46% exhibiting LDH levels surpassing the normal range (compared to 30% of patients evaluable for ctDNA in IMCgp100-202). These parallels suggest that ddPCR has a sensitivity comparable to that of the multiplex PCR-based NGS panel used in the trial, which was specifically designed to detect seven hotspot mutations at very low mutant allelic frequency (MAF). It is noteworthy that the sensitivity of NGS panels is inherently tied to the size of the captured or amplified regions, meaning that larger panels, including whole-exome, may offer reduced sensitivity in detecting ctDNA.

Similar to high LDH levels, detection of ctDNA at baseline was associated with a poor median OS, emphasizing its potential as a prognostic marker. ctDNA levels were positively associated with tumor burden as appraised by TNM in most cases. However, discrepancies were observed in some cases (i.e. high levels with limited tumor burden or low levels with bulky lesions), suggesting that the ctDNA assay may capture biological factors beyond tumor burden, such as proliferation and necrosis. An important observation in this work is that ctDNA detection was not only prognostic but also appeared to be predictive of the response to tebentafusp, as all responders tested negative for ctDNA at baseline. More precisely, 28% of patients with no ctDNA detection at baseline presented a tumor response. Moreover, our study corroborates existing evidence linking ctDNA reduction during tebentafusp treatment with improved clinical outcomes. ctDNA reduction on other immune therapies such as checkpoint blockade has already been associated with better clinical outcomes in multiple studies16,17,18. Similarly, ctDNA clearance at nine weeks of tebentafusp in IMCgp100-202 using multiplex PCR-based NGS was associated with better OS (median OS from week nine, 29.6 months versus 10.2 months)3. In our study, a substantial reduction by 90% in ctDNA levels at 12 weeks also showed a significant association with increased OS. A lower threshold (i.e. 50%) or an earlier time point for ctDNA clearance (i.e. 3 weeks) was associated with lower predictive value in our series as well as in the phase 3 trial, highlighting the importance of standardizing not only monitoring time points but also thresholds for accurate comparisons. Further studies with more frequent longitudinal blood sampling during the first cycles might be of interest to clarify the best time points for ctDNA sampling.

Notably, while traditional radiographic response criteria, such as RECIST, may underestimate the benefit from tebentafusp3,5, our observations support the use of ctDNA clearance at 12 weeks as a predictive biomarker, particularly in patients with progressive disease. Indeed, clearance of ctDNA at 12 weeks was associated with a longer OS in patients with PD, comparable to that of patients who tested negative for ctDNA at baseline. However, this needs to be confirmed in independent series.

In our study, ddPCR compared favorably with targeted NGS in detecting ctDNA. Indeed, with the development of multiplex PCR-based targeted NGS, detecting ctDNA in plasma at very low frequency is now achievable, as shown in the IMCgp100 trials. Furthermore, this method has the advantage of being mutation-agnostic, which can be crucial in tumor types with heterogeneous drivers. However, UM is characterized by specific and frequent hotspot mutations, notably in GNAQ and GNA11, which are mutated in 85% of UM cases. Therefore, UM is a particularly favorable tumor type for ctDNA detection via ddPCR. Our study highlights the feasibility of a two-step pipeline: first identifying somatic mutations through targeted tumor sequencing, then performing highly sensitive ctDNA detection using ddPCR. Compared to multiplex PCR-based NGS, this procedure is easy to implement and cost-effective, with a short turnaround time, making it suitable for clinical applications in most hospitals. Importantly, further development of multiplex ddPCR assays encompassing the main UM hotspot mutations should allow a mutation-agnostic approach in the near future. We thus believe that ddPCR should be considered the method of choice when implementing routine ctDNA monitoring for MUM patients.

Intriguingly, although the limited number of patients and high proportion of monosomy 3 in our cohort prevent any definitive conclusion, this genomic alteration was not associated with prognosis in this cohort of metastatic patients. While monosomy 3 in primary tumors is consistently associated with a shorter time to first metastasis and subsequent shorter OS1, its impact on survival in the metastatic setting is less clear. Moreover, given that monosomy 3 UM has been linked to higher immune infiltrate1, it raises the possibility that specific immunological features associated with monosomy 3 might sensitize these tumors to tebentafusp. In fact, six out of the seven patients with partial response had monosomy 3, while the genomic profile was unavailable for the last patient. Again, the fact that 83% of the whole cohort was monosomy 3 prevents any conclusion. This underscores the importance of further investigations into the interplay between biological markers and treatment response in this population.

In conclusion, our data reinforce the role of tebentafusp in patients with MUM and advocate for the use of ddPCR-based ctDNA monitoring as a practical, cost-effective approach directly applicable in routine practice. These findings offer a potential avenue for treatment de-escalation (limited course of tebentafusp instead of pursuing it until progression) or intensification (combination studies) based on ctDNA dynamics, paving the way for personalized therapeutic strategies in the management of UM.

Methods

Patients

The study was performed in accordance with the ethical principles of the Declaration of Helsinki, the International Conference on Harmonization/Good Clinical Practice guidelines, and the Public Health Code in France. All patients enrolled in the ALCINA study (NCT02866149) provided written informed consent. The ALCINA study received a favorable opinion from the French national ethics committee “Comité de Protection des Personnes Ile-De-France VI”. All patients included were HLA-A*02:01-positive and were treated with tebentafusp. The treatment regimen comprised intravenous administration of tebentafusp at an initial dose of 20 μg in week 1, followed by 30 μg in week 2, and a subsequent weekly dosage of 68 μg. Tumor response was assessed by UM expert radiologists using mandatory liver MRI at 12-week intervals, interpreted according to modified RECIST v1.1 criteria (target lesions ≥5 millimeters), and thoraco-abdomino-pelvic CT scan, interpreted according to RECIST v1.1 criteria4; responses were categorized as complete response (CR), partial response (PR), stable disease (SD), or progressive disease (PD). Tebentafusp could be pursued beyond radiographic progression in asymptomatic patients with non-threatening PD at first assessment (12w) according to irRECIST19, with an early reassessment four to eight weeks later, as done in the IMCgp100-202 study. LDH measurement was conducted as part of routine clinical practice in external laboratories, and as such, the specific method employed varied between facilities. Chromosome 3 assessment was also conducted as part of routine clinical practice. Its status was retrieved from testing on metastases whenever possible, or by default on primary tumors when available.

Blood sampling and plasma sample preparation, storage, DNA extraction and quantification

Blood samples were obtained longitudinally at specific time points: prior to treatment initiation, at cycle 2 (after 3 weeks of treatment; 3w), at the first tumor assessment (cycle 5, after 12 weeks of treatment; 12w), and at the time of disease progression. EDTA tubes were used for blood collection. Following blood collection, EDTA tubes were promptly transported at room temperature to the local laboratory for plasma isolation. Within 2 hours, plasma was isolated through a two-step centrifugation process (820 g for 10 minutes, followed by 16,000 g for 10 minutes at 4 °C) and then stored at −80 °C until use. DNA extraction from plasma samples was performed using the QIAamp Circulating Nucleic Acid Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. The quantification of extracted DNA was conducted using the dsDNA HS Assay on a Qubit 2.0 fluorometer (Invitrogen, Carlsbad, CA, USA). Extracted DNA samples were stored at −20 °C until further analysis.

Mutation detection with the use of ddPCR assays

Sequencing was conducted using an in-house NGS panel targeting GNAQ, GNA11 and SF3B1 on archived tumor DNA (primary tumor or metastases), except for two cases with no tumor DNA available, for whom plasma ccfDNA at baseline was used. Subsequently, the identified mutations in GNAQ, GNA11, or SF3B1 were used for ctDNA monitoring in plasma by droplet digital PCR (ddPCR). BAP1 mutations were not selected as they are distributed throughout the gene without a specific hotspot, making them suboptimal for ctDNA follow-up using ddPCR. First, to validate our assays, DNA from cell lines including OMM2.3 (GNAQ p.Q209P c.626A>C), MP46 (GNAQ p.Q209L c.626A>T), OMM1 (GNA11 p.Q209L c.626A>T)20,21,22, and tumor tissues carrying mutations such as GNAQ p.Q209R c.626A>G, GNAQ p.R183Q c.548G>A, GNA11 p.R183C c.547C>T, SF3B1 p.R625C c.1873C>T, GNAQ p.Q209L c.625_626delCAinsTT, and GNAQ p.Q209H c.627A>C were used as positive controls. Genomic DNA from healthy donors procured from Promega (Madison, WI, USA) served as negative controls. Assays for detecting GNAQ p.Q209L c.626A>T, GNA11 p.Q209L c.626A>T, GNAQ p.Q209R c.626A>G, GNAQ p.R183Q c.548G>A, and GNA11 p.R183C c.547C>T were procured from Bio-Rad Laboratories (Hercules, CA, USA). For the detection of GNAQ p.Q209P c.626A>C, GNAQ p.Q209L c.625_626delCAinsTT, GNAQ p.Q209H c.627A>C and SF3B1 p.R625C c.1873C>T mutations, specific primers and probes are detailed in Supplementary Table 4 and Supplementary Table 5.

The ddPCR assays were performed using the Bio-Rad QX100 system following standardized protocols. The reactions used specific primer and probe concentrations, with one probe targeting the wild-type sequence and the other targeting the mutant sequence, each labeled with a different fluorophore. Amplification conditions included an initial cycle at 95 °C for 10 minutes, followed by 40 cycles at 94 °C for 30 seconds and hybridization at mutation-specific temperatures for 60 seconds, concluding with a cycle at 98 °C for 10 minutes. Cluster thresholding and quantification were executed using QuantaSoft v.1.7.4 software. Droplets were manually classified as wild-type (WT) or mutant (MUT) based on their fluorescence amplitude: WT (VIC+ or HEX+) and MUT (FAM+). The mutant allelic frequencies (MAFs) were calculated as the copy numbers of MUT divided by the sum of the copy numbers of WT and MUT, this sum giving the total copy number of ccfDNA in each patient. For patients lacking somatic mutation status, custom ddPCR assays were employed to detect somatic mutations in either tumor DNA or plasma ccfDNA at baseline. All experiments adhered to the minimum requirements for digital PCR data23,24.
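
The MAF definition above can be illustrated with a short Python sketch. The `maf` function follows the formula given in the text (MUT copies divided by WT plus MUT copies). The `copies_per_droplet` helper shows the Poisson correction generally used in digital PCR to convert a fraction of positive droplets into a mean number of copies per droplet; it is included as background on the technique and is an assumption here, since quantification in this study was performed by the QuantaSoft software. All function names are hypothetical.

```python
import math


def maf(mut_copies: float, wt_copies: float) -> float:
    """Mutant allelic frequency: MUT / (WT + MUT), as defined in the text."""
    total = mut_copies + wt_copies  # total ccfDNA copy number for the locus
    if total <= 0:
        raise ValueError("no amplifiable copies")
    return mut_copies / total


def copies_per_droplet(positive_droplets: int, total_droplets: int) -> float:
    """Poisson-corrected mean copies per droplet (general dPCR principle).

    lambda = -ln(1 - p), where p is the fraction of positive droplets.
    Undefined when every droplet is positive (saturated reaction).
    """
    if not 0 <= positive_droplets < total_droplets:
        raise ValueError("need 0 <= positive_droplets < total_droplets")
    return -math.log(1 - positive_droplets / total_droplets)


if __name__ == "__main__":
    # Hypothetical well: 12 mutant copies and 2388 wild-type copies.
    print(f"MAF = {100 * maf(12, 2388):.2f}%")
```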

ddPCR data analysis

The false-positive rate for each assay was estimated using 19 replicates of wild-type (WT) DNA, as previously described25,26. The limit of blank (LOB), defined as the upper 95% confidence limit of mean false-positive measurements, was determined as follows: 0.012% for GNAQ p.Q209P c.626A>C, 0.006% for GNAQ p.Q209L c.626A>T, 0.005% for GNAQ p.Q209R c.626A>G, 0.008% for GNAQ p.R183Q c.548G>A, 0.018% for GNA11 p.R183C c.547C>T, 0.00% for GNA11 p.Q209L c.626A>T, 0.02% for GNAQ p.Q209L c.625_626delCAinsTT and GNAQ p.Q209H c.627A>C, and 0.04% for SF3B1 p.R625C c.1873C>T. Analytical sensitivity was determined using serial dilutions of mutation-carrying DNA in WT DNA (total input DNA: 15 ng), with MAFs ranging from 0.50 to 0.01%. The limit of detection (LOD) was defined as the lowest mutant ratio reliably distinguishable from the LOB value. Samples were considered positive when the MAF exceeded the LOD, with specific positive droplet counts required per mutation type: more than 3 positive droplets for SF3B1 p.R625C c.1873C>T, 2 positive droplets for GNAQ p.Q209L c.625_626delCAinsTT, p.Q209H c.627A>C, and GNA11 p.R183C c.547C>T, or 1 positive droplet for GNAQ p.Q209P c.626A>C, p.R183Q c.548G>A, p.Q209L c.626A>T, p.Q209R c.626A>G, and GNA11 p.Q209L c.626A>T per analysis. Samples with fewer than 300 amplifiable genomes and no ctDNA detected by ddPCR were excluded from further ctDNA analysis.
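
The positivity rules above can be sketched as a small decision function. This is an illustrative Python example under stated assumptions: the droplet thresholds are read as "more than N positive droplets" for every assay, the LOD is passed in as a parameter because per-assay LOD values are not listed in the text, and the function and dictionary names are hypothetical.

```python
from typing import Optional

# Positive-droplet thresholds per assay, read as "more than N" (assumption).
MIN_DROPLETS_EXCLUSIVE = {
    "SF3B1 p.R625C c.1873C>T": 3,
    "GNAQ p.Q209L c.625_626delCAinsTT": 2,
    "GNAQ p.Q209H c.627A>C": 2,
    "GNA11 p.R183C c.547C>T": 2,
    "GNAQ p.Q209P c.626A>C": 1,
    "GNAQ p.R183Q c.548G>A": 1,
    "GNAQ p.Q209L c.626A>T": 1,
    "GNAQ p.Q209R c.626A>G": 1,
    "GNA11 p.Q209L c.626A>T": 1,
}

MIN_AMPLIFIABLE_GENOMES = 300  # evaluability cut-off stated in the text


def call_ctdna(assay: str, maf_percent: float, positive_droplets: int,
               amplifiable_genomes: float, lod_percent: float) -> Optional[bool]:
    """Return True (ctDNA detected), False (not detected) or None (not evaluable).

    lod_percent is the assay-specific limit of detection, supplied by the caller.
    """
    detected = (maf_percent > lod_percent
                and positive_droplets > MIN_DROPLETS_EXCLUSIVE[assay])
    if not detected and amplifiable_genomes < MIN_AMPLIFIABLE_GENOMES:
        # Too little input DNA to trust a negative result: exclude the sample.
        return None
    return detected


if __name__ == "__main__":
    # Hypothetical sample with an illustrative LOD of 0.05%.
    print(call_ctdna("GNAQ p.Q209L c.626A>T", 0.40, 9, 2500, lod_percent=0.05))
```

Returning `None` rather than `False` for low-input negative samples keeps excluded samples distinct from true negatives, mirroring the exclusion rule stated in the text.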

Statistical analyses

Categorical variables were compared using Pearson’s Chi-squared test or Fisher’s exact test. For continuous variables, Mann-Whitney test and Wilcoxon rank sum tests were employed. Pearson coefficient was used for assessing correlations. Kaplan–Meier plots were used for survival analysis, with significance assessed using the log-rank test. PFS was defined as the time from tebentafusp initiation to disease progression or death. Patients without disease progression were censored at their last follow-up visit. OS was defined as the time from tebentafusp initiation to death from any cause. Cox proportional hazards modeling, encompassing univariate and multivariate analyses, evaluated the prognostic significance of different predictors. Model 1 assessed variables with a p-value < 0.1 in univariate analysis, including ctDNA detection at baseline and other relevant pathological characteristics. Model 2 replaced ctDNA detection at baseline with ctDNA clearance at 12 weeks as an independent variable. Statistical analyses were conducted using GraphPad Prism (version 8.0) or R software (version 4.1.1), with two-sided p-values ≤ 0.05 indicating statistical significance. No formal adjustment for multiplicity was performed.
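
For readers unfamiliar with the Kaplan–Meier method mentioned above, the following is a minimal pure-Python sketch of the product-limit estimator with right-censoring. It is illustrative only: the analyses in this study were run in GraphPad Prism or R, the function names are hypothetical, and the example data are invented.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times:  follow-up time for each patient (e.g. months).
    events: 1 if the event (death or progression) occurred, 0 if censored.
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored patients leave the risk set
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


if __name__ == "__main__":
    # Invented follow-up times in months; 0 marks a censored patient.
    curve = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 1, 0, 1])
    print(curve, median_survival(curve))
```

A `None` median corresponds to the "not reached" (NR) notation used when the survival curve stays above 50% during follow-up.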

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.