Abstract
The Quality of Life (QoL) Utility measure, QLU-C10D, is derived from the European Organisation for Research and Treatment of Cancer (EORTC) QoL Questionnaire, QLQ-C30. Given its cancer-specific nature, the QLU-C10D is expected to be sensitive and responsive in lung cancer patients. This retrospective analysis used data from four international multi-center lung cancer trials (NCT00656136, NCT00949650, NCT01085136, NCT01523587). Clinical validity was assessed in comparison to a generic standard utility instrument, the EuroQol Group's EQ-5D-3L. Utilities of six country value sets (Australia, Canada, Italy, the Netherlands, Poland, UK) were calculated at baseline and end of treatment for both measures. Country value set pairs of both measures (k) were compared in terms of Relative Efficiency (RE) and difference in Effect Sizes (dES) with regard to 1) sensitivity to detect differences between performance status groups and 2) responsiveness to changes in each trial sample. Analysis of the four trials (N1 = 496, N2 = 290, N3 = 202, N4 = 770) with the six country value sets of each utility measure showed that 1) sensitivity indices favored the QLU-C10D (k = 18, p ≤ 0.019; RE > 1.10; dES > 0.03), and 2) responsiveness indices of changes within clinically known groups (k = 78) largely favored the QLU-C10D (k = 74, p ≤ 0.024; RE > 1.01; dES > 0.02) in comparison with the generic utility instrument. Overall, 96% of the comparative indices favored the QLU-C10D. In summary, this study confirms the clinical validity of the QLU-C10D in lung cancer patients. The QLU-C10D produced homogeneous results across six country value sets and detected differences and changes in alignment with clinical expectations. In most comparisons the QLU-C10D was more sensitive or responsive than the EQ-5D-3L.
Introduction
Globally, lung cancer is the second most common type of cancer1 and the most common cause of cancer-related deaths2. Late disease detection and lesions in advanced disease stages contribute to the low 5-year survival rate of less than 20% across all disease stages3. Given the limited curative prospects, assessing lung cancer patients’ health-related quality of life (HRQoL) is as important for evaluating treatment options as clinical outcome measures such as overall survival or toxicity profiles4,5. HRQoL measures are therefore a valuable source of information in clinical practice as well as in clinical trials and health technology assessments6,7.
Usually, self-report questionnaires are used to assess patients’ HRQoL on different dimensions with several severity levels. HRQoL utility measures provide a scoring algorithm that converts the individual responses into a single utility value for the described health state, on a scale anchored at 0 (representing being dead) and 1 (full health)8. Value sets for health states are usually developed separately for each country with samples of the general population, so that they represent the health preferences of these specific populations, for example to inform cost-utility analyses9,10,11.
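The conversion of questionnaire responses into a single utility value can be illustrated with a minimal sketch. It assumes a simple additive model in which each dimension level carries a utility decrement that is subtracted from full health (1.0). The dimension names, the decrement values, and the function name `utility` are hypothetical and chosen for illustration only; they are not taken from any published value set.

```python
from typing import Dict, List

# Hypothetical decrements, NOT a published value set. For each dimension,
# one decrement per severity level (level 1 = no problems = 0.0).
HYPOTHETICAL_DECREMENTS: Dict[str, List[float]] = {
    "physical_functioning": [0.0, 0.05, 0.10, 0.20],
    "pain": [0.0, 0.03, 0.08, 0.15],
    "fatigue": [0.0, 0.02, 0.05, 0.09],
}


def utility(responses: Dict[str, int],
            decrements: Dict[str, List[float]]) -> float:
    """Return 1.0 minus the summed decrements of the reported levels.

    `responses` maps each dimension to its severity level (1-based).
    Assumes an additive model; real value sets may include further terms.
    """
    total_decrement = 0.0
    for dimension, level in responses.items():
        total_decrement += decrements[dimension][level - 1]
    return 1.0 - total_decrement


# A respondent reporting no problems on any dimension is in full health.
full_health = utility(
    {"physical_functioning": 1, "pain": 1, "fatigue": 1},
    HYPOTHETICAL_DECREMENTS,
)
```

Because decrements can sum to more than 1, such a model can in principle yield utilities below 0, i.e. health states valued as worse than being dead.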
An ongoing topic of discussion is which HRQoL utility measures reflect patients’ health-related experience adequately12,13. Two types of HRQoL utility measures, generic and condition-specific, stand in theoretical contrast. Generic measures cover HRQoL dimensions relevant to most health conditions and therefore make outcomes comparable across different diseases14,15. Among these, the EQ-5D-3L16,17 is the most widely used and is well validated for a range of disease contexts14. It is also the standard utility measure of health-care jurisdictions in some countries18. The EQ-5D-3L, assessing five dimensions of generic functioning and symptoms with three severity levels, can describe 3^5 = 243 unique health states.
Nevertheless, there are concerns that the EQ-5D-3L might not capture all important aspects of HRQoL in cancer patients19,20. Clinical validity is a crucial characteristic of HRQoL utility measures. Here we focus on the ability of a utility measure to detect differences between clinical groups (sensitivity) and responsiveness to change21,22.
Addressing such concerns, the Multi-Attribute Utility in Cancer Consortium developed the Quality of Life Utility-Core 10 Dimensions (QLU-C10D)23, a cancer-specific HRQoL utility measure. The QLU-C10D is a scoring algorithm for the most widely used HRQoL measure in oncology research24, the EORTC QLQ-C3025. In addition to the scoring algorithm, various value sets for its use in several countries have been developed (e.g. 9,10,11). The QLU-C10D captures ten HRQoL dimensions, including common cancer symptoms such as fatigue and nausea, each on four severity levels. Hence, the QLU-C10D can describe 4^10 = 1,048,576 health states. It can therefore be hypothesized that this cancer-specific utility measure could improve the sensitivity and responsiveness of measurements. Early evidence supporting this hypothesis was reported for patients with myelodysplastic syndrome26, esophageal cancer patients27, a group of mixed cancer entities treated with nivolumab28, and four cancer trials in the Netherlands29; so far, the sensitivity and responsiveness of the QLU-C10D in patients with lung cancer have not been assessed30.
Aim of this study
The main objective of this retrospective study was to evaluate the clinical validity of the QLU-C10D. We expected that the cancer-specific QLU-C10D would discriminate between clinically known groups (sensitivity) and reflect changes in patients’ health states (responsiveness), two psychometric properties which have not yet been assessed in a multinational set of lung cancer patients. Further, we expected differences or changes to be more pronounced with the cancer-specific measure than with the generic comparator instrument, the EQ-5D-3L. For this analysis we used six corresponding country-specific value sets of the QLU-C10D and the EQ-5D-3L: Australia (AUS)31,43, Canada (CAN)32,44, Italy (ITA)33,45, the Netherlands (NLD)34,46, Poland (POL)33,47, and the United Kingdom (UK)35,48.
Methods
Study population, data selection
This retrospective analysis relies on data from four international multi-center LUX-Lung (LL) trials (LL1: NCT00656136, LL3: NCT00949650, LL5: NCT01085136, LL8: NCT01523587). Participants were non-small-cell lung cancer (NSCLC) patients with activating EGFR mutations (LL136,37, LL338,39, LL540), or squamous cell carcinoma (LL841). The studies investigated the efficacy of Afatinib in comparison to placebo (LL1), chemotherapy (LL3, LL5), and Erlotinib (LL8). All LL trials collected HRQoL data using both the EORTC QLQ-C30 and the EQ-5D-3L. The longitudinal analysis covers the time between baseline and end of treatment (EOT), as data on known-group variables, e.g. adverse events, were available at these timepoints. In addition, a pooled dataset of progressive patients of LL1 and LL8, and a subset of these patients treated with Afatinib, were analyzed. The timeframe of this additional analysis spans from baseline to date of progression ± four weeks. These dates were available only for patients of LL1 and LL8.
Data were included in the analyses if both HRQoL measures were completed at baseline and EOT. A flow chart of data selection is presented in Fig. 1. We compared the original study sample characteristics with the included data to check for potential selection bias, considering a misalignment of more than 10% between the groups a signal of bias42 (see Fig. 2).
Sample Characteristics at baseline & results of bias analysis. † Patients with squamous cell carcinoma of the lung at LL8; all other trials non-small-cell lung cancer patients; * (A) (LL1, LL3, LL8) = Afatinib; LL5 = Afatinib + Paclitaxel; (B) LL1 = Placebo, LL3 = Pemetrexed + Cisplatin, LL5 = Investigator’s Choice Chemotherapy, LL8 = Erlotinib; ** progressive patients with dates of progression only, N = 76 from LL1, N = 78 from LL8; patients treated with Afatinib is a subgroup of the pooled data sample; ‡ Original sample characteristics as reported by the study authors at baseline are presented in the rows below. Differences between original sample and selected sample below a margin of 10% were considered negligible. Abbreviations: ECOG Eastern Cooperative Oncology Group performance status scale, LL# Lux-Lung study number, AUS Australia, CAN Canada, ITA Italy, NLD the Netherlands, POL Poland, UK the United Kingdom.
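The 10% margin used in the bias check can be sketched as follows. This is a minimal illustration that assumes the margin refers to an absolute difference in percentage points between the original and the included sample; the text does not state whether an absolute or relative difference was used. The function name `flag_misalignment` and all numbers are hypothetical and are not study data.

```python
from typing import Dict


def flag_misalignment(original: Dict[str, float],
                      included: Dict[str, float],
                      margin: float = 10.0) -> Dict[str, bool]:
    """Flag each characteristic whose percentage differs by more than `margin`.

    Both inputs map a characteristic name to a percentage (0-100).
    Assumption: the margin is an absolute difference in percentage points.
    """
    return {
        name: abs(original[name] - included[name]) > margin
        for name in original
    }


# Hypothetical percentages for illustration only (not study data).
original_sample = {"female": 60.0, "ecog_ps_0": 30.0}
included_sample = {"female": 63.0, "ecog_ps_0": 18.0}

flags = flag_misalignment(original_sample, included_sample)
```

In this made-up example only `ecog_ps_0` exceeds the margin (12 points), so it alone would be treated as a signal of potential selection bias.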
Measures
EORTC QLQ-C30
The EORTC QLQ-C3025 consists of 30 questions that form 15 scales. Five of them are functioning scales (e.g. physical, cognitive); nine are symptom scales (e.g. pain, appetite loss). One scale covers global health status. Responses are provided on a four-point Likert scale (“not at all”, “a little”, “quite a bit”, “very much”) for all questions with the exception of the global health status items, which have seven response levels.
EORTC QLU-C10D
The QLU-C10D23 is a preference-based scoring algorithm for the QLQ-C30 and enables health utilities to be derived from QLQ-C30 data. It uses 13 of the 30 items, which form ten dimensions (physical functioning, role functioning, social functioning, emotional functioning, pain, fatigue, sleep, appetite, nausea, and bowel problems).
EQ-5D-3L
The EQ-5D-3L, developed by the EuroQol Group16,17, captures five dimensions – mobility, self-care, usual activities, pain/discomfort, anxiety/depression – with three response options each (“no problems”, “some problems”, “severe problems/unable to”) and is a validated instrument for a range of diseases, including cancer.
Statistical analysis
Descriptive statistics are presented as absolute numbers (N/n = total sample/group size, k = number of tests), central tendency (mean and/or median), standard deviation (SD), frequencies and percentages (%). Each data set of the four trials was analyzed separately. Additionally, data of progressive patients of LL1 and LL8 were pooled for longitudinal analysis from baseline to progression date (± four weeks). Data analysis was performed on an R-based online platform with temporary access to the original study data provided by the data owner.
The outcome measure, country-specific utilities of both measures, was calculated according to previously published algorithms (AUS31,43, CAN32,44, ITA33,45, NLD34,46, POL33,47, UK35,48). Code for utility calculation is available from the respective publications or from the EORTC Quality of Life Group upon request (https://qol.eortc.org/eortc-qlu-c10d/).
Clinically known groups, i.e. patient groups that were expected to differ with regard to HRQoL49, were defined as follows. For the cross-sectional sensitivity analysis at baseline, Eastern Cooperative Oncology Group (ECOG) Performance Status (PS) groups were compared: PS 0 (fully active) vs. PS > 0 (restricted abilities). ECOG PS was available for LL1, LL3 and LL8. Responsiveness to change was analyzed with known groups for 1) treatment arms, 2) number of Adverse Events (AE) encountered (≤ 3 vs. > 3), recorded in LL3, LL5 and LL8, and 3) response groups (stable vs. progressive disease), recorded for LL1 patients. We expected patients with PS > 0, AE > 3, and progressive disease to show lower utility scores than the comparator groups. In line with the original study reports, we did not expect significant differences between treatment groups. However, we expected less decline in utilities within the treatment groups than in the comparator groups in LL339, LL540, and LL841. Conversely, for LL137 patients we expected more decline in utilities within the Afatinib group than in the placebo group, in accordance with the primary study report. In addition, change from baseline until disease progression was analyzed for the pooled data of progressive patients and for the subset treated with Afatinib. We expected utilities at ± 4 weeks from progression date to be lower than at baseline.
Sensitivity (ability to detect differences between groups) and responsiveness (ability to detect change) of the QLU-C10D were assessed with independent and paired t-tests, applying an α-level of 5% and Bonferroni-Holm correction to adjust for multiple testing50,51. Each sample was analyzed separately, with six value sets of each utility measure. For example, the sensitivity analysis with ECOG PS comprised data from three studies, with six country-specific value sets of each utility measure as outcome variables, resulting in 3 × 6 × 2 = 36 independent t-tests. The responsiveness analysis comprised t-tests within and between sub-groups, with six country-specific value sets of each utility measure as outcome variables, for each available sample. For example, for responsiveness in treatment arms with four samples, 3 × 6 × 2 × 4 = 144 t-tests were calculated. T-tests were considered for further analysis only when at least one of the utility measures detected a significant difference or change.
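The Bonferroni-Holm correction named above can be sketched in a few lines. This is a generic illustration of the standard step-down procedure, not the code used in the study (which was run on an R-based platform); the function name `holm_adjust` and the example p-values are hypothetical.

```python
from typing import List


def holm_adjust(p_values: List[float]) -> List[float]:
    """Return Holm step-down adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on.
    A running maximum keeps the adjusted values monotone, and results are
    capped at 1.0.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three tests (illustration only).
raw_p = [0.01, 0.04, 0.03]
adjusted_p = holm_adjust(raw_p)

# A test counts as significant at the 5% level if its adjusted p <= 0.05.
significant = [p <= 0.05 for p in adjusted_p]
```

A test is then declared significant when its adjusted p-value does not exceed the 5% α-level, which is equivalent to comparing the ordered raw p-values against α/m, α/(m − 1), and so on.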
We used standard effect sizes to estimate the magnitude of difference or change for each specific test approach (cross-sectional, longitudinal within and between groups). Cohen’s d, the difference between groups divided by the pooled SD of both subgroups, served as the effect size (ES) measure for between-group comparisons, with values ≥ 0.5 considered moderate and values ≥ 0.8 considered large52. As the effect size for change in the pooled data set, the Standardized Response Mean (SRM) was calculated by dividing the mean change by the SD of the change53. The Responsiveness Index (RI) was calculated to quantify change within clinically known sub-groups54, as the ratio of the mean change in the intervention/event group (treatment arm, > 3 AE, progressive patients) to the SD of the comparator group (control arm, ≤ 3 AE, stable patients).
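The three effect size measures described above can be written out as a short sketch. It uses sample standard deviations (n − 1 denominator) and, for the RI, assumes that the SD in the denominator is the SD of the change scores in the comparator group; the text does not specify this detail. All function names and input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Mean difference (a minus b) divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2
                  + (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def standardized_response_mean(baseline: Sequence[float],
                               follow_up: Sequence[float]) -> float:
    """Mean of paired change scores divided by the SD of those changes."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


def responsiveness_index(event_baseline: Sequence[float],
                         event_follow_up: Sequence[float],
                         comp_baseline: Sequence[float],
                         comp_follow_up: Sequence[float]) -> float:
    """Mean change in the event group over the SD of comparator changes.

    Assumption: the comparator SD is taken over change scores.
    """
    event_change = [f - b for b, f in zip(event_baseline, event_follow_up)]
    comp_change = [f - b for b, f in zip(comp_baseline, comp_follow_up)]
    return mean(event_change) / stdev(comp_change)


# Hypothetical utilities for two groups at baseline (illustration only).
d = cohens_d([0.55, 0.60, 0.70, 0.65], [0.75, 0.80, 0.85, 0.70])
```

With the restricted group passed first, a negative value indicates lower utilities in that group, which matches the sign convention of the effect sizes reported in the Results.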
For the comparison of utility measures (Relative Sensitivity and Relative Responsiveness), two sets of parameters were used. The first is based on the t-values of the t-tests, i.e. the detected difference relative to sample variation. T-values are directly related to statistical power: the measure that yields higher t-values detects effects more readily at the same sample size than the comparator measure. Relative efficiency (RE) was used as the parameter for this comparison of QLU-C10D and EQ-5D-3L utilities55. RE is the quotient of positive t-values, with the QLU-C10D in the numerator and the EQ-5D-3L in the denominator. RE values larger than 1 indicate higher efficiency of the QLU-C10D, while values smaller than 1 indicate higher efficiency of the EQ-5D-3L.
The second set of parameters compares both utility measures based on ES. We assume that the utility measure which detects larger ES shows higher sensitivity or responsiveness to differences or change in the specific groups than the comparator utility measure. We calculated the differences between positive ES, i.e. QLU-C10D minus EQ-5D-3L. For example, the difference in RI (DRI) is calculated as the RI of the QLU-C10D minus the RI of the EQ-5D-3L. Differences in ES > 0 favor the QLU-C10D and differences in ES < 0 favor the EQ-5D-3L.
When comparing utility measures with regard to ES, it should be noted that ES are influenced both by the observed differences and by the type of standard deviation used, such as the pooled SD or the SD of a control group53. RE is not affected in this way, but it is an abstract value that cannot be interpreted in terms of magnitude. Differences in ES were therefore used with caution when comparing utility measures with regard to the magnitude of detected differences.
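Both comparison parameters reduce to simple arithmetic on the test results of a value set pair. The sketch below assumes that "positive" t-values and ES mean absolute values, so that the sign convention of the underlying test does not matter; the function names and the example numbers are hypothetical.

```python
def relative_efficiency(t_qlu_c10d: float, t_eq_5d_3l: float) -> float:
    """Quotient of absolute t-values, QLU-C10D over EQ-5D-3L.

    Values above 1 favor the QLU-C10D, values below 1 the EQ-5D-3L.
    """
    return abs(t_qlu_c10d) / abs(t_eq_5d_3l)


def effect_size_difference(es_qlu_c10d: float, es_eq_5d_3l: float) -> float:
    """Absolute ES of the QLU-C10D minus absolute ES of the EQ-5D-3L.

    Values above 0 favor the QLU-C10D, values below 0 the EQ-5D-3L.
    """
    return abs(es_qlu_c10d) - abs(es_eq_5d_3l)


# Hypothetical results for one country value set pair (illustration only).
re = relative_efficiency(t_qlu_c10d=-4.4, t_eq_5d_3l=-4.0)
d_es = effect_size_difference(es_qlu_c10d=-0.70, es_eq_5d_3l=-0.50)

favors_qlu_c10d = re > 1 and d_es > 0
```

Keeping the two parameters separate reflects their different roles: RE compares the tests' signal relative to sample variation, while the ES difference expresses how much larger the detected effect is.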
Results
A total of 496 patients of LL1, 290 of LL3, 194 of LL5, and 719 of LL8 were eligible for analyses (Fig. 1). The proportion of female patients ranged from 16 to 63%, the median patient age from 57 to 64 years, and proportions of patients with ECOG PS = 0 between 24 and 39%. Baseline utility scores of both measures were lower than respective population norms in all trial samples56,57,58. Selection bias analysis, i. e. comparing characteristics of the original study populations with the included data for this analysis, showed no relevant misalignment (Fig. 2). This result indicates that potential selection bias is negligible.
QLU-C10D sensitivity for ECOG PS
ECOG PS was available for LL1, LL3, and LL8. Significant differences between ECOG groups (independent t-tests, k = 36) were found by both instruments with all six country value sets in the expected direction (higher utilities for ECOG PS = 0 than PS > 0). The ES of differences between groups, Cohen’s d, ranged from -0.770 (POL) to -0.609 (NLD) across QLU-C10D country value sets (see Fig. 3, details in Appendix A).
Sensitivity Analysis Summary of QLU-C10D for ECOG Performance status 0 (= fully active) vs. > 0 (restricted abilities) at Baseline across Three Lux-Lung Trials (LL1, LL3, LL8), and Relative Sensitivity compared to EQ-5D-3L, Using Utility Scores from Six Country-Specific Value Sets (Australia, Canada, Italy, the Netherlands, Poland, and the United Kingdom). Detailed results are presented in Appendix A; LL# Lux-Lung study number, k number of t-tests, 12 t-tests for each LL study (6 country value sets per utility measure); * p-values are Bonferroni-Holm corrected; ** Difference Cohen’s d = QLU-C10D minus EQ-5D-3L; Values > 0 favor QLU-C10D, < 0 favor EQ-5D-3L; *** Relative Efficiency: Values > 1 favor QLU-C10D, < 1 favor EQ-5D-3L.
QLU-C10D responsiveness to change
The difference in mean change between treatment arms was not significant, as expected. Utility changes within treatment arms (paired t-tests, k = 36) were statistically significant and in the expected directions in LL1, LL5 and LL8 with QLU-C10D value sets (see overview in Fig. 4, and Appendix C). The ES of within-group change, RI, ranged from -0.657 (POL) to -0.056 (CAN).
Responsiveness Analysis Summary** of QLU-C10D across Four Lux-Lung Trial Datasets (LL1, LL3, LL5, LL8) in Detecting Change in Progressive Subjects, Treatment Arms, Adverse Events Groups, and Response Groups (stable vs. progressive), and Relative Responsiveness Compared to EQ-5D-3L Using Six Country-Specific Value Sets (Australia, Canada, Italy, the Netherlands, Poland, and the United Kingdom). t-test (all country value sets = 6 per cell) with Bonferroni-Holm corrected p-values: n.s. = not significant (p > .05), sig. = significant (p ≤ .05); “Diff.” = ES-QLU-C10D minus ES-EQ-5D-3L (differences in Effect Sizes (ES) of QLU-C10D and EQ-5D-3L country value set pairs); positive differences favor QLU-C10D, negative differences favor EQ-5D-3L; DRI difference in responsiveness index (RI), SRM standardized response mean, RE (Relative Efficiency) = t-value-QLU-C10D divided by t-value-EQ-5D-3L; RE < 1 favors EQ-5D-3L, RE > 1 favors QLU-C10D; † Progressive Patients (LL1 + LL8): change from baseline to date of progression ± 4 weeks; All Treat. = pooled data of all treatments (Afatinib monotherapy, Afatinib combinations, etc.); Afatinib = pooled data of Afatinib monotherapy; * [Treatment / Control]: LL1 [Afatinib monotherapy / placebo], LL3 [Afatinib / chemotherapy (Cisplatin + Pemetrexed)], LL5 [Afatinib + Paclitaxel / single-agent chemotherapy], LL8 [Afatinib / Erlotinib]. ** Detailed results are presented in Appendices B to E.
A significant difference in mean change between AE groups (independent t-tests, k = 6) was found in LL8. The ES for between-group differences, Cohen’s d, ranged from -0.778 (POL) to -0.714 (ITA) with QLU-C10D value sets. Changes within AE sub-groups were significant in LL5 and LL8 (paired t-tests, k = 24). RI ranged from -0.355 (UK) to -0.267 (NLD) with QLU-C10D value sets (see overview in Fig. 4, and Appendix D).
Difference in mean change between stable and progressing groups yielded no significant results. The QLU-C10D showed significant results for change within response groups of LL1 (k = 12; see Fig. 4, and Appendix E). Within group ES, RI, ranged from -0.515 (POL) to -0.405 (CAN).
Within-subject changes from baseline to progression date of progressive patients and the Afatinib-treatment subgroup yielded significant results with all QLU-C10D value sets (k = 12; see overview in Fig. 4, and Appendix B). Effect size, SRM, ranged between -0.608 (POL) and -0.507 (NLD).
Relative sensitivity to detect known groups
In terms of sensitivity to ECOG PS, RE was > 1 and ES were larger for the QLU-C10D value sets in 18 of 18 country-wise comparisons with the EQ-5D-3L, which indicates higher efficiency of the QLU-C10D (see Figs. 3 & 5). The difference in ES, Cohen’s d, ranged from 0.03 to 0.43, with the smallest differences for the CAN and NLD value set pairs, and the largest differences for the POL value set pairs.
Relative Sensitivity of QLU-C10D and EQ-5D-3L for Clinically Known Groups (ECOG) across Three Lux-Lung Trials (LL1, LL3, LL8), and Relative Responsiveness of QLU-C10D and EQ-5D-3L to Changes from Baseline to Progression Date in Pooled Data of Lux-Lung 1 and Lux-Lung 8, Using Six Country-Specific Value Sets (Australia, Canada, Italy, the Netherlands, Poland, and the United Kingdom). (A) Horizontal graphs represent range of RE values with six country value set pairs (QLU-C10D & EQ-5D-3L). Icons at each endpoint of the graphs signify country value set pairs with lowest and highest RE value. RE > 1 favors QLU-C10D, RE < 1 favors EQ-5D-3L. Values exceeding the x-axis scale are marked with grey icon “>”; (B) Horizontal graphs represent range of differences between effect sizes (LL1, LL3, LL8 = Cohen’s d; pooled data = SRM) calculated with each of the six country value set pairs of the two measures. Icons at the endpoints of the graphs signify country value set pairs with the largest and smallest distances from 0 (= no difference in effect sizes). Differences > 0 favor QLU-C10D, < 0 favor EQ-5D-3L. Abbreviations: ECOG Eastern Cooperative Oncology Group performance status scale; EOT end of treatment; LL# Lux-Lung trial number; Pooled All progressive patients of LL1 & LL8; Pooled Afa. progressive patients of LL1 & LL8 treated with Afatinib; RE relative efficiency; SRM standardized response mean; AUS Australia, CAN Canada, ITA Italy, NLD the Netherlands, POL Poland, UK the United Kingdom.
Relative responsiveness to change
Treatment arms
Relative responsiveness analysis of treatment arms relied on 72 paired t-tests in LL1, LL5 and LL8 with QLU-C10D and EQ-5D-3L value sets. RE favored the QLU-C10D value sets in 33 of 36 country-wise comparisons with the EQ-5D-3L (Fig. 6a). The CAN and NLD value set pairs in the LL5 treatment group and the NLD value set pair in the LL8 control group favored the EQ-5D-3L. Difference in RI (DRI) favored the QLU-C10D value sets in 18 of 18 comparisons (Fig. 6b). DRI ranged from 0.02 to 0.28, with the smallest differences for the ITA and NLD value set pairs, and the largest for the POL value set pairs.
Relative Responsiveness of QLU-C10D and EQ-5D-3L to Changes in Treatment Arms, Adverse Event (AE) Groups, and Response Groups from Baseline to End Of Treatment across Four Lux-Lung Trials (LL1, LL3, LL5, LL8), Using Six Country-Specific Value Sets (Australia, Canada, Italy, the Netherlands, Poland, and the United Kingdom). “b” = between groups; “w” = within groups; (A & C) Horizontal graphs represent range of RE values with six country value set pairs (QLU-C10D & EQ-5D-3L). Icons at each endpoint of the graphs signify country value set pairs with lowest and highest RE value. RE > 1 favors QLU-C10D, RE < 1 favors EQ-5D-3L. (B & D) Horizontal graphs represent range of differences between ES (between groups = Cohen’s d; within groups = DRI) calculated with each of the six country value set pairs of the two measures. Icons at the endpoints of the graphs signify country value sets with the smallest and largest distances from 0 (= no difference in ES). ES-Differences > 0 favor QLU-C10D, ES-Differences < 0 favor EQ-5D-3L. Abbreviations: AE adverse events; DRI difference responsiveness index, ES effect size; LL# lux-lung trial number, RE relative efficiency, AUS Australia, CAN Canada, ITA Italy, NLD the Netherlands, POL Poland, UK the United Kingdom.
Adverse events
Responsiveness analysis of AE groups relied on 12 independent (LL8) and 36 paired (LL5, LL8) t-tests of QLU-C10D and EQ-5D-3L. RE favored the QLU-C10D value sets in 23 of 24 country-wise comparisons with EQ-5D-3L. RE of LL8 between AE groups (k = 6) favored the QLU-C10D value sets (Fig. 6c). RE of within AE groups of LL5 and LL8 (k = 18) favored the QLU-C10D value-sets, with one exception (NLD). DRI favored QLU-C10D value sets in 9 of 12 comparisons with EQ-5D-3L and ranged between -0.09 and 0.14. Difference in Cohen’s d favored QLU-C10D value sets in 6 of 6 cases and ranged between 0.02 and 0.19. CAN and ITA value set pairs showed smallest, and POL value set pair largest differences (Fig. 6d).
Response groups
Relative responsiveness analysis relied on 24 paired t-tests of QLU-C10D and EQ-5D-3L. RE favored the QLU-C10D value sets in 12 of 12 country-wise comparisons with the EQ-5D-3L. DRI favored the QLU-C10D value sets in 4 of 6 country-wise comparisons with the EQ-5D-3L and ranged between -0.04 and 0.12; the exceptions were the CAN and ITA value set pairs. The CAN value set pair showed the smallest and the POL value set pair the largest differences (Fig. 6c, d).
Progressive patients
Analysis relied on 24 paired t-tests of QLU-C10D and EQ-5D-3L. RE (k = 12) and difference in SRM (k = 12) favored the QLU-C10D value sets in all country-wise comparisons with the EQ-5D-3L (Fig. 5a, b). Differences in SRM ranged between 0.18 and 0.26. The NLD value set pair showed the smallest and the POL value set pair the largest differences.
In summary, RE favored the QLU-C10D in 6 of 6 between-group comparisons, and in 74 of 78 within-group comparisons with corresponding EQ-5D-3L value sets. The ITA and CAN value set pairs of both utility measures frequently showed small differences in ES, and the POL value set pair frequently showed large differences. In total, 96% of sensitivity and responsiveness indices favored the QLU-C10D value sets compared with corresponding EQ-5D-3L value sets.
Discussion
The primary aim of this study was to assess sensitivity and responsiveness of the cancer-specific QLU-C10D utility measure for lung cancer patients with six country-specific value sets (Australia, Canada, Italy, the Netherlands, Poland and the United Kingdom). The EQ-5D-3L with its respective country-specific value sets served as a comparative measure. Basic psychometric properties of QLU-C10D, such as convergent and divergent validity and floor and ceiling effects, have been presented elsewhere30.
Overall, the QLU-C10D showed good sensitivity and responsiveness. The six QLU-C10D value sets showed consistency in size and direction of ES for NSCLC patients, i.e. although utility scores differed across value sets, they produced similar results with regard to the detection of group differences and changes over time. However, even small differences in ES can impact study results. Hence the choice of value set requires careful consideration especially in cases when there is no national value set for the country of interest available and a surrogate has to be selected.
Relative sensitivity analysis, i.e. comparing the QLU-C10D with the EQ-5D-3L value set pairs in detecting ECOG PS groups, showed results similar to those reported previously for other cancer entities27,29. These results support the clinical validity of the QLU-C10D and its comparatively higher sensitivity for this clinically known group. These consistent results bear special importance in the field of NSCLC, as the QLQ-C30, on which the calculation of the QLU-C10D scores is based, was originally developed for a lung cancer population26.
In the relative responsiveness analysis, which compared the ability of the QLU-C10D and the EQ-5D-3L to detect change within treatment groups, significant results were found in the expected directions with the QLU-C10D in three trials, but not in LL3. While the QLU-C10D showed marginal deviations in the wrong direction with three value sets (AUS, NLD, POL), the EQ-5D-3L showed larger score differences in the wrong direction with all six value sets.
Both measures detected expected differences in change between Adverse Events groups of LL8. This analysis showed favorable results for the QLU-C10D. T-tests for differences in change between Adverse Events groups at LL3 and LL5 were not significant, probably because of the small size of “AE > 3”-groups (n = 17, n = 16) at these trials.
Comparing responsiveness to within-subject change in progressive patients of the EQ-5D-3L and the QLU-C10D, the latter showed larger mean score changes. These observations are somewhat consistent with the findings of Shaw et al.28, who reported smaller mean score changes detected by the EQ-5D-3L compared with the QLU-C10D for worsened outcomes. Moreover, Bulamu et al.27 reported a high degree of congruence for both measures at mild health states. In later assessments, when health states deteriorated, they also observed stronger deviation between both measures.
Comparing the QLU-C10D and EQ-5D-3L in terms of detected effect sizes, differences ranged from negligible to one category of magnitude or more. For example, effect sizes for the utility difference between fully active patients and those with restricted abilities (ECOG PS) ranged from small to medium when calculated with EQ-5D-3L value sets, but from medium to large with QLU-C10D value sets. The same pattern was observed in the comparative responsiveness analysis of progressing patients. We assume that differences in effect sizes result not only from differences in utility scores, but also to some extent from the standard deviations of both measures, which are frequently smaller with QLU-C10D value sets than with EQ-5D-3L value sets. In summary, the analysis of effect sizes indicates that the choice of country value set affects the differences between QLU-C10D and EQ-5D-3L to varying degrees.
The main limitation of our study is its retrospective nature. Although this analysis relies on robust data collected from well-designed and controlled international clinical trials, some potentially useful information was not collected within these studies. For example, the comparison could only be made with the 3-level version of the EQ-5D. Comparisons with the newer, and likely more sensitive, 5-level EQ-5D59 could not be conducted. Therefore, any advantage of the QLU-C10D could be smaller if the comparison is made with the 5L version, e.g. as reported by Jansen et al.60. Also, other generic utility measures, such as the SF-6D61, could not be considered here as comparator measures.
A subsequent question that should be explored in future work is the impact on cost-utility analysis from using either instrument when evaluating the cost-effectiveness of interventions. This would likely depend on a number of factors, including the responsiveness of the respective instruments, and the scale of the value sets.
In summary, the sensitivity and responsiveness analyses substantiate the validity of the six country-specific QLU-C10D value sets as a utility measure in NSCLC populations. Comparisons with the generic EQ-5D-3L showed favorable results for the QLU-C10D in the large majority of cases (96%). Results of both measures indicate that, although often similar, differences in measurement properties exist, and that the two measures cannot be considered interchangeable.
Conclusions
In conclusion, our results show that the QLU-C10D is a clinically valid utility measure in lung cancer and that it bears advantages with regard to measurement precision compared to the EQ-5D-3L. This adds to previously published evidence on the QLU-C10D in other cancer populations. Better measurement properties translate into smaller sample sizes needed to detect effects. Hence, the QLU-C10D utilities derived from QLQ-C30 scores can be a valuable source of information in health economic evaluations of lung cancer patients in addition to the clinical information provided by the QLQ-C30.
Data availability
The data that support the findings of this study are available from Boehringer Ingelheim but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Boehringer Ingelheim.
References
International Agency for Research on Cancer. Cancer today: Estimated number of new cases in 2020, World, both sexes, all ages (excl. NMSC). https://gco.iarc.fr/today/online-analysis-table?v=2020&mode=cancer&mode_population=continents&population=900&populations=900&key=asr&sex=0&cancer=39&type=0&statistic=5&prevalence=0&population_group=0&ages_group%5B%5D=0&ages_group%5B%5D=17&group_cancer=1&include_nmsc=0&include_nmsc_other=1. Accessed January 20, 2023.
Schwartz, A. G. & Cote, M. L. Epidemiology of lung cancer. Adv. Exp. Med. Biol. 893, 21–41 (2016).
Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2018. CA Cancer J. Clin. 68(1), 7–30 (2018).
Shah, K. K. et al. Adapting preference-based utility measures to capture the impact of cancer treatment-related symptoms. Europ. J. Health Econ. 22(8), 1301–1309 (2021).
Pratt-Chapman, M. & Bhadelia, A. Patient-reported outcomes in health economic decision-making: a changing landscape in oncology. Recent Results Cancer Res. 213, 67–83 (2019).
Trippoli, S., Vaiani, M., Lucioni, C. & Messori, A. Quality of life and utility in patients with non-small cell lung cancer. Quality-of-life Study Group of the Master 2 Project in Pharmacoeconomics. Pharmacoeconomics https://doi.org/10.2165/00019053-200119080-00007 (2001).
Griebsch, I., Palmer, M., Fayers, P. M. & Ellis, S. Is progression-free survival associated with a better health-related quality of life in patients with lung cancer? Evidence from two randomised trials with afatinib. BMJ Open https://doi.org/10.1136/bmjopen-2014-005762 (2014).
Roudijk, B., Donders, A. R. T. & Stalmeier, P. F. M. Setting dead at zero: applying scale properties to the QALY model. Med. Decis. Making. 38(6), 627–634 (2018).
Finch, A. P. et al. EORTC Quality of Life Group. Estimation of an EORTC QLU-C10 Value Set for Spain Using a Discrete Choice Experiment. Pharmacoeconomics. 39, 1085–1098 (2021).
Kemmler, G. et al. German value sets for the EORTC QLU-C10D, a cancer-specific utility instrument based on the EORTC QLQ-C30. Qual. Life Res. 28(12), 3197–3211 (2019).
Nerich, V. et al. French value-set of the QLU-C10D, a cancer-specific utility measure derived from the QLQ-C30. Appl. Health Econ. Health Policy. 19(2), 191–202 (2021).
McDonough, C. M. & Tosteson, A. N. A. Measuring preferences for cost-utility analysis: how choice of method may influence decision-making. Pharmacoeconomics 25(2), 93–106 (2007).
Shah, K. K., Mulhern, B., Longworth, L. & Janssen, M. F. Views of the UK general public on important aspects of health not captured by EQ-5D. Patient 10(6), 701–709 (2017).
Longworth, L. et al. Use of generic and condition-specific measures of health-related quality of life in NICE decision-making: a systematic review, statistical modelling and survey. Health Technol. Assess. 18(9), 1–224 (2014).
Devlin, N. J. & Lorgelly, P. K. QALYs as a measure of value in cancer. J. Cancer Policy. 11, 19–25 (2017).
EuroQol Group. EuroQol–a new facility for the measurement of health-related quality of life. Health Policy. 16(2), 199–208 (1990).
Rabin, R. & de Charro, F. EQ-5D: a measure of health status from the EuroQol group. Ann. Medicine. 33(5), 337–343 (2001).
Sonntag, M., König, H. H. & Konnopka, A. The estimation of utility weights in cost-utility analysis for mental disorders: a systematic review. Pharmacoeconomics 31(12), 1131–1154 (2013).
Rowen, D., Brazier, J., Ara, R. & Azzabi, Z. I. The role of condition-specific preference-based measures in health technology assessment. Pharmacoeconomics 35(Suppl. 1), 33–41 (2017).
Drummond, M. & Sculpher, M. Common methodological flaws in economic evaluations. Med. Care. 43(Suppl. 7), 5–14 (2005).
Mokkink, L. B. et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J. Clin. Epidemiol. 63(7), 737–745 (2010).
Koller, M. et al. European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group, EORTC Lung Cancer Group, European Society of Thoracic Surgeons. Psychometric properties of the updated EORTC module for assessing quality of life in patients with lung cancer (QLQ-LC29): an international, observational field study. Lancet Oncol. 21, 723–732 (2020).
King, M. T. et al. QLU-C10D: a health state classification system for a multi-attribute utility measure based on the EORTC QLQ-C30. Qual. Life Res. 25(3), 625–636 (2016).
Giesinger, J. M. et al. Past and current practice of patient-reported outcome measurement in randomized cancer clinical trials: a systematic review. Value Health. 24(4), 585–591 (2021).
Aaronson, N. K. et al. The European Organization for Research and Treatment of Cancer QLQ-C30: a quality-of-life instrument for use in international clinical trials in oncology. J. Natl. Cancer Inst. 85(5), 365–376 (1993).
Gamper, E. M. et al. The EORTC QLU-C10D was more efficient in detecting clinical known group differences in myelodysplastic syndromes than the EQ-5D-3L. J. Clin. Epidemiol. 137, 31–44 (2021).
Bulamu, N. B. et al. Australian Immunonutrition Study Group. Responsiveness and convergent validity of QLU-C10D and EQ-5D-3L in assessing short-term quality of life following esophagectomy. Health Qual. Life Outcomes https://doi.org/10.1186/s12955-021-01867-w (2021).
Shaw, J. W. et al. A comparison of generic and condition-specific preference-based measures using data from nivolumab trials: EQ-5D-3L, mapping to the EQ-5D-5L, and European Organisation for Research and Treatment of Cancer Quality of Life Utility Measure-Core 10 Dimensions. Value Health. 24(11), 1651–1659 (2021).
Pilz, M. J., Seyringer, S., Hallsson, L. R., Bottomley, A., Jansen, F., King, M. T., Norman, R., Rutten, M. J., Verdonck-de Leeuw, I. M., Siersema, P. D., Gamper, E. M. (2023). A psychometric evaluation of the Dutch version of the cancer-specific preference-based measure EORTC QLU-C10D in comparison to the EQ-5D-3L. On behalf of the EORTC Quality of Life Group
Pilz, M. J., Seyringer, S., Al-Naesan, I., King, M. T., Bottomley, A., Norman, R., Schlosser, L., Hell, T., Gamper, E. M. (2023). Cancer-specific health utilities: Evaluation of core measurement properties of the EORTC QLU-C10D in lung cancer patients – data from four multicentre LUX-Lung trials, applying six country tariffs. On behalf of the EORTC Quality of Life Group
King, M. T. et al. Australian utility weights for the EORTC QLU-C10D, a multi-attribute utility instrument derived from the cancer-specific quality of life questionnaire, EORTC QLQ-C30. Pharmacoeconomics. 36(2), 225–238 (2018).
McTaggart-Cowan, H. et al. The EORTC QLU-C10D: the Canadian valuation study and algorithm to derive cancer-specific utilities from the EORTC QLQ-C30. MDM Policy Pract. https://doi.org/10.1177/2381468319842532 (2019).
Gamper, E. M. et al. EORTC QLU-C10D value sets for Austria, Italy, and Poland. Qual. Life Res. 29(9), 2485–2495 (2020).
Jansen, F. et al. Dutch utility weights for the EORTC cancer-specific utility instrument: the Dutch EORTC QLU-C10D. Qual. Life Res. 30(7), 2009–2019 (2021).
Norman, R. et al. European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group and the MAUCa Consortium. U.K. utility weights for the EORTC QLU-C10D. Health Econ. 28, 1385–1401 (2019).
Miller, V. A. et al. Afatinib versus placebo for patients with advanced, metastatic non-small-cell lung cancer after failure of erlotinib, gefitinib, or both, and one or two lines of chemotherapy (LUX-Lung 1): a phase 2b/3 randomised trial. Lancet Oncol. 13(5), 528–538 (2012).
Hirsh, V. et al. Symptom and quality of life benefit of afatinib in advanced non–small-cell lung cancer patients previously treated with erlotinib or gefitinib: results of a randomized phase IIb/III trial (LUX-Lung 1). J. Thoracic Oncol. 8(2), 229–237 (2013).
Sequist, L. V. et al. Phase III study of afatinib or cisplatin plus pemetrexed in patients with metastatic lung adenocarcinoma with EGFR mutations. J. Clin. Oncol. 31(27), 3327–3334 (2013).
Yang, J. C. et al. Afatinib versus cisplatin-based chemotherapy for EGFR mutation-positive lung adenocarcinoma (LUX-Lung 3 and LUX-Lung 6): analysis of overall survival data from two randomised, phase 3 trials. Lancet Oncol. 16(2), 141–151 (2015).
Schuler, M. et al. LUX-Lung 5 Investigators. Afatinib beyond progression in patients with non-small-cell lung cancer following chemotherapy, erlotinib/gefitinib and afatinib: phase III randomized LUX-Lung 5 trial. Ann. Oncol. 27, 417–423 (2016).
Soria, J. C. et al. LUX-Lung 8 Investigators. Afatinib versus erlotinib as second-line treatment of patients with advanced squamous cell carcinoma of the lung (LUX-Lung 8): an open-label randomised controlled phase 3 trial. Lancet Oncol. 16, 897–907 (2015).
Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, ed. Cochrane Handbook for Systematic Reviews of Interventions version 6.3 (updated February 2022); 2022. www.training.cochrane.org/handbook.
Viney, R. et al. Time trade-off derived EQ-5D weights for Australia. Value Health. 14(6), 928–936 (2011).
Bansback, N., Tsuchiya, A., Brazier, J. & Anis, A. Canadian valuation of EQ-5D health states: preliminary value set and considerations for future valuation studies. PLoS One. 7(2), e31115 (2012).
Scalone, L. et al. Italian population-based values of EQ-5D health states. Value Health. 16(5), 814–822 (2013).
Lamers, L. M., McDonnell, J., Stalmeier, P. F., Krabbe, P. F. & Busschbach, J. J. The Dutch tariff: results and arguments for an effective design for national EQ-5D valuation studies. Health Econ. 15(10), 1121–1131 (2006).
Golicki, D., Jakubczyk, M., Niewada, M., Wrona, W. & Busschbach, J. J. Valuation of EQ-5D health states in Poland: first TTO-based social value set in Central and Eastern Europe. Value Health. 13(2), 289–297 (2010).
Dolan, P. Modeling valuations for EuroQol health states. Med. Care. 35(11), 1095–1108 (1997).
Revicki, D., Hays, R. D., Cella, D. & Sloan, J. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes. J. Clin. Epidemiol. 61(2), 102–109 (2008).
Holm, S. A simple sequentially rejective multiple test procedure. Scandinavian J. Stat. 6(2), 65–70 (1979).
Hemmerich, W. (2016). StatistikGuru: Rechner zur Adjustierung des α-Niveaus. Retrieved from https://statistikguru.de/rechner/adjustierung-des-alphaniveaus.html
Cohen, J. Statistical Power Analysis for the Behavioral Sciences 2nd edn. (Routledge, New York, 1988). https://doi.org/10.4324/9780203771587.
Norman, G. R., Wyrwich, K. W. & Patrick, D. L. The mathematical relationship among different forms of responsiveness coefficients. Qual. Life Res. 16(5), 815–822 (2007).
Tuley, M. R., Mulrow, C. D. & McMahan, C. A. Estimating and testing an index of responsiveness and the relationship of the index to power. J. Clin. Epidemiol. 44(4–5), 417–421. https://doi.org/10.1016/0895-4356(91)90080-s (1991).
Liang, M. H., Larson, M. G., Cullen, K. E. & Schwartz, J. A. Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum. 28(5), 542–547. https://doi.org/10.1002/art.1780280513 (1985).
Pilz, M. J., Nolte, S., Liegl, G., King, M., Norman, R., McTaggart-Cowan, H., … & EORTC Quality of Life Group. The European Organisation for Research and treatment of cancer quality of life utility-core 10 dimensions: development and investigation of general population utility norms for Canada, France, Germany, Italy, Poland, and the United Kingdom. Value Health 26(5), 760–767 (2023).
Clemens, S., Begum, N., Harper, C., Whitty, J. A. & Scuffham, P. A. A comparison of EQ-5D-3L population norms in Queensland, Australia, estimated using utility value sets from Australia, the UK and USA. Qual. Life Res. 23, 2375–2381 (2014).
Janssen, M. F. et al. Population norms for the EQ-5D-3L: a cross-country analysis of population surveys for 20 countries. Europ. J. Health Econ. 20, 205–216 (2019).
Herdman, M. et al. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual. Life Res. 20(10), 1727–1736 (2011).
Jansen, F. et al. Cost-utility and cost-effectiveness of a guided self-help head and neck exercise program for patients treated with total laryngectomy: Results of a multi-center randomized controlled trial. Oral Oncol. 117, 1 (2021).
Brazier, J., Usherwood, T., Harper, R. & Thomas, K. Deriving a preference-based single index from the UK SF-36 Health Survey. J. Clin. Epidemiol. 51(11), 1115–1128 (1998).
Acknowledgements
This research was submitted on behalf of the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group. We thank the Executive Committee, and Emma Lidington, Amy Thomas and Corneel Coens for their support and assistance. Also, Cheryl Whittaker has greatly helped with the language of the manuscript.
Funding
European Organisation for Research and Treatment of Cancer, #12/2016.
Author information
Contributions
Concept and design: E.G.; Acquisition of data: M.P.; Analysis and interpretation of data: S.S., E.G.; Manuscript preparation: S.S., M.P., I.A-L., E.G.; Critical revision of paper: M.K., A.B., R.N., L.S., T.H., E.G.; Statistical analysis: L.S., T.H.; Provision of study materials: M.P.; Obtaining funding: E.G.; Administrative, technical support: M.P., I.A-L.; Supervision: E.G., M.K., A.B.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval
Ethical approval for this study was provided by the ethics board of the Medical University of Innsbruck as an amendment to the study AN 2015–0016.
Consent to participate
All patients provided written informed consent when participating in the original trials this retrospective analysis relies on. The original study protocols were developed in accordance with the Declaration of Helsinki, and ethical approval was sought at participating centres.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Seyringer, S., Pilz, M.J., Al-Naesan, I. et al. Validation of the cancer-specific utility measure EORTC QLU-C10D using evidence from four lung cancer trials covering six country value sets. Sci Rep 15, 14907 (2025). https://doi.org/10.1038/s41598-024-83861-y
DOI: https://doi.org/10.1038/s41598-024-83861-y