Introduction

The development and adoption of standardised health and wellbeing metrics in the past few decades have revolutionised how economies allocate resources, with profound positive effects on public health. Emerging metrics such as the WELLBY (Frijters et al. 2020) further exemplify the goal of capturing wellbeing more accurately, and the WELLBY’s rapid adoption in the UK and New Zealand (Frijters et al. 2024) is evidence of policymakers’ demand for actionable measures that both capture the relevant dimensions of human flourishing and are simple to implement.

The simplicity that makes such metrics attractive to policymakers makes them simultaneously and predictably the target of much criticism and scrutiny. Countless articles have been published on the many ways in which metrics such as the QALY, DALY, or WELLBY fail to adequately capture the complexities of the phenomena they try to model and the inequalities that they may perpetuate or even exacerbate (see, e.g., Schroeder (2017) for an overview of potentially unjustified value choices behind the DALY, including age-weighting, time-discounting, life expectancy valuations between genders, and co-morbidity, among others; Bishop and Herron (2015), and Kero and Lee (2016) calling for ordinal interpretations of Likert scales; and Larroulet Philippi (2024) challenging linearity assumptions).

While such critique is crucial in improving wellbeing frameworks, we acknowledge that policymakers face practical constraints that require pragmatic compromises. Recognising the inherent difficulty of capturing so much complexity in a single metric, we can instead ask which simplifications have the largest impact on wellbeing, quality of life, and sustainability, and address those strategically. This paper proposes the following approach for such prioritisation: recognise, quantify, and alleviate the most extreme forms of suffering.

Concern for the alleviation of extreme suffering is near universal across ethical traditions. Whether it is caused by illness or inflicted by others (e.g., in instances of torture), we recognise that nobody should have to undergo extreme agony. As Lohman et al. (2010) point out, several international bodies and the medical community have reached consensus on the topic of pain relief. Human Rights Watch considers failure to provide adequate pain relief a human rights violation (Lohman et al. 2010). Among others, the World Health Organization considers the use of strong opioids “absolutely necessary” to treat moderate to severe pain (WHO 2000); the UN’s Single Convention on Narcotic Drugs mandates the adequate provision of narcotics in medicine (UN 1961); and the UN Economic and Social Council has also urged nations to take steps to make opioid analgesics more accessible (UN 2005).

In this paper, however, instead of merely emphasising the well-documented treatment gap in pain relief (Bhadelia et al. 2019), we argue that existing metrics used in the wellbeing economy should prioritise accounting for severely painful conditions, based on two key observations: (i) recent work in psychophysics suggesting that the most extreme experiences (both positive and negative) may be subjectively experienced as orders of magnitude more intense than mild experiences, and (ii) current metrics (particularly the DALY and the WELLBY) not yet adequately capturing these extreme states, due to the inherent linearity of the weights used.

We offer a concrete case study to illustrate our argument. Cluster headache (CH) is widely recognised as one of the most (if not the most) painful conditions known to medicine (Nesbitt and Goadsby 2012), described by patients as “devilish”, “gruelling”, “unbearable” or “so violent that it is utterly intolerable” (Torelli and Manzoni 2003). Opioids are largely ineffective in treating the pain of cluster headache attacks (Pearson et al. 2019). And while it affects just as many people as multiple sclerosis (with a lifetime prevalence of 1/1000 adults, Fischera et al. 2008) and is enormously debilitating (Rossi et al. 2018), cluster headache remains heavily underfunded and understudied by comparison.

Recognising that wellbeing metrics are typically supplemented by broader discussions as opposed to being applied technocratically (see related discussions in the context of palliative care in Wichmann et al. 2020), this paper urges policymakers in wellbeing economies to prioritise discussions of extreme pain.

As an illustrative example, we also estimate the burden of pain from multiple sclerosis and compare it to that of CH. Additionally, we gather data from various funding databases to estimate how much research investment has gone into cluster headache versus multiple sclerosis in recent decades in the UK. Our findings clearly suggest that cluster headache is underprioritised not only relative to the severity of its burden, but also relative to similarly prevalent conditions. A brief analysis of drivers suggests the underprioritisation of cluster headache may be even larger globally than in our UK data.

Cluster headache

Cluster headache (sometimes also referred to as “suicide headache”) is a disorder characterised by attacks of severe pain behind the eye, typically lasting between 15 min and 3 h (Black et al. 2016). Each attack has a very sudden onset, starting as mild discomfort that reaches maximal intensity on average within 9 min in 86% of sufferers (Torelli and Manzoni 2003). Patients often describe the pain as similar to being repeatedly stabbed in the eye with a knife (Rossi et al. 2018; Schindler and Burish 2022). The term “cluster” refers to the fact that attacks come in bouts typically lasting 1–12 weeks, often seasonally (e.g. in the fall), followed by periods of remission lasting a few months to a few years for patients of the episodic type (Ekbom 1970). During a cluster bout, patients experience, on average, 3–4 attacks per day at predictable times of the day (Gaul et al. 2012; Burish et al. 2021), with a long tail extending to 10 attacks or more per day (Sewell et al. 2006; Gómez-Emilsson 2019). Up to 20% of CH sufferers are of the chronic subtype, with remission periods shorter than 3 months (Schindler and Burish 2022). Some chronic patients experience daily attacks with no remission.

A recent international survey of 1604 CH patients revealed new data about how sufferers rate their pain compared to other severely painful conditions (Burish et al. 2021). Notably, respondents rated CH pain as 9.7 ± 0.6 on a 0–10 scale, with 72% (n = 1157) rating it as 10.0. Respondents also rated other severely painful conditions they had experienced: Labour pain (n = 308) was rated as 7.2, kidney stones (n = 239) as 6.9, gunshot wounds (n = 25) as 6.0, migraine attacks (n = 663) as 5.4, and bone fractures (n = 868) as 5.2, among others.

Treatment options for CH are limited and only partly effective. In the survey above, the most commonly used treatments were triptans (70%, n = 1119), oxygen (67%, n = 1082), and opioids (34%, n = 541). Triptans were considered “completely effective” or “very effective” by 14% and 39% of those who use them, respectively; for oxygen, the numbers were 13% and 41%; and for opioids, only 1% and 4% (Pearson et al. 2019). Standard painkillers that can be effective against migraine and other types of headaches are ineffective against CH (Nesbitt and Goadsby 2012).

Indoleamines, and in particular tryptamines such as psilocybin or LSD, show promise to be highly effective in both aborting attacks and helping patients go into remission, often indefinitely, after only a few low doses (Schindler and Burish 2022). The tryptamine N,N-DMT has also been suggested as a potentially effective treatment due to its very fast action and requiring very low, sub-hallucinogenic doses (Andersson et al. 2017; Frerichs 2019). However, patients face major challenges accessing tryptamines due to their legal status. Only very recently (in June 2024), Canada approved the first psilocybin treatment for CH (Pope 2024).

There is still relatively little awareness of CH among the general population and even among medical specialists, even though it is widely recognised as one of the most painful conditions known to medicine and has a similar lifetime prevalence (1/1000) (Fischera et al. 2008) to multiple sclerosis (0.9/1000) (Leray et al. 2016) and Parkinson’s disease (1–3/1000) (Elbaz et al. 2016). Delays in the diagnosis and misdiagnosis of CH have been well documented even in countries with well-developed health services (Buture et al. 2019).

The consequences for CH patients are devastating: a survey of patients primarily in the US, UK, and Canada revealed that, of those who use oxygen as an abortive treatment, 13% needed 2–5 years to obtain it after their diagnosis, 15% obtained it within 1–2 years, 11% within 6–12 months, 25% within 1–6 months, and only 36% within 1 month (Pearson et al. 2019). The same survey recorded an average difference of six years between the age of onset (28.2 ± 12.2) and the age of first diagnosis (34.2 ± 11.8), a gap that combines patient delay in seeking help and clinician delay in diagnosing (Burish et al. 2021). A survey of general practitioners in the UK confirmed persistent difficulties in diagnosing and managing CH (Buture et al. 2020), leading to months or years of excruciating suffering that goes untreated.

Non-linearities in mapping pain experiences

Various lines of evidence suggest that reports of pain (and pleasure) may scale non-linearly with the actual experiences, especially at the extremes. More concretely, the difference in subjective intensity between, say, 9/10 and 10/10 pain is experienced as much larger than between 6/10 and 7/10. In other words, the human capacity for pain is much larger than might be intuited by those extrapolating from personal experiences within the range of mild and moderate pain experiences. Indeed, certain specialised, extreme pain scales are explicitly logarithmic. On the 1–10 KIP scale developed by the cluster headache community to rate the severity of attacks, a 10/10 attack is considered ten times (and not two times) more painful than a 5/10 attack (Clusterbusters.org 2020). Similarly, on the 1–4 Schmidt Sting Index, an insect sting rated as 4/4 is considered to hurt 10 times more than a 3/4 sting, which in turn hurts 10 times more than a 2/4 sting, and so on (PBS 2018). Such non-linear mapping in turn suggests that the proportion of total pain concentrated in extreme suffering is higher than would be implied by a linear interpretation of a 0–10 scale and the relative infrequency of extreme reports.

A survey by Gómez-Emilsson and Percy (2023) lends additional support to the notion that our capacity to experience pain spans a broad range of at least two orders of magnitude, and that valenced experiences follow a heavy-tailed distribution. They call this the heavy-tailed valence (HTV) hypothesis. They found that over 50% of the 97 respondents reported their most intense experiences as at least twice as intense as the second most intense, despite the experiences’ individual scores typically being fairly close together when scored on a 0–10 scale. Simulations further demonstrated that the reported data were more likely to occur if underlying experiences were distributed using a heavy-tailed (lognormal) distribution rather than a thin-tailed (normal) distribution.

The findings from Gómez-Emilsson and Percy (2023) on a 0–10 self-report scale reinforce earlier evidence from Kersten et al. (2014) on the use of pain visual analogue scales (VAS), who also found evidence of non-linearities in the pain reports of 221 patients with chronic stable joint pain. Specifically, changes in pain VAS scores at the extreme ends involving few raw score points would normally reflect considerable metric change in interval data generated from Rasch analysis. In the middle of the pain VAS scoring range, by contrast, gaining many raw score points would typically reflect little change on the metric. Salaffi et al. (2004) similarly identified that smaller reductions in chronic musculoskeletal pain intensity were sufficient to qualify as minimal clinically important differences among those patients whose baseline pain had been higher.

Additional evidence in support of the HTV hypothesis may be found in neurological studies observing power law behaviours at different brain scales (which may correlate with the subjective intensity of an experience), including during neuronal avalanches (Klaus et al. 2011), in spike counts (Teich et al. 1997), and in ion channel fluctuations (Toib et al. 1998). Similarly, observations that people’s willingness to pay to avoid pain grows non-linearly with pain intensity (Slimani et al. 2022) may be viewed as indirect evidence supporting the HTV hypothesis. These claims might also be partially reconcilable with studies arguing for a linear interpretation of pain scales (Plant 2020), noting that such studies are focused on the majority of day-to-day experiences rather than outlier events. Specifically, linearity might apply approximately in the mild and moderate range (e.g., empirical data from Myles et al. 1999) but increasingly gives way to a non-linear relationship around 9/10.

If true, the HTV hypothesis would have several important implications. It would suggest that a large fraction of human suffering is concentrated at the extremes, perhaps in a small fraction of all sufferers. It would also introduce a framework for quantifying this inequality and provide a basis for our intuitive concern for extreme suffering.

Health metrics

Widely used health metrics such as the DALY are currently insensitive to these vast differences in our capacity to experience pain and pleasure. The DALY is calculated as the sum of years of life lost due to premature death (YLLs) and years of healthy life lost due to disability (YLDs), i.e., DALY = YLL + YLD. Since headache disorders are not considered a cause of death according to the Global Burden of Disease (GBD), YLL is zero and DALY = YLD (Stovner et al. 2018). Headache YLDs are then calculated as the product of the number of people with headaches worldwide, the average time spent with headaches, and, crucially, a 0–1 weight measuring the degree of disability caused by headaches, where 0 = full health and 1 = death. For instance, the disability weight for migraine according to the GBD is 0.44, whereas CH is not included at all (Institute for Health Metrics and Evaluation (IHME) 2021). As Stovner et al. (2007) point out: “Although other headache disorders such as cluster headache undoubtedly impose a great burden on individual patients, the total societal burden of this and other severe but relatively rare headaches is probably quite small compared with that of the common headache types”.
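The calculation described above can be restated compactly. The symbols N (number of people affected), t (average time spent with the condition), and DW (disability weight) are shorthand introduced here for illustration, not GBD notation:

$$\mathrm{DALY} = \mathrm{YLL} + \mathrm{YLD}, \qquad \mathrm{YLD} = N \times t \times DW, \qquad 0 \le DW \le 1$$

For headache disorders, YLL = 0, so DALY = YLD. In this form the weight enters as a single linear factor bounded by 1, which is the property the HTV hypothesis calls into question.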

More work is needed to determine whether existing metrics could better capture extremely negative experiences. For instance, some QALYs are estimated using time trade-off (TTO) or standard gamble (SG) methods, but many challenges remain, as even sophisticated methods struggle to adequately represent the full intensity of extremely negative experiences without transforming or censoring the negative values they produce (Attema et al. 2013). Additionally, trade-off surveys would either need to be answered by patients who have actually experienced the most negative states or use well-crafted techniques to evoke the right responses (e.g., asking respondents to imagine specific torture scenarios, potentially outside of the typical anodyne survey-answering mindset), since broad intuitions about such states may systematically underestimate their severity. The difference between lived experience and the assumed severity by those without experience can be dramatic, as in comparing the CH reports from Burish et al. (2021) discussed above and the IHME (2021). Insights from the HTV hypothesis also point to the importance of triangulating TTO estimates between multiple conditions, where the respondent has lived experience of each condition being compared.

If existing metrics cannot be used to capture extreme experiences, additional metrics could be used. Leighton (2023) argues that the Wellbeing-Adjusted Life-Year (WELLBY) (Frijters et al. 2024) and the Suffering Intensity-Adjusted Life-Year (SALY) (Knaul et al. 2018) are steps in the right direction but do not suffice, suggesting instead two potential metrics to supplement discussions of welfare economics and public policy: Years Lived with Severe Suffering (YLSS), capturing suffering at the level of approximately 7/10 and above; and Days Lived with Extreme Suffering (DLES), for suffering of approximately 9/10 and above.

The central argument of this paper is that widely acknowledged health inequalities are even more severe when taking the heavy-tailed valence hypothesis into consideration. We argue that CH causes a large burden of extreme suffering, yet its impact remains greatly underappreciated and under-addressed, especially compared to similarly prevalent conditions receiving significantly more attention and funding. A fair and progressive wellbeing economy should treat such conditions with the urgency and seriousness they deserve, in ways analogous to efforts to prevent torture or ensure universal access to opioids in terminal cancer or anaesthesia in surgical procedures, all of which involve comparable levels of suffering that society has rightly judged to be morally urgent to address.

At the same time, we offer an optimistic picture: the economic investment required to significantly reduce the burden of extreme suffering from CH is modest compared to other burdensome conditions, given (i) the scalability of existing treatments, (ii) the promise of emerging, low-cost treatments, and (iii) the relatively low prevalence of CH, meaning that treatments can be concentrated among the few intense sufferers who most need them.

Methods

The burden of pain from cluster headache

We are interested in estimating how much time is spent each year globally on CH pain at different levels of intensity (on the 0–10 pain scale). This involves estimating the annual prevalence as well as the frequency, duration, and intensity of attacks, each of which follows some probability distribution.

$$\mathrm{Burden}(\mathrm{intensity}) = \mathrm{Prevalence} \times \mathrm{Frequency} \times \mathrm{Duration} \times \mathrm{Intensity} \tag{1}$$

We calculate this distribution for four different types of CH patients, depending on whether they are chronic vs episodic, and whether they have access to treatment (either preventative or abortive) or not. This distinction allows us to see how the burden is distributed among the groups and potentially prioritise them accordingly.

We aggregate statistical data from the available literature and run Monte Carlo simulations to arrive at the resulting distributions. All calculations were done using the statistical libraries in SciPy 1.15.2 and NumPy 2.2.4 with Python 3.12.2, selected for their comprehensive support of statistical distributions, ease of use, and compatibility with the web-based visualisation package Streamlit 1.45.1.
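As a rough illustration of the Monte Carlo approach described above, the following sketch draws annual hours in pain for a sample of simulated patients. It is not the model used in this paper: it uses only the Python standard library (rather than SciPy and NumPy), draws one attack duration per patient instead of one per attack, and every default parameter value is a hypothetical placeholder chosen for illustration.

```python
import math
import random


def simulate_annual_hours(n_patients, seed=0,
                          active_days_median=40.0, active_days_sigma=0.6,
                          attacks_median=2.7, attacks_sigma=0.6,
                          duration_median_h=0.9, duration_sigma=0.5):
    """Return a list of simulated annual hours in pain, one value per patient.

    All default parameters are illustrative placeholders, not fitted values.
    """
    rng = random.Random(seed)
    hours = []
    for _ in range(n_patients):
        # Days with attacks in a year (lognormal, capped at 365 days).
        active_days = min(
            rng.lognormvariate(math.log(active_days_median), active_days_sigma),
            365.0,
        )
        # Attacks per active day and hours per attack (both lognormal).
        attacks_per_day = rng.lognormvariate(math.log(attacks_median), attacks_sigma)
        duration_h = rng.lognormvariate(math.log(duration_median_h), duration_sigma)
        # A day cannot contain more than 24 hours of pain.
        hours_per_day = min(attacks_per_day * duration_h, 24.0)
        hours.append(active_days * hours_per_day)
    return hours


hours = simulate_annual_hours(3036)
# Convert the sample total from hours to person-years of pain.
sample_person_years = sum(hours) / (24 * 365)
```

Multiplying the sample total by the ratio of the full patient population to the sample size would then extrapolate the result, in the spirit of Eq. (1).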

Prevalence

Level of confidence in the data: medium

The literature presents a somewhat complex picture of CH prevalence, with variation in reported prevalence rates. Individual studies have reported 1-year prevalence rates as low as 0 (in Malaysia) and 32 (in Ethiopia) per 100,000, and as high as 150 per 100,000 (in Germany). A meta-analysis by Fischera et al. (2008) of all available epidemiological studies estimated the 1-year worldwide prevalence at 53 per 100,000 (95% confidence interval: 26–95), with the lifetime prevalence being 124 per 100,000 (95% CI: 101–151) for adults of all ages and sexes. Our simulations use 5,728,759,000 adults (ages 18+) worldwide, as of 1 Jan 2024 (UN 2024). This would mean that approximately 3.04 million adults worldwide suffer from CH in a given year (95% CI: 1.49 million–5.44 million). Minors are excluded due to insufficient prevalence data. Fischera et al. (2008) did not specify whether they controlled for the age pyramid in each country, which could also affect the prevalence calculations. They also note that CH might be less frequent in developing countries, which more recent studies from previously understudied regions seem to confirm (Kim et al. 2023). This could bring the figure of 53 per 100,000 down slightly.

Our simulations use the default figure of 53 per 100,000 and a fraction of chronic patients of 20% (80% episodic), based on the meta-analysis by Schindler and Burish (2022). We simulate 0.1% of the total global adult patient population (3036 individuals) and extrapolate the results to estimate the total burden globally and in the UK.
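The prevalence arithmetic above can be reproduced in a few lines. This is a minimal sketch using only the figures quoted in the text; the function and variable names are illustrative.

```python
ADULTS_WORLDWIDE = 5_728_759_000  # adults (18+) as of 1 Jan 2024 (UN 2024)


def patients_per_year(rate_per_100k, population=ADULTS_WORLDWIDE):
    """Number of adults with CH in a given year for a 1-year prevalence rate."""
    return population * rate_per_100k / 100_000


central = patients_per_year(53)  # central estimate (Fischera et al. 2008)
low = patients_per_year(26)      # lower bound of the 95% CI
high = patients_per_year(95)     # upper bound of the 95% CI

# The simulation samples 0.1% of the central patient population.
sample_size = round(central * 0.001)

print(f"{central / 1e6:.2f} million ({low / 1e6:.2f}-{high / 1e6:.2f} million)")
print(sample_size)
```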

Attack frequency

Level of confidence in the data: high

Multiple studies have documented the frequency of CH attacks. For episodic patients who typically experience annual bouts lasting a few weeks, ample data exist on the frequency and duration of the bouts.

  • Bout frequency (episodic patients): Data from 7 papers was aggregated (Ekbom 1970; Friedman and Mikropoulos 1958; Gaul et al. 2012; Kudrow 1980; Li et al. 2022; Manzoni et al. 1983; Sutherland and Eadie 1970), weighted proportionally to the sample size in each paper, and fitted to a discrete distribution.

  • Bout duration (episodic patients): Data from 8 papers was aggregated (Ekbom 1970; Friedman and Mikropoulos 1958; Gaul et al. 2012; Lance and Anthony 1971; Li et al. 2022; Manzoni et al. 1983; Rozen et al. 2001; Sutherland and Eadie 1970), weighted proportionally to the sample size in each paper, and fitted to a lognormal distribution (Footnote 1).

  • Active days (chronic patients): Since chronic patients have attacks all year long with short or no remission periods, the number of days with attacks in a given year was modelled as a lognormal distribution with parameters such that (a) no patients have an improbably low number of attack days per year (e.g., fewer than 10) while (b) allowing for the possibility of a very small fraction of patients having attacks every day or almost every day.

  • Attacks per day: Gaul et al. (2012) conveniently distinguish between chronic and episodic patients in their study of 209 German patients, most of whom likely have access to treatment since they were recruited in a specialised headache clinic. The mean (standard deviation) values reported are 3.1 (2.1) for episodic patients (during bouts) and 3.3 (3.0) for chronic patients. Access to treatment can reduce the frequency of attacks, but little data exists on the effect size. For our simulations, 3.26 (2.21) was used for untreated episodic patients and 3.46 (3.15) for untreated chronic patients. Given (a) observations that patients can have days with more than 10 attacks (Sewell et al. 2006) and even up to 20 attacks (Lademann et al. 2016), and (b) that the median number of daily attacks tends to be lower than the mean (Cho et al. 2019), daily attack frequency was modelled as a lognormal distribution (Footnote 2).
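One common way to turn a reported mean and standard deviation into lognormal parameters is moment matching, sketched below for the untreated episodic figures of 3.26 (2.21). Whether the simulations used this parametrisation is an assumption; the function names are illustrative.

```python
import math


def lognormal_params(mean, sd):
    """Moment-match a lognormal: return (mu, sigma) of the underlying normal."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def lognormal_mean_sd(mu, sigma):
    """Recover the mean and standard deviation implied by (mu, sigma)."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, sd


# Untreated episodic patients: mean 3.26, SD 2.21 attacks per day.
mu_epi, sigma_epi = lognormal_params(3.26, 2.21)

# The median of a lognormal is exp(mu), which lies below the mean.
median_epi = math.exp(mu_epi)
```

The resulting median is lower than the mean, which is consistent with observation (b) above, while the long right tail leaves room for days with 10 or more attacks.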

Attack duration

Level of confidence in the data: high

The duration of each attack has also been well documented (Russell 1981; Bahra et al. 2002; Snoer et al. 2019). Most attacks last between 15 min and 3 h, with 4–13% of patients reporting attacks longer than 3 h (Kim et al. 2023). Additionally, the onset and offset of each attack are quite sudden, reaching peak intensity within just 9 min on average and subsiding similarly quickly (Torelli and Manzoni 2003).

Less well documented is how the duration of an attack depends on factors such as the patient subtype (episodic or chronic), whether patients have access to treatment (either preventative or abortive), or the intensity of the attack. In our model, we start with typical durations for episodic patients without treatment (lognormal distribution with median: 54 min, std: 32 min, 95% CI: 20–144) and incorporate the following adjustments:

  • Attacks seem to be slightly longer for chronic patients (median: 70 min, std: 42 min) (Cho et al. 2019; Snoer et al. 2018).

  • Access to treatment can reduce the attack duration (median duration ca. 20 min shorter) (Gaul et al. 2012; see Footnote 3).

  • More painful attacks tend to be longer (scaling linearly as 0.1064 × intensity + 0.5797) (Hagedorn et al. 2019).
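The baseline duration figures quoted above (median 54 min, 95% interval 20–144 min, std 32 min) are mutually consistent under a lognormal distribution, as the sketch below checks. It assumes the interval is the central 95% range of the distribution. The last function simply evaluates the linear expression from Hagedorn et al. (2019); whether its value is a multiplier or a duration in hours is not stated here, so the code does not interpret it. All names are illustrative.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def lognormal_from_median_and_upper(median, upper_975):
    """Return (mu, sigma) from the median and the 97.5th percentile."""
    mu = math.log(median)
    sigma = math.log(upper_975 / median) / Z_975
    return mu, sigma


def lognormal_sd(mu, sigma):
    """Standard deviation of a lognormal with parameters (mu, sigma)."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    return mean * math.sqrt(math.exp(sigma ** 2) - 1.0)


def linear_duration_term(intensity):
    """Evaluate 0.1064 * intensity + 0.5797 (Hagedorn et al. 2019)."""
    return 0.1064 * intensity + 0.5797


# Untreated episodic baseline: median 54 min, upper bound 144 min.
mu, sigma = lognormal_from_median_and_upper(54.0, 144.0)
lower = math.exp(mu - Z_975 * sigma)  # implied 2.5th percentile, in minutes
sd = lognormal_sd(mu, sigma)          # implied standard deviation, in minutes
```

The implied lower bound and standard deviation come out close to the quoted 20 min and 32 min.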

Attack intensity

Level of confidence in the data: medium-low

We are interested in modelling the distribution of time spent at different pain intensities on the 0–10 scale, using gradation steps of 0.1. However, most studies rely on interviews or surveys asking patients about their attacks rather than on more accurate headache diary data. We found three studies asking patients to record attack intensity in a diary. The results are summarised in Table 1.

Table 1 Data from prospective studies of CH pain intensity.

To model the pain distribution for untreated attacks, we fitted a truncated (at 10/10) normal distribution with the combined data from Russell (1981) and Torelli and Manzoni (2003). The truncation introduces some artifacts into our calculations, which we address in a later section, but we limit the scale to 10 to align with established medical practice. For treated attacks, we fitted a truncated normal distribution to the Snoer et al. (2019) data. The three papers likely underestimate the frequency of mild attacks, which we account for in our sensitivity tests (below).

There is little data on the availability of treatments for cluster headache patients worldwide. Rossi et al. (2020) estimated that 47% of patients in the EU had full access to treatments, while 35% had limited access and 18% lacked access. Assuming that (a) these numbers are representative of developed countries worldwide, (b) ~15% of the world population lives in developed regions, ~25% in intermediate regions, and 60% in developing regions, (c) patients in intermediate regions have ~25% complete access, ~40% restricted access, and ~35% lacking access, while (d) patients in developing regions have ~10% complete access, ~30% restricted access, and ~60% lacking access, and (e) ~50% of patients with “restricted” access have access to treatment while ~15% of patients with “lacking” access have some access to treatment, we arrive at a global estimate of 43%. Given our uncertainty on these data, we performed a sensitivity analysis with a pessimistic estimate of 25% global treatment access and an optimistic estimate of 60%.
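The global estimate of 43% follows directly from assumptions (a) to (e) above, as this short sketch shows. It additionally assumes that all patients with full access are treated; the dictionary layout and names are illustrative.

```python
# Population share and access categories per region, from assumptions (a)-(d).
REGIONS = {
    "developed":    {"pop": 0.15, "full": 0.47, "restricted": 0.35, "lacking": 0.18},
    "intermediate": {"pop": 0.25, "full": 0.25, "restricted": 0.40, "lacking": 0.35},
    "developing":   {"pop": 0.60, "full": 0.10, "restricted": 0.30, "lacking": 0.60},
}

# Fraction of patients in each category who actually receive treatment.
# "full" = 1.0 is an assumption of this sketch; the rest is assumption (e).
TREATED_FRACTION = {"full": 1.0, "restricted": 0.50, "lacking": 0.15}


def regional_access(region):
    """Fraction of patients with access to treatment within one region."""
    return sum(region[cat] * TREATED_FRACTION[cat] for cat in TREATED_FRACTION)


global_access = sum(r["pop"] * regional_access(r) for r in REGIONS.values())
print(f"{global_access:.1%}")
```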

Finally, to model the time profile of each single attack, we assume that the onset and offset each last 15% of the total attack time, and the remaining 70% is spent at the maximum pain intensity (based on Torelli and Manzoni 2003). While this simplification ignores fluctuations in pain, we believe it is a good approximation given observations by Nesbitt and Goadsby (2012; see Footnote 4) and Ekbom (1975; see Footnote 5).
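The 15%/70%/15% attack profile can be sketched as a trapezoid. The linear shape of the onset and offset ramps is an assumption made here for illustration; the text specifies only how long each phase lasts. Function names are likewise illustrative.

```python
ONSET = 0.15    # fraction of the attack spent ramping up
PLATEAU = 0.70  # fraction spent at peak intensity
OFFSET = 0.15   # fraction spent ramping down


def intensity_at(t_frac, peak):
    """Pain intensity at fraction t_frac (0-1) of the attack, linear ramps."""
    if t_frac < ONSET:
        return peak * t_frac / ONSET
    if t_frac <= ONSET + PLATEAU:
        return peak
    return peak * (1.0 - t_frac) / OFFSET


def minutes_at_or_above(threshold, peak, duration_min):
    """Minutes of one attack spent at or above a given intensity threshold."""
    if peak <= 0 or threshold > peak:
        return 0.0
    # Each linear ramp spends a fraction (1 - threshold/peak) above the threshold.
    ramp_fraction = (ONSET + OFFSET) * (1.0 - threshold / peak)
    return (PLATEAU + ramp_fraction) * duration_min


# Example: a 60-minute attack peaking at 10/10.
extreme_minutes = minutes_at_or_above(9, 10, 60)
```

Under these assumptions, most of an attack that peaks at 10/10 is spent at ≥9/10, because the plateau dominates the profile.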

To incorporate the HTV hypothesis into the calculations, we need a principled way to allow more intense pain to be weighed more heavily relative to mild pain. One way to do so is to map the linear pain scale non-linearly to a corresponding weight (say, between 0 and 1). Figure 1 illustrates three possible ways to map the 0–10 scale to a 0–1 weight: using linear weights (~x), using power weights (~x^p), and using exponential weights (~b^(ax)). We note that mapping the pain scale linearly to a 0–1 weight has been proposed by Stovner et al. (2007) to approximate the DALY disability weight of different headache conditions, but that this approach would not capture the HTV hypothesis.

Fig. 1: Examples of transformations for the intensity weight.

Linear (~x), power (~x^p, here with p = 2) and exponential (~b^(ax), here with b = e, a = 1) weight transformations for the 0–10 pain scale.
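The three transformations can be sketched as follows. The normalisation is an assumption of this sketch: each function is scaled so that 0/10 maps to a weight of 0 and 10/10 maps to a weight of 1, and the exponential form in particular could be normalised in other ways. Function names are illustrative.

```python
import math


def linear_weight(x):
    """Linear weight: proportional to x on the 0-10 scale."""
    return x / 10.0


def power_weight(x, p=2.0):
    """Power weight: proportional to x**p, scaled so that w(10) = 1."""
    return (x / 10.0) ** p


def exponential_weight(x, a=1.0, b=math.e):
    """Exponential weight: proportional to b**(a*x), shifted and scaled so
    that w(0) = 0 and w(10) = 1 (an assumed normalisation)."""
    return (b ** (a * x) - 1.0) / (b ** (a * 10.0) - 1.0)


for x in (5, 8, 9, 10):
    print(x, linear_weight(x), power_weight(x), exponential_weight(x))
```

With these choices, moderate pain receives progressively less weight as one moves from the linear to the power to the exponential form, concentrating the weight at the top of the scale.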

Sensitivity analysis

Given the large number of variables feeding into our model and the uncertainty from the lack of published data, we carried out an additional sensitivity analysis for four key variables determining the total burden, using low, median, and high estimates for each: (i) global prevalence (26, 53, and 95 per 100,000), (ii) fraction of chronic patients (15%, 20%, 25%), (iii) fraction of patients with access to treatment (25%, 43%, and 60%), and (iv) mean attack intensity (20%, 10%, and 0% lower than reported in the three intensity studies above).

Research investment in cluster headache in the UK

As an illustrative case study, we assessed the level of investment in CH research in the UK, aggregating data from the following sources:

  1. The “Gateway to Research” database, published by UK Research and Innovation (UKRI), lists publicly funded research projects.

  2. The “Funding and Awards” database of the National Institute for Health and Care Research (NIHR).

  3. The grants database of the “Wellcome Trust”, the UK’s largest non-governmental source of scientific research funding.

  4. The NIH database of clinical trials (UK trials only).

For each, the search terms “cluster headache(s)” and “multiple sclerosis” were used, including results for the earliest dates available.

As additional supporting evidence, we gathered data on the 2023 income and expenditure of the largest nonprofit organisations in the UK focused either on CH or multiple sclerosis, using financial figures reported to the UK Charity Commission.

Results

Burden of cluster headache pain

Figure 2 summarises our main results using the median estimates defined in the sensitivity analysis section. As expected, patients of the chronic subtype without access to treatment spend the most time in CH pain on average: 570 h per year, compared to 96 h for the average episodic patient with access to treatment (Fig. 2a). However, since the episodic subtype is ~4× more common than the chronic subtype, the total global burden is dominated by the episodic untreated group (Fig. 2b and c). We estimate that adult CH patients worldwide spend 70,666 person-years per year in pain at any pain intensity, 35,489 at ≥7/10 intensity, and 8569 at ≥9/10 intensity (Fig. 2d). Using the terminology introduced by Leighton (2023), the figure of 35,489 person-years would correspond to the Years Lived with Severe Suffering (YLSS) for CH. The Days Lived with Extreme Suffering (DLES) would then be 8569 years × 365 days/year ≈ 3,127,855.

Fig. 2: Global burden of cluster headache pain.

a Average hours per year that a patient in each of the four groups spends experiencing cluster headache pain, at different intensities on the 0–10 pain scale (in steps of 0.1). b Total number of person-years of cluster headache pain experienced annually worldwide by all patients in the four groups, at different intensities on the 0–10 pain scale. c Total amount of cluster headache time (person-years) spent annually worldwide at any pain intensity, per patient group. d Aggregation of all time (person-years) spent in cluster headache pain worldwide by all patients at: any intensity, ≥7/10 intensity, and ≥9/10 intensity.

Translated to the UK population in 2023 (~67 million inhabitants, of which ~80% adults, Gov.UK 2023), we arrive at 661 person-years at any pain intensity, 332 person-years at ≥7/10 pain, and 80 person-years at ≥9/10 pain (29,200 DLES). Under the simplified assumption that an attack lasts 1 h on average, this means that 8569 adults worldwide and 80 adults in the UK are experiencing agonising (≥9/10) CH pain at any given point in time.
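The UK figures above can be reproduced by scaling the global estimates by the UK share of the world adult population. This sketch assumes the same per-adult burden in the UK as globally and uses the world adult population quoted in the Methods; names are illustrative.

```python
ADULTS_WORLDWIDE = 5_728_759_000  # adults (18+) as of 1 Jan 2024 (UN 2024)
UK_ADULTS = 67_000_000 * 0.80     # ~67 million inhabitants, ~80% adults

# Global person-years of CH pain per year, by intensity threshold.
GLOBAL_PERSON_YEARS = {"any": 70_666, ">=7": 35_489, ">=9": 8_569}

uk_share = UK_ADULTS / ADULTS_WORLDWIDE
uk_person_years = {
    level: round(value * uk_share)
    for level, value in GLOBAL_PERSON_YEARS.items()
}

# Days Lived with Extreme Suffering (DLES) from the rounded >=9/10 figure.
uk_dles = uk_person_years[">=9"] * 365

print(uk_person_years, uk_dles)
```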

Table 2 shows the results of our sensitivity analysis. The burden calculations are particularly sensitive to the prevalence figure used and the exact distribution of pain intensity across attacks. With the most conservative choice of all four parameters, the global burden is as low as 689,693 DLES. In the most liberal case, the burden is as high as 11,037,539 DLES.

Table 2 Sensitivity analysis for the DLES burden as a function of four key variables, relative to the median burden of 3,127,854 DLES.

We can now compare the burden of pain from CH with that of MS (we will say more on the overall life burden). While modelling the precise global distribution of pain for MS patients is beyond the scope of this paper, we can arrive at a reasonable estimate with a few parameters from the literature. The global prevalence of MS is estimated at 37 per 100,000 (Atlas of MS 2024). The average time spent in pain is unclear. Up to 85% of MS patients fall under the category of Relapsing-Remitting MS (Hill et al. 2021), meaning they have extended periods of partial or complete recovery, but a survey of 1672 MS patients revealed that 43% experienced pain at the time of evaluation (Solaro et al. 2004). Another survey revealed that 40% of patients with pain report having constant pain (Warnell 1991). In terms of pain intensity, a survey by Svendsen et al. (2003) reported that, of MS patients experiencing pain, 61% (n = 282) rated it as “mild”, 35% (n = 162) rated it as “moderate”, and 4% (n = 19) rated it as “severe”. When asked to describe pain intensity “at its least”, the median response was 20 mm (IQR: 6–38) on the visual analogue scale, and the pain “at its worst” received a median score of 68 mm (IQR: 46–85). As with CH, we expect survey responses to overestimate the actual intensity of each pain episode.

Figure 3a shows an illustrative comparison of the global burden of pain of MS and CH. It makes the simplifying assumption that MS patients spend 25% of their time in pain, the bulk of which is spent in mild pain, with a tail extending to pain values close to 10/10, especially to account for the ~3% of MS patients who develop trigeminal neuralgia (Houshi et al. 2022), which can be particularly painful. Figure 3b shows the intensity-adjusted pain burden for both conditions, assuming that more intense pain is weighted more heavily than mild pain (in this example, using a power transformation proportional to x², i.e., giving 10/10 pain a weight of 1.0, 9/10 pain a weight of 0.81, 8/10 pain a weight of 0.64, etc.). Figure 3c shows the weight transformation for which the intensity-adjusted global burden of pain from CH exceeds that of MS (close to x² for this choice of distribution parameters). Such a transformation is very modest, given the extreme descriptions of the relative intensity of CH pain compared to other very painful experiences.

Fig. 3: Comparison of burden of CH pain and MS pain.

a Global burden (total number of annual person-years) for cluster headache and multiple sclerosis, assuming a skew-normal distribution for MS pain with median = 3.5, mean = 2.0, and SD = 1.8. b Intensity-adjusted global burden assuming the pain is weighted by a power function (proportional to x²). c Intensity weight transformation for which the total global pain burden of CH is larger than that of MS (approximately x²).
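The intensity weighting behind Fig. 3b and c can be illustrated with a short sketch. The two distributions below are hypothetical and chosen only to show the mechanism; they are not the CH or MS data, and the function names are introduced here for illustration. The weight is assumed to be (x/10) raised to a power, so that 10/10 pain has weight 1.

```python
def intensity_weight(x, power=2.0, scale_max=10.0):
    """Weight for pain level x, normalised so that scale_max has weight 1."""
    return (x / scale_max) ** power


def weighted_burden(person_years_by_intensity, power=2.0):
    """Sum person-years at each pain level, weighted by intensity."""
    return sum(py * intensity_weight(x, power)
               for x, py in person_years_by_intensity.items())


def crossover_power(mild, severe, lo=0.0, hi=5.0, tol=1e-6):
    """Bisect for the exponent at which `severe` starts to outweigh `mild`."""
    def gap(p):
        return weighted_burden(severe, p) - weighted_burden(mild, p)

    if gap(lo) > 0 or gap(hi) < 0:
        return None  # no sign change in [lo, hi]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# HYPOTHETICAL person-years by pain level (not the paper's data).
mostly_mild = {2: 900, 5: 90, 9: 10}      # large total, low intensity
mostly_severe = {7: 150, 9: 100, 10: 50}  # small total, high intensity

print([round(intensity_weight(x), 2) for x in (10, 9, 8)])  # [1.0, 0.81, 0.64]
print(weighted_burden(mostly_mild, power=0), weighted_burden(mostly_severe, power=0))
print(weighted_burden(mostly_mild), weighted_burden(mostly_severe))
print(crossover_power(mostly_mild, mostly_severe))
```

With an exponent of 0 the comparison reduces to total time in pain, so the mostly mild condition dominates. As the exponent grows, the ordering flips, and the bisection locates the exponent at which that happens for these illustrative inputs.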

The prevalence of MS in the UK is roughly 5.3× higher than the global average: 199 per 100,000 (Atlas of MS 2024). To our knowledge, no data exists regarding the relative prevalence of CH in the UK compared to the global average. Some data weakly suggests that prevalence might be higher in more northern countries (Fischera et al. 2008), but even if this were not the case, the HTV hypothesis would still suggest increasing priority for CH, given the regular occurrence of ≥9/10 pain among patients.

As noted earlier, though, headache conditions do not have a YLL component in DALY calculations, unlike MS, where YLLs contribute 50% of the total DALY value globally (489k YLLs and 484k YLDs in 2021) (Global Burden of Disease Collaborative Network 2024).

Finally, one might debate whether aggregating different pain intensities into a single metric—however heavily one decides to weigh extreme pain—is necessary or philosophically sound. Indeed, under minimalist axiologies or lexical ethical views (Ajantaival 2024), one might decide to focus one’s efforts on reducing the most extreme forms of suffering (say, involving pain at a level of 9/10 or higher) regardless of whether there also exist other large amounts of milder suffering. Under this lens, relieving cluster headache suffering becomes a top priority.

Ceiling effects

At least three major challenges exist when trying to quantify the distribution of pain intensity of any condition:

1. Patients with different pain conditions might interpret the scale differently (such as migraine patients who have never experienced CH pain). For example, to emphasise the severity of their pain relative to other conditions or previous attacks, CH patients sometimes report numbers beyond 10.

2. The distribution of answers to the question “How intense is pain from CH?” will differ from the distribution of time actually spent at different pain intensities (usually captured in headache diaries). Data on the former is easier to find.

3. Individual patients themselves might have difficulties assessing the severity of each attack, for instance, due to recall bias or because what they thought was a 10/10 gets surpassed in future, even more severe attacks.

In each case, there is a ceiling effect that distorts the true underlying distributions. A useful analogy would be asking university students to take an elementary school test and getting an average score of 97/100. That average should not be interpreted as “slightly below 100” but rather as “much higher than what the test can measure, plus some room for minor mistakes”.

Ceiling effects predict that many values will cluster at the scale maximum, as in the Burish et al. (2021) survey (and to some extent in the prospective reports from Table 1).

While a full treatment of this problem goes beyond the scope of this paper, we can hint at potential approaches. For instance, one could assume an underlying normal distribution that extends beyond 10/10. Then, using statistical techniques (such as the tobit model), one could try to reconstruct the underlying distribution using the measured data (Fig. 4).

Fig. 4: Illustration of the ceiling effect when measuring pain intensity.

The blue histogram represents responses by patients (either all-things-considered or per attack), with a clustering of responses at the ceiling of 10/10. The orange curve represents the theoretical underlying distribution of pain intensity. The green curve is an attempt to reconstruct the theoretical curve using the measured data.
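One way to make the reconstruction idea concrete is an intercept-only tobit fit, i.e. maximum-likelihood estimation of a normal distribution that is right-censored at 10. The sketch below is an illustration under stated assumptions, not the procedure used to produce Fig. 4: the data are synthetic, the latent parameters (mean 9.0, SD 1.5) are hypothetical, and the helper names and the simple compass-search optimiser are choices made here to keep the example dependency-free.

```python
import math
from statistics import NormalDist, mean, stdev

CEILING = 10.0  # top of the 0-10 pain scale


def neg_log_likelihood(mu, sigma, observed, n_censored, ceiling=CEILING):
    """Negative log-likelihood of a normal model right-censored at `ceiling`."""
    log_norm = -math.log(sigma) - 0.5 * math.log(2 * math.pi)
    ll = sum(log_norm - 0.5 * ((v - mu) / sigma) ** 2 for v in observed)
    # Probability mass above the ceiling, via erfc for numerical stability.
    tail = 0.5 * math.erfc((ceiling - mu) / (sigma * math.sqrt(2)))
    ll += n_censored * math.log(max(tail, 1e-300))
    return -ll


def fit_censored_normal(reports, ceiling=CEILING):
    """Estimate (mu, sigma) of the latent distribution by compass search."""
    observed = [v for v in reports if v < ceiling]
    n_censored = len(reports) - len(observed)
    mu, sigma = mean(reports), stdev(reports)  # naive starting point
    best = neg_log_likelihood(mu, sigma, observed, n_censored, ceiling)
    step = 1.0
    while step > 1e-4:
        improved = False
        for d_mu, d_sigma in ((step, 0), (-step, 0), (0, step), (0, -step)):
            cand_mu, cand_sigma = mu + d_mu, sigma + d_sigma
            if cand_sigma <= 0:
                continue
            value = neg_log_likelihood(cand_mu, cand_sigma, observed,
                                       n_censored, ceiling)
            if value < best:
                mu, sigma, best, improved = cand_mu, cand_sigma, value, True
        if not improved:
            step /= 2  # refine the search once no neighbour is better
    return mu, sigma


# SYNTHETIC latent pain scores (hypothetical parameters, not patient data).
true_dist = NormalDist(mu=9.0, sigma=1.5)
n = 1000
latent = [true_dist.inv_cdf((i + 0.5) / n) for i in range(n)]
reports = [min(v, CEILING) for v in latent]  # answers above 10 are capped at 10

naive_mean = mean(reports)
mu_hat, sigma_hat = fit_censored_normal(reports)
print(round(naive_mean, 2), round(mu_hat, 2), round(sigma_hat, 2))
```

On such data the naive mean of the capped reports falls below the latent mean, whereas the censored fit should recover values close to the hypothetical parameters. A dedicated statistics package would normally replace the hand-rolled optimiser.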

Incorporating ceiling effects would allow for fairer comparisons across conditions and a more accurate calculation of pain burden for extremely painful conditions. Indeed, we expect that our intensity estimates (based on a truncated normal distribution) underestimate the total burden, and it is precisely the behaviour at the far end of the scale that we argue carries the most weight. Accounting for this would strengthen the overall case for prioritising CH even further.

Research investment

Table 3 summarises the results of our search for UK grants and clinical trials on CH and MS. Support for CH research has been vanishingly small in the past 30 years. The earliest clinical trials on CH in the UK reported by the NIH began in 2011.

Table 3 Research investment in the UK for cluster headache and multiple sclerosis.

The disparity in charitable funding between MS and CH is also notable (Table 4). The difference in financial resources between the main charities focussing on MS and CH (MS Society and OUCH UK, respectively) is a thousandfold. Even the combined resources of other major charitable organisations working on headache and migraine (which may dedicate a fraction of their efforts to CH) fall short of those working on MS.

Table 4 2023 income and expenditure of some of the largest UK charities working either on multiple sclerosis or cluster headache (or headache more generally).

The underinvestment in CH is also part of a well-documented pattern of underinvestment in headache conditions more generally (Orr and Shapiro 2022; Shapiro and Goadsby 2007). Olesen et al. (2007) estimated that, relative to its economic cost in Europe, migraine receives the least amount of public funding of all brain disorders. Similarly, Schwedt and Shapiro (2009) concluded that NIH funding for headache disorders in 2007 was approximately ten to sixty times smaller than what it should have been if funding had been allocated proportionally to DALY burden ($103–634 million as opposed to $13 million). In 2019, the gap continued to be significant: $28 million spent vs $360 million estimated (Orr and Shapiro 2022). Our database search confirmed that migraine research has received significantly less support in the UK than MS: 36 migraine-related projects have received £18m from UKRI since 2006; 25 projects (5 active) have received £14m from NIHR since 1998; 20 projects mentioning migraine have received £5m from the Wellcome Trust since 2006; and 75 clinical trials on migraine have been reported by the NIH since 2003.

Without a doubt, research investment into potentially the most painful condition known to medicine is disproportionately low, given the burden it causes.

Cost estimates

Thompson et al. (2017) estimated the annual costs of all healthcare and related resources to treat MS in the UK at £11,400 for patients with mild MS (18% of patients), £22,700 for moderate MS (51% of patients), and £36,500 for severe MS (31% of patients), resulting in a weighted average of £24,800 per patient per year.

To our knowledge, analogous costs have not been estimated for CH in the UK, but a study of Italian CH patients by Negro et al. (2020) estimated the direct and indirect treatment costs related to chronic CH at €13,350 (£11,866) annually and €2487 (£2210) per bout of episodic CH (each bout lasting 6.8 ± 5.1 weeks). Direct costs accounted for 72% of the total vs 28% for indirect costs. The largest contributors to direct costs were acute medications, of which oxygen and sumatriptan injections were the most common (used by 83% and 68% of patients, respectively): €8314 ± €6822 (£7390 ± £6064) for chronic patients and €1387 ± €1938 (£1233 ± £1723) for episodic patients. Costs for preventative medication (primarily verapamil) were negligible at €102 ± €138 (£91 ± £123).

In the UK, a single 6 mg sumatriptan injection (which can abort one attack for patients who respond to it) costs £25 (NHS 2024). Oxygen costs are not readily available in the UK, but O’Brien et al. (2017) estimated that the annual cost of high-flow oxygen was less than $1000 for episodic CH patients and less than $5000 for chronic CH patients in most US states (ca. £1205 and £5130 today respectively, adjusting for inflation). Our simulations show that chronic patients get an average of approximately 475 attacks per year, compared to 180 attacks per year for episodic patients. So, providing universal access to sumatriptan and oxygen would cost approximately £17,005 per chronic patient and £5705 per episodic patient per year (a weighted average of £7965 per patient, assuming 20% chronic and 80% episodic). We estimate that such an investment would reduce the annual DLES by about 46%.
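The per-patient cost figures above can be reproduced with a few lines of arithmetic. The sketch assumes one sumatriptan injection per attack plus a flat annual oxygen cost, which is inferred here from the quoted totals rather than stated as the exact costing method; the function name `annual_cost` is hypothetical.

```python
SUMATRIPTAN_COST_GBP = 25  # per 6 mg injection (NHS 2024, as quoted in the text)


def annual_cost(attacks_per_year, oxygen_cost_gbp):
    """Annual cost assuming one injection per attack plus a flat oxygen cost."""
    return attacks_per_year * SUMATRIPTAN_COST_GBP + oxygen_cost_gbp


chronic = annual_cost(attacks_per_year=475, oxygen_cost_gbp=5130)
episodic = annual_cost(attacks_per_year=180, oxygen_cost_gbp=1205)

# Weighted average with 20% chronic and 80% episodic patients.
weighted = (20 * chronic + 80 * episodic) / 100

print(chronic, episodic, weighted)  # 17005 5705 7965.0
```

Changing the attack counts, the unit price, or the chronic/episodic split in this sketch shows how sensitive the weighted average is to each input.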

However, as mentioned earlier, many patients do not respond to either sumatriptan or oxygen. Additionally, continued use of sumatriptan can even cause an increase in the frequency and pain intensity of attacks in some patients (Clusterbusters.org 2023), and a maximum of only two 6 mg injections per day is recommended (Leone and Proietti Cecchini 2016). This underscores the urgent need to (i) allow clinicians to prescribe emerging treatments such as indoleamines (Schindler et al. 2015), and (ii) significantly increase funding for research into such emerging treatments, which are likely to be even less costly per attack aborted than existing treatments.

While such cost-effectiveness estimates are a key element of any modern wellbeing economy, it bears noting that society has already established a clear precedent for preventing pain of this magnitude: we already provide anaesthesia for major surgery as well as post-operative acute pain care universally. We argue that the ethical imperative for treating CH warrants similar consideration, especially since the low prevalence of the condition means that the overall financial investment would be modest in the context of overall healthcare expenditure.

Discussion

Our case study of CH not only confirms the underinvestment in headache disorders discussed in the literature, but also reveals even more pronounced inequalities when considering the severity of the suffering implied by the HTV hypothesis and the relative investment into similarly prevalent conditions. Any economy claiming to care for the wellbeing of its citizens must prioritise allocating enough resources to lift those who suffer most, a core value shared across most ethical frameworks, both religious and secular. It also must not be biased by “disease prestige” (Stone 2018), which may be a contributing factor behind the MS/CH inequality we have quantified. This market failure presents any wellbeing-focused economy with an exceptionally cost-effective opportunity: a significant amount of extreme pain could be eliminated with a relatively low investment.

We estimate that global underinvestment in CH is equally or more severe than revealed in our analysis of UK investments. Mental health conditions typically receive far less attention and resourcing in lower-income countries than physical conditions with more visible symptoms, which would typically translate into less attention for conditions such as CH relative to MS. For instance, the Mental Health Atlas 2020 (WHO 2021) identifies 1.1% of general government health expenditure on mental health in low-income countries compared to 3.8% in high-income countries. This discrepancy, exacerbated in lower-income countries, receives very critical discussion in a Lancet Commission report (Patel et al. 2018). The Neurology Atlas (WHO 2017) identifies discrepancies in the neurological workforce scale that we estimate would foster underdiagnosis and reduced awareness of CH as distinct from other headache conditions: 0.1 workforce per 100,000 in low-income countries compared to 7.1 in high-income countries. Even if prevalence might be higher in the UK than globally, discrepancies in treatment access are such that the global burden is likely to remain high. Finally, the vast majority of countries worldwide do not have an organised group such as the UK’s OUCH UK to raise awareness and advocate for CH. Possible trends in the reverse direction include the possibility that more privatised healthcare systems are more responsive to patient-reported pain, and such systems are more common outside the UK, but this is unlikely to outweigh overall resourcing discrepancies in low- and middle-income countries. It is also noteworthy that while global financial investment cannot be much lower (given the low basis in the UK), research and clinical trials activity can be. On balance, global underinvestment seems likely to be at least as severe as in the UK.

Our findings point to a few recommendations to develop a more equitable wellbeing economy. Short term, we suggest (i) providing universal access to existing treatments to all CH patients; (ii) increasing research funding for CH, commensurate with the enormous burden it inflicts; (iii) fast-tracking research into promising emerging treatments, especially for indoleamines; (iv) supporting programmes to improve diagnosis rates among medical professionals; and (v) supporting the few advocacy groups working to raise awareness for CH and other extremely painful conditions.

Longer term and more systemically, we suggest (i) developing new health metrics (or extending existing ones) to account for the HTV hypothesis; (ii) increasing the incentives for research into extreme pain; and (iii) reforming how healthcare resources are allocated to explicitly consider the burden of extreme pain. Inspiration can be drawn from analogous debates in palliative care, where it has been acknowledged that the QALY alone does not capture the full complexity of end-of-life care, leading to proposals for new frameworks that complement existing metrics (Wichmann et al. 2020).

In the context of health metrics, our study’s focus on extreme pain points towards measurement techniques with 100 or 1000 gradations rather than 10, and towards vivid examples that invoke extreme experiences rather than the bland language that can be common in surveys. Importantly, policy conclusions around healthcare investments should be tested for robustness to non-linearities or kinked extremes in the mapping between pain severity and reported pain, particularly on 0–10 scales. These modest proposals could also be supplemented with an ambitious research programme to test and quantify such possible non-linearities, both in the capacity of human experience and in specific conditions. For instance, such research might develop TTO comparisons between conditions where respondents have typical lived experience of all conditions being compared, with appropriate adjustments for recency/saliency and weights to adjust for degree of comparative knowledge about the conditions involved (building on the work of Burish et al. (2021) and others, including the literature on paired comparisons and discrete choice experiments).

We offer a few more recommendations for future research: (i) clarifying foundational questions on the nature of the distribution of subjective pain and pleasure at the extremes; (ii) estimating the prevalence of CH more precisely in countries where little data exists; (iii) mapping the burden of extreme pain from other conditions such as terminal cancer, trigeminal neuralgia, or kidney stones; (iv) running large-scale RCTs to evaluate the effectiveness of indoleamines to treat cluster headache (and possibly other similar conditions); and (v) investigating other major sources of discrepancy between disease burden and funding allocation.

This study has several limitations. The existing data on prevalence and access to treatment is sparse (especially in low-income countries and for minors), and our burden estimates are quite sensitive to both. The error bars could be reduced as more epidemiological data becomes available. Similarly, additional headache diary data would be valuable to better estimate the distribution of attack intensity across patients. Finally, more work on the non-linear properties of pain perception (especially at the top of the scale) would also allow for more accurate comparisons across conditions and better estimates of their relative burden.

Conclusions

Cultural biases and measurement limitations appear to have contributed to systematic underinvestment in cluster headache relief in recent decades. Despite being widely recognised as one of the most painful conditions in medicine, with a similar prevalence to multiple sclerosis, it receives orders of magnitude less research funding in the UK (and very likely globally). The resulting health inequality is further exacerbated when considering the HTV hypothesis, which suggests that the most extreme experiences might be orders of magnitude more intense than milder ones, requiring a recalibration of priorities in the wellbeing economy.

Our Monte Carlo simulations show that the global burden of extreme pain (rated as ≥9/10 in intensity) from CH in adults is about 3.1 million DLES per year, noting that the figure is sensitive to the parameters used. However, even the most conservative estimate is concerning, given the severity of the suffering. Additionally, many patients still experience difficulties in terms of diagnosis and access to treatment, even in countries with well-developed healthcare systems like the UK. The fact that emergent low-cost treatments show very promising efficacy but remain illegal and understudied is a tragic market failure.

While many challenges remain, we argue that modest marginal improvements could have a disproportionate impact on reducing the burden of extreme pain. Moving toward a true wellbeing economy involves confronting uncomfortable questions about the distribution, aetiology, and epistemology of human suffering. Just as we now recognise surgical anaesthesia as essential, providing universal access to treatment for CH (and other extremely painful conditions) warrants high priority in the wellbeing economy.