Introduction

The number of people worldwide with two or more long-term health conditions, known as multimorbidity, is rising1. In the UK, the proportion of individuals aged 65 years or older with multimorbidity is estimated to increase from 54% to 68% by 20352. In 2020, the UK Chief Medical Officers described multimorbidity as one of the greatest challenges facing the healthcare profession3. They emphasised that clinical training and practice need to shift towards focusing on multiple, rather than individual, diseases and that a better understanding of the impact of disease clusters is necessary to improve patient outcomes3.

Multimorbidity is heterogeneous and varies substantially between individuals. Understanding which diseases commonly cluster together and how these clusters affect mortality risk is essential for identifying high-risk groups and allocating healthcare resources effectively3,4,5. Multimorbidity is associated with a greater risk of mortality; however, previous studies have typically conceptualised multimorbidity only by its presence or absence, or by number of conditions, without further exploration of different clusters among those with multimorbidity6. Studies that have investigated clusters of multimorbid conditions have had small sample sizes or focused on a narrow selection of health conditions with low prevalence7,8,9,10,11,12,13. A study of 113,000 primary care patient records found differential associations between clusters of multimorbidity and mortality at various ages, and there is a need to follow up these findings in a population-based setting with detailed and standardised data collection14. Further, whilst multimorbidity has been associated with vascular and cancer mortality, these outcomes contain distinct subtypes, including diseases that are largely sex-specific, such as breast and prostate cancer15. Focusing on more granular causes of mortality will help identify whether there are differential associations between multimorbidity and pathologically distinct causes of mortality and provide insights into long-term prognosis.

In the current study, we addressed these knowledge gaps using a cohort of 500,000 participants with 44,000 deaths occurring over 16 years to investigate whether the number of multimorbid conditions, and sex- and age-specific disease clusters, were associated with all-cause and cause-specific mortality.

Methods

Population

Subjects were participants in UK Biobank (UKB), a population-based cohort study which recruited half a million women and men aged 40–69 years from England, Scotland and Wales between 2006 and 201016. At recruitment, all participants attended a baseline assessment centre where they provided sociodemographic, lifestyle and medical history information through a touchscreen questionnaire and nurse-led verbal interview, and underwent physical examinations. All participants provided electronic signed consent at the assessment centre. UK Biobank received ethical approval from the National Health Service North West Centre for Research Ethics Committee (Ref: 11/NW/0382).

Multimorbidity

Participants self-reported health conditions during the baseline verbal interview, in which the nurse asked participants which serious illnesses and disabilities they had been diagnosed with by a doctor. To ensure standardised recording, nurses entered answers electronically and were guided by a tree-based structure based on the International Classification of Diseases, Tenth Revision (ICD-10) coding system. Health conditions included in the multimorbidity definition were based on a list of diseases developed by Jani et al. to ascertain multimorbidity using the UK Biobank cohort (see Table S1 for list of conditions)15. Jani et al.’s selection of conditions was informed by Barnett et al.’s definition of multimorbidity, which was developed to ascertain the distribution and prevalence of multimorbidity in a UK-based population in 200717. This date aligns with the period in which health conditions were assessed in UKB (2006 to 2010). Multimorbidity presence was defined as having two or more of the 43 included health conditions, with multimorbidity absence defined as having zero or one condition. The number of multimorbid conditions was derived with the following categories: ‘0–1’, ‘2’, ‘3’, ‘4’, and ‘5+’. Disease clusters were identified in participants with multimorbidity using latent class analysis, described in detail in the ‘statistical analysis’ section. The reference group for all cluster analyses was participants without multimorbidity (0–1 health conditions).
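The exposure definitions above can be illustrated with a minimal sketch. It assumes each participant has one 0/1 flag per included condition; the function names and the example participant are hypothetical and are not taken from the study's code.

```python
def count_category(n_conditions: int) -> str:
    """Map a count of health conditions to the categories used in the text."""
    if n_conditions < 0:
        raise ValueError("number of conditions cannot be negative")
    if n_conditions <= 1:
        return "0–1"
    if n_conditions >= 5:
        return "5+"
    return str(n_conditions)


def has_multimorbidity(n_conditions: int) -> bool:
    """Multimorbidity presence: two or more of the included conditions."""
    return n_conditions >= 2


# Hypothetical participant: 1 = condition reported, 0 = not reported.
flags = {"hypertension": 1, "asthma": 1, "depression": 0}
n = sum(flags.values())
print(count_category(n), has_multimorbidity(n))
```

Participants in the ‘0–1’ category form the reference group in both the count-based and the cluster-based analyses.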

Mortality

Mortality was ascertained through linkage to electronic death registry records. NHS England provided death data for England and Wales, and the NHS Central Register and the National Records of Scotland provided death data for Scotland. The records include date of death from April 2006 to 30th November 2022 and the underlying cause of death classified using ICD-10 codes. Outcomes included all-cause mortality, cause-specific deaths based on categorisations used by the Office for National Statistics (ONS) to produce mortality statistic reports18, and broader groupings of ‘cancer mortality’ (ICD-10 chapter C), ‘vascular mortality’ (ICD-10 chapter I), and ‘other-cause mortality’ (non-cancer and non-vascular deaths).
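As a rough illustration of the three broad outcome groups, the sketch below classifies an underlying cause of death by the first letter of its ICD-10 code. Using the leading letter as the rule is an assumption made for this example, and the function name is hypothetical; the finer ONS categorisation is not reproduced here.

```python
def mortality_group(underlying_cause: str) -> str:
    """Assign an ICD-10 underlying cause of death to a broad outcome group.

    Assumption for illustration: codes beginning with 'C' count as cancer
    mortality, codes beginning with 'I' as vascular mortality, and all
    remaining codes as other-cause mortality.
    """
    code = underlying_cause.strip().upper()
    if not code:
        raise ValueError("empty ICD-10 code")
    if code.startswith("C"):
        return "cancer"
    if code.startswith("I"):
        return "vascular"
    return "other-cause"


for example_code in ("C34.1", "I21.9", "J44.9"):
    print(example_code, mortality_group(example_code))
```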

Covariates

UKB collected information on age and sex at recruitment. Townsend deprivation score was used as an indicator of socioeconomic status and was assigned to each participant corresponding to their residential postcode at recruitment19. Participants self-reported ethnicity, educational qualifications, smoking status, alcohol intake and physical activity through the touchscreen questionnaire. Standard alcohol units (alcohol by volume equivalents) were derived based on the number of typical volume drinks for each type of alcohol consumed per week. Physical activity was assessed using questions adapted for the touchscreen questionnaire from the validated short International Physical Activity Questionnaire20. Time spent in vigorous, moderate and walking activity was weighted by the energy expended for these categories of activity, to produce total metabolic equivalent task minutes per week. Body mass index (BMI) was derived from weight (kg), measured using scales, and standing height (metres), both measured during the physical examination.
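The two derived covariates can be sketched as follows. The MET weights (3.3 for walking, 4.0 for moderate and 8.0 for vigorous activity) are the standard short-form IPAQ scoring values, and the BMI cut-points (18.5, 25 and 30 kg/m²) are the usual WHO thresholds; neither is stated in the text, so both are assumptions here, as are the function names and the example values.

```python
# Assumed IPAQ short-form MET weights (not stated in the text).
MET_WEIGHTS = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}


def met_minutes_per_week(minutes_per_day: dict, days_per_week: dict) -> float:
    """Total MET-minutes per week across walking, moderate and vigorous activity."""
    total = 0.0
    for activity, weight in MET_WEIGHTS.items():
        total += weight * minutes_per_day[activity] * days_per_week[activity]
    return total


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Categorise BMI using assumed WHO cut-points."""
    if value < 18.5:
        return "Underweight"
    if value < 25:
        return "Normal"
    if value < 30:
        return "Overweight"
    return "Obese"


# Hypothetical participant.
minutes = {"walking": 30, "moderate": 20, "vigorous": 10}
days = {"walking": 5, "moderate": 3, "vigorous": 2}
print(met_minutes_per_week(minutes, days))
print(bmi_category(bmi(70.0, 1.75)))
```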

Statistical analysis

The distribution of baseline characteristics by number of multimorbid conditions was explored both unadjusted and standardised for age. Cox proportional-hazards regression models were used to assess the association between the number of multimorbid conditions and all-cause mortality. Follow-up time in years was calculated from the date of attending the baseline assessment to the date of death, loss to follow-up, or end of follow-up, whichever occurred first. End of follow-up was defined as the last date of death registry data availability (30th November 2022). In the main model (Model A), analyses were adjusted for age, sex (Women, Men), ethnicity (White, Black, South Asian, Mixed, Other), Townsend deprivation score (in quintiles) and education (Primary, Secondary, Post-secondary non-tertiary, Tertiary). To account for lifestyle factors, the analyses were repeated (Model B) with additional adjustment for smoking status (Never, Former, Current), alcohol intake in units per week, BMI (Underweight, Normal, Overweight, Obese) and physical activity (Low, High). The proportional hazards assumption was visually assessed using scaled Schoenfeld residuals. There was no evidence that any of the variables included in the analyses violated the proportional hazards assumption. In a sensitivity analysis investigating the impact of short-term all-cause mortality, we re-ran the model excluding participants who died or were censored within the first two years of follow-up. The number of deaths per cause based on ONS categorisations was calculated, and the main analyses were repeated for the association between number of multimorbid conditions and the top 10 causes of death. We repeated the main analysis for all-cause mortality stratified by sex (women, men) and age (40–59, 60–70), and plotted age-specific cumulative incidence of all-cause mortality by number of multimorbid conditions in women and men.
We also repeated the analyses for cause-specific mortality accounting for the competing risk of mortality not caused by the outcome of interest using the Fine-Gray subdistribution hazard model.
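The follow-up time and event indicator that feed a survival model of this kind can be derived as in the sketch below. It assumes one record per participant with a baseline date and optional death and loss-to-follow-up dates; the function name, the 365.25-day year and the example dates are assumptions for illustration. It prepares the time and event variables only and does not fit the Cox or Fine-Gray models.

```python
from datetime import date
from typing import Optional, Tuple

# Last date of death registry data availability, as stated in the text.
END_OF_FOLLOW_UP = date(2022, 11, 30)


def follow_up(
    baseline: date,
    death: Optional[date] = None,
    lost: Optional[date] = None,
    end: date = END_OF_FOLLOW_UP,
) -> Tuple[float, bool]:
    """Return (years of follow-up, died) using the earliest exit date.

    Death is listed first so that it takes precedence when two exit
    dates coincide (min() keeps the first of equal candidates).
    """
    candidates = []
    if death is not None:
        candidates.append((death, True))
    if lost is not None:
        candidates.append((lost, False))
    candidates.append((end, False))
    exit_date, died = min(candidates, key=lambda item: item[0])
    years = (exit_date - baseline).days / 365.25
    return years, died


# Hypothetical participants.
print(follow_up(date(2008, 1, 1), death=date(2018, 1, 1)))
print(follow_up(date(2009, 1, 1), death=date(2020, 1, 1), lost=date(2015, 1, 1)))
print(follow_up(date(2010, 6, 1)))
```

For the two-year sensitivity analysis, participants whose returned follow-up time is under two years would be dropped before refitting the model.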

Latent class analysis was used to determine disease clusters, allocating each participant with multimorbidity to a single non-overlapping cluster14,21. Clusters were estimated in women aged (1) 40–59 years, and (2) 60–70 years, and men aged (3) 40–59 years, and (4) 60–70 years. A randomly selected training data set of 80% of participants with multimorbidity was used to determine the optimal number of clusters within each group and to estimate the association of disease clusters with mortality. Statistics were generated for multiple latent class analysis models of 1 to 12 cluster solutions, and the optimal number of clusters was first determined using sample size–adjusted Bayesian Information Criteria statistics and by requiring the smallest cluster to contain more than 5% of the training sample. We then used judgement and previous experience to finalise the selection of the clusters21,22. Each cluster within the four groups was assigned a label based on up to 3 health conditions with the highest probabilities of belonging to that cluster. Conditions were excluded from the labelling if their observed prevalence was equal to, or less than, the expected prevalence in the total population with multimorbidity, and/or their probability of contributing to the cluster was less than 5%. Cox proportional-hazards regression models were used to assess the association between each cluster and risk of all-cause, as well as cancer, vascular, and other-cause mortality. To assess the validity and generalisability of the determined cluster solutions, conditions from the remaining 20% of men and women with multimorbidity were entered into latent class analysis models, setting the number of clusters to match the optimal numbers from the training data set. The characterisation and relative size of the clusters determined from the training and test data sets were compared, as were their associations with mortality risk in Cox proportional hazards models.
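The first, statistical step of choosing the number of clusters can be sketched as below. It assumes the common form of the sample-size-adjusted BIC, −2·logL + p·ln((n + 2)/24), and treats the 5% rule as a filter applied before taking the lowest value; the fit summaries are made-up numbers for illustration, and the names are hypothetical. The subsequent judgement-based step described above is not captured.

```python
import math


def sample_size_adjusted_bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Sample-size-adjusted BIC: -2*logL + p*ln((n + 2) / 24)."""
    return -2.0 * log_likelihood + n_params * math.log((n_obs + 2) / 24)


def pick_solution(fits: list, min_share: float = 0.05) -> int:
    """Return k for the lowest adjusted BIC among solutions whose smallest
    cluster holds more than min_share of the sample."""
    eligible = [f for f in fits if min(f["class_shares"]) > min_share]
    if not eligible:
        raise ValueError("no solution satisfies the minimum cluster size")
    best = min(
        eligible,
        key=lambda f: sample_size_adjusted_bic(f["loglik"], f["n_params"], f["n_obs"]),
    )
    return best["k"]


# Made-up fit summaries; n_params = k*43 + (k - 1) for 43 binary conditions.
fits = [
    {"k": 2, "loglik": -5200.0, "n_params": 87, "n_obs": 1000,
     "class_shares": [0.60, 0.40]},
    {"k": 3, "loglik": -5100.0, "n_params": 131, "n_obs": 1000,
     "class_shares": [0.50, 0.30, 0.20]},
    {"k": 4, "loglik": -4900.0, "n_params": 175, "n_obs": 1000,
     "class_shares": [0.50, 0.30, 0.17, 0.03]},
]
print(pick_solution(fits))
```

In this made-up example the four-cluster solution has the lowest adjusted BIC but is rejected because its smallest cluster falls below 5%, so the three-cluster solution is returned.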
Finally, mortality incidence rates (IRs) per 1,000 person-years were calculated for each cluster.
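The incidence rate calculation is simple arithmetic: deaths divided by total person-years of follow-up, scaled to 1,000. A minimal sketch follows, with a hypothetical function name and made-up example counts.

```python
def incidence_rate_per_1000(deaths: int, person_years: float) -> float:
    """Mortality incidence rate per 1,000 person-years."""
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return 1000.0 * deaths / person_years


# Made-up example: 120 deaths over 15,000 person-years of follow-up.
print(incidence_rate_per_1000(120, 15_000.0))
```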

P values were 2-sided, and the type I error rate for statistical significance was set at α = 0.05. Due to multiple comparisons, a subsequent Bonferroni correction was applied within each sex- and age-specific subgroup for the multimorbidity clusters and mortality analyses. Analyses were performed using Stata SE version 17.0 (StataCorp). R package poLCA version 1.6.0.1 was used to derive disease clusters, R package riskRegression was used for competing risk analyses, and R package ggsurvfit version 1.0.0 was used to obtain the cumulative incidence plots.
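A Bonferroni correction divides the significance threshold by the number of comparisons. The text does not state how many comparisons were counted within each subgroup, so the sketch below assumes one comparison per cluster (six in this example); the function names and P values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / n_comparisons


def significant(p_values: list, alpha: float = 0.05) -> list:
    """Flag which P values fall below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical P values for six clusters within one sex- and age-specific subgroup.
p_values = [0.001, 0.02, 0.009, 0.0005, 0.04, 0.008]
print(bonferroni_threshold(0.05, len(p_values)))
print(significant(p_values))
```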

Results

The final sample included 502,370 participants, of whom 165,125 (33%) had multimorbidity. As the number of multimorbid conditions increased, participants were more likely to be older, be women, be of white ethnicity, live in an area of greater socioeconomic deprivation, be former or current smokers, be obese and have low physical activity levels. Little difference in alcohol intake was observed (Table S2). Findings were similar for age-standardised baseline characteristics, except for ethnicity, for which an increasing number of multimorbid conditions was observed in participants of black and South Asian, but not white, ethnicity (Table S3). The contrasting findings between unadjusted and age-standardised ethnicity are likely due to non-white participants being younger on average than white participants.

A total of 44,399 participants died of any cause over a median of 13 years (interquartile range = 13–14 years). The top ten causes of death were ischaemic heart disease (n = 4742), lung cancer (n = 3662), colorectal cancer (n = 2082), lymphoid and haematopoietic cancer (n = 1968), cerebrovascular disease (n = 1946), dementia (n = 1887), breast cancer (n = 1624), pancreatic cancer (n = 1612), chronic lower respiratory disease (n = 1457), and COVID-19 (n = 1416) (Table S4). For all-cause mortality, the Hazard Ratios (HR, 95% Confidence Intervals [CI]) were 1.47 (95% CI 1.43–1.50), 1.89 (95% CI 1.84–1.95), 2.38 (95% CI 2.30–2.47), and 3.14 (95% CI 3.01–3.27) for 2, 3, 4, and 5+ conditions, respectively, compared to 0–1 conditions (Fig. 1a). After excluding 2709 participants censored within 2 years of follow-up, including 2511 due to death, similar findings were observed with HRs of 1.45 (95% CI 1.42–1.49), 1.87 (95% CI 1.82–1.93), 2.32 (95% CI 2.24–2.41), and 3.10 (95% CI 3.00–3.24) for 2, 3, 4, and 5+ conditions, respectively, compared to 0–1 conditions. Dose–response associations were observed with the top 10 causes of death (Fig. 1b–k), with weaker associations observed with mortality caused by cancer, in particular colorectal and pancreatic cancer, and stronger associations observed with vascular and respiratory causes of mortality. The associations with all-cause and cause-specific mortality remained similar when adjusting for lifestyle factors (Table S5) and when accounting for the competing risk of causes of mortality not due to the outcome of interest (Table S6).

Fig. 1

Association between number of multimorbid conditions with all-cause mortality and the top 10 primary causes of death in the UK Biobank population (adjusted for age, sex, ethnicity, Townsend deprivation index and education).

The age-specific cumulative incidence of all-cause mortality was greater for each increase in the number of multimorbid conditions, with the difference in risk emerging around 65 years of age in women, and 60 years in men (Fig. S1). Dose–response associations between number of multimorbid conditions and all-cause mortality were observed in women and men aged 40–59 and 60–70 (Fig. 2), and remained similar after adjustment for lifestyle factors (Table S7).

Fig. 2

Association between number of multimorbid conditions with all-cause mortality by sex and age (adjusted for age, sex, ethnicity, Townsend deprivation index and education).

In the training sample, we identified six clusters as the optimal number for women aged 40–59 (n = 150,241) and 60–70 (n = 104,744) and men aged 40–59 (n = 120,382), and five clusters for men aged 60–70 (n = 93,968, Fig. S2). Clusters driven by cardiovascular and respiratory conditions were observed across all four groups (Tables S8–S11). Clusters driven by mental health conditions were observed for both sexes aged 40–59, but not 60–70, whilst clusters driven by cancer were observed for both sexes aged 60–70, but not 40–59. Clusters driven by cardiometabolic conditions were observed for men of all ages, but not for women, whilst clusters driven by thyroid conditions were observed for women of all ages, but not for men.

All disease clusters were associated with an increased risk of all-cause mortality in women (Table 1) and men (Table 2). The strongest associations were observed in clusters driven by mental health, cancer and pain-related conditions (depression/cancer/dyspepsia; HR = 2.61, 95% CI 2.33–2.93) in women aged 40–59 and respiratory and pain-related conditions (asthma/pain/dyspepsia; HR = 2.03, 95% CI 1.90–2.17) in women aged 60–70. In men, the strongest associations were observed in clusters driven by cardiometabolic conditions for both 40–59 (diabetes/hypertension/CHD; HR = 3.43, 95% CI 3.14–3.74) and 60–70 (diabetes/hypertension/CHD; HR = 2.24, 95% CI 2.13–2.35) year olds. Associations remained similar with adjustment for lifestyle factors (Tables 1, 2). The majority of disease clusters were associated with a greater risk of cancer, vascular and other-cause mortality in women (Table S12) and men (Table S13). The associations for each cluster were generally stronger with vascular and other-cause mortality, although, as expected, clusters driven by cancer were more strongly associated with cancer mortality. There was a high degree of similarity in the characteristics of the cluster groups for men and women between the training and test samples (Tables S14 and S15).

Table 1 Association between multimorbidity clusters derived in training sample with all-cause mortality in women in UK Biobank.
Table 2 Association between multimorbidity clusters derived in training sample with all-cause mortality in men in UK Biobank.

Discussion

Main finding of this study

In this population-based cohort of half a million women and men aged 40–70 years, an increasing number of multimorbid conditions was associated with a greater risk of all-cause mortality, with the strongest associations observed for cardiovascular and respiratory causes of death. Different clusters of disease were identified in sex- and age-specific subgroups. For women, a mental health, cancer and pain-related cluster at ages 40–59, and a respiratory and pain-related conditions cluster at ages 60–70, were associated with greater risk of mortality, whilst for men, clusters of cardiometabolic conditions at all ages were associated with greater mortality risk. All associations were only slightly attenuated when accounting for lifestyle risk factors.

What is already known on this topic and what this study adds

Our finding of a dose–response association between number of multimorbid conditions and all-cause mortality is consistent with previous population-based cohort studies conducted in a diversity of settings, including the UK15,23, United States24, China25, Chile7 and Iran26. Studies on cause-specific deaths have found that multimorbidity is strongly associated with a greater risk of cardiovascular15,25,27,28, respiratory25,28 and ‘other’ causes of death25,27, and weakly associated with deaths due to cancer15,25,27,28. However, these definitions group together different causes of death with distinct aetiologies and pathological profiles. When using more granular definitions based on ONS categorisations for monitoring UK mortality rates, we found differential associations for cause-specific deaths. Stronger associations were observed for ischaemic heart disease than for cerebrovascular disease, for chronic lower respiratory disease than for COVID-19, and for breast cancer than for pancreatic and colorectal cancer. Understanding the pathways underlying these risk differences may be important for the design of effective preventative, interventional and management approaches in individuals with multimorbidity.

It is important to consider the interplay between multimorbidity and demographic characteristics, as multimorbidity prevalence increases with age and is generally higher in women29. We found similar associations between number of multimorbid conditions and mortality in women and men, but stronger associations for younger (40–59 years) than older (60–70) ages. These findings of greater relative risk at younger ages are consistent with previous studies, with one hypothesis being that early onset disease is more aggressive15,30. However, the cumulative incidence (or absolute risk) of mortality is greater by multimorbidity status at older ages, which suggests that the attenuated relative risks in this age group are due to a higher background risk of mortality in those without multimorbidity. The distinction between relative and absolute risk is important, because, if causal, the impact of multimorbidity on mortality risk is greater at an individual-level in younger age groups, and at the population-level in older age groups.

Defining multimorbidity as number of conditions can provide insights into the overall burden of living with multiple health conditions. However, understanding which diseases commonly cluster together and the impact of these clusters on future health outcomes is essential for effective clinical management and resource allocation3,4,5. Two studies derived multimorbidity patterns in the China Kadoorie Biobank, a cohort of half a million Chinese women and men aged 30–7925,31. Both found that cardiometabolic and respiratory disease clusters were strongly associated with mortality. Although our results are similar, we found that the cardiometabolic cluster was strongly associated with mortality risk in men, whilst the respiratory cluster was strongly associated with mortality risk in older women. We also found evidence that a cluster characterised by mental health, cancer and pain-related conditions was associated with mortality in younger women. Disease clusters have been shown to vary substantially by sex and age32,33. From a clinical perspective, it is unsurprising that those with cancer or vascular disease at baseline are at high risk of cancer or vascular mortality, respectively. However, these participants were also at higher risk of mortality from other diseases, reinforcing the need to consider risk factor reduction and disease treatment beyond the specific disease that may represent the most obvious risk of death.

The mechanisms driving these associations are complex and multi-factorial34. Lifestyle factors likely play a key role in the causal pathways, both by increasing multimorbidity risk and by mediating associations between multimorbidity and mortality. We adjusted for BMI, smoking, alcohol intake and physical activity, hypothesising that if these factors confound or mediate the associations then adjustment would substantially attenuate the observed relationships35. However, all associations were only slightly weaker, including for clusters and cause-specific deaths strongly related to these factors, such as vascular and respiratory outcomes. Two studies, both in UKB, found that participants with multimorbidity who either followed a healthy lifestyle36 or had high levels of physical activity37 had a lower mortality risk than those following an unhealthy lifestyle or with low levels of physical activity, respectively. Longitudinal studies investigating the interplay between multimorbidity and lifestyle factors throughout the life-course are necessary to identify causal pathways and critical risk periods to inform targeted interventions.

Strengths and limitations

Strengths of the study include a combination of a large sample size, cohort-wide linkage to death records, and detailed data collection. These factors enabled us to generate evidence on the association between multimorbidity and cause-specific mortality, investigate the role of sex- and age-specific disease clusters, and account for various lifestyle factors. However, there are several limitations. First, UKB was not designed to be representative and participants are, on average, healthier than the general population38. Consequently, whilst clusters were validated using training and test sets, the identified clusters might not generalise to other populations. Second, self-report was used to ensure the ascertainment of health conditions was standardised, although this might have introduced underreporting or misclassification bias. A study based on a German population found high agreement between 8 self-reported conditions and physician diagnoses, but low agreement for arthritis39. In the UK-based English Longitudinal Study of Ageing (ELSA), half of respondents did not self-report a condition that was captured in their historical hospital records40. However, in ELSA, health conditions were reported in a questionnaire completed by participants, whereas in UKB, this information was obtained during a guided interview with trained nurses. Diagnoses obtained from primary care records could address certain limitations of self-reporting conditions; however, these data are currently available for less than half of the UKB cohort. Third, despite the large sample size, there was a low prevalence of certain health conditions known to have substantial impacts on quality- and disability-adjusted life years, including neurological diseases and mental health conditions. Fourth, multimorbidity varies based on the included conditions, and the Jani et al. adaptation of the Barnett definition might have missed important contributors to multimorbid clusters.
Future studies should explore additional approaches for defining multimorbidity. Fifth, health conditions for the full sample were only self-reported at baseline assessment and we were unable to incorporate incident health conditions into the exposure definition. Understanding the trajectories of health conditions in populations with repeat measures could provide important insights into multimorbidity development and mortality risk. Sixth, residual confounding remains and causality cannot be determined due to the observational design of the current study. In the context of multimorbidity, in particular clusters which consist of a diverse range of conditions, the distinction between confounders and mediators warrants further exploration. For example, BMI might be on the causal pathway for multimorbid cardiometabolic conditions and a confounding factor for conditions less impacted by adiposity. Our findings were generally consistent when additionally adjusting for lifestyle factors; however, future research using longitudinal lifestyle measures could help elucidate the potential causal pathways underlying these associations.

Conclusion

We found that an increasing number of multimorbid conditions was associated with a greater risk of all-cause mortality, with particularly strong dose–response associations observed with cardiovascular and respiratory causes of death. Our findings also highlight the importance of understanding the role of sex- and age-specific clusters of disease that affect various organ systems when assessing the impact of multimorbidity on future health outcomes.