Abstract
Income loss after breast cancer is an important area of research, but long-term impacts and heterogeneous effects are understudied. Here, we measure the effect of breast cancer on personal and household equivalized income of females in Denmark from 2000 to 2018 using Danish register data. We perform a cohort study and match on age, income, employment, comorbidities, healthcare utilization, and other socioeconomic and demographic variables at cohort entry, to investigate impacts across subpopulations over 10 years. We also perform an event study difference-in-difference analysis over the same period. Our study finds personal income loss averaging €7138 over 10 years in the non-retired population. The working population, students, and those in worse health suffer disproportionate loss. Household income losses are smaller and recover within 10 years. Overall, the Danish economic protections are effective in mitigating the long-term economic impact of breast cancer, though there is room for improvement in the protection of the most vulnerable.
Introduction
Breast cancer is disabling and often fatal, with a global prevalence of 7.8 million and 685,000 deaths caused by the disease worldwide in 20201. Projections of breast cancer incidence estimate an increase to over 3 million incident cases in 2040, up 40.8% from global incidence rates in 20202. Beyond the health and mortality burden, individuals diagnosed with breast cancer and their families also suffer economic consequences that exacerbate the lived experience of the disease. At an individual and household level, financial hardship due to income loss, premature mortality, or informal care giving burden can present long-lasting challenges which threaten financial stability3. From a population perspective, productivity losses due to breast cancer are a substantial global challenge and are projected to total 4.3% of global GDP between 2020 and 20504.
The Lancet Commission on Breast Cancer has highlighted the need for country-specific, context-relevant research on the financial impacts of breast cancer to better understand the economic impact and identify vulnerable groups1. Existing studies on some sub-types of breast cancer estimate indirect costs ranging from $3325 to $5169 USD annually across various countries in North America and northern Europe5. However, most of these studies are limited to short-term impacts post-diagnosis and lack a robust causal inference framework to isolate the direct economic impacts of breast cancer.
Denmark’s comprehensive registry data system offers a unique opportunity for longitudinal causal inference, enabling the isolation of economic impact by inferring counterfactual outcomes for individuals had they not been diagnosed with breast cancer. Matching methods, well-suited for estimating counterfactuals from observational data, can effectively attribute differences in outcomes to disease exposure6. These methods produce interpretable and actionable results for health policymakers7.
Additionally, the Danish welfare system provides an interesting context for investigation of income losses. Working individuals in Denmark can have 30 days of absence due to illness in a 9 month period paid by their employer, after which they can qualify for state funded sick leave (sygdagepenge)8. Sick leave is available for up to 22 weeks, and thereafter individuals who cannot return to work can apply for disability pension (førtidspension) or social assistance (kontanthjælp)9,10. Aside from these welfare benefits, individuals out of the labor market in Denmark may receive income from education allowances, unemployment benefits (dagpenge), early retirement (efterløn), and state-sponsored retirement pension11. Denmark’s universal welfare system makes it an outlier compared to economic peers in terms of the support for those with a disabling disease like breast cancer, but empirical evaluation is necessary to measure the success of these policies in living up to their philosophical ideals12.
Several studies utilizing Danish registry data have applied matching methods on register data to estimate the economic impact of breast cancer, revealing losses in the range of 2.7–3.2% of disposable personal income in the third year after diagnosis, with the gap in income between the breast cancer population and the unexposed population narrowing over time13,14,15,16. However, these studies are limited by short follow up periods, a focus on the working-age population, exclusion of any impact due to mortality, and insufficient adjustment for confounders.
Our study addresses these limitations by examining the long-term impacts of breast cancer on personal and household equivalized disposable income across the Danish population from 2000 to 2018, including an analysis of heterogeneity across sub-populations. We employ a matching approach to adjust for confounders beyond the basics (such as age, baseline income, education), and expand to include employment status, household size, Charlson Comorbidity Index (CCI), total hospital utilization, and relationship status, allowing us to estimate average causal effects on the exposed (ACE) for personal and household disposable income over a 10-year period post-diagnosis. This comprehensive matching enables us to investigate conditional average causal effects on the exposed (CACE) within the most vulnerable sub-populations, identifying groups that could benefit from targeted policy interventions. To validate and ensure the robustness of our findings, we compare our results with estimates from models matched on fewer dimensions and an event study difference-in-difference, addressing potential confounding and endogeneity issues inherent in causal studies. The results show that individuals with breast cancer sustain income losses through the tenth year following diagnosis; that the income losses are heterogeneously distributed across the population; and that the choice of causal method significantly influences the measured effects.
Results
Data and sample size
In total, 61,656 individuals were diagnosed with breast cancer during our study period. 78,678,324 person-years were included in the study period, with an average contribution of 13.6 person-years per individual. The median age at breast cancer diagnosis was 63 (interquartile range 53–71). The median personal income at the index was €42,128 (€27,131–€59,460), and the median household equivalized income (which controls for household size) was €33,765 (€25,270–€44,296). Summary statistics describing the breast cancer and general female populations for a sample year of observation (2015) are displayed in Table 1. Descriptives are shown for the most recent match year of analysis to avoid double counting, as individuals are eligible to act as controls in every year of the data for which they qualify. The population characteristics shown for the exposed and control groups do not vary strongly year over year.
Matching using socio-economic, demographic, and health variables produced 43,915 match stratification subgroups, with group sample sizes reaching a maximum of 8416 in any group. 10.8% of individuals with breast cancer in the personal income models and 7.8% in the household income models are lost in estimation because an appropriate control was not identified. 8.4% of the match groups have fewer than 5 individuals, and the median group size was 67 in the personal income models. For household income, 11.0% of the match groups had fewer than 5 individuals and the median group size was 63.
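The match diagnostics reported above (number of strata, largest and median group size, share of small groups, share of exposed individuals without a control) can be summarized with a few lines of code. The following is a minimal sketch, not the authors' implementation; the function name `match_diagnostics` and its arguments are hypothetical, and the threshold of 5 individuals is taken from the text.

```python
from statistics import median


def match_diagnostics(group_sizes, n_exposed_total, n_exposed_matched, small=5):
    """Summarize the quality of an exact-matching stratification.

    group_sizes: number of individuals in each match stratum (hypothetical input).
    n_exposed_total / n_exposed_matched: exposed individuals before and after
    dropping those for whom no control was found.
    small: strata with fewer than this many individuals count as "small".
    """
    if not group_sizes or n_exposed_total <= 0:
        raise ValueError("need at least one stratum and one exposed individual")
    return {
        "n_groups": len(group_sizes),
        "max_group_size": max(group_sizes),
        "median_group_size": median(group_sizes),
        "share_small_groups": sum(size < small for size in group_sizes) / len(group_sizes),
        "share_exposed_unmatched": 1 - n_exposed_matched / n_exposed_total,
    }
```

Reporting these quantities alongside the effect estimates makes it easier to judge how much of the exposed population the estimates actually describe.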
Income losses and subgroup variation
When exact matching on all baseline socio-economic, demographic and health variables and censoring at death, the total ACE of breast cancer on personal income for non-retirees is a loss of €7138 (95% CI €6777–€7498) over 10 years, representing 1.50% (1.40–1.55%) of the income of the control population over this time period. Losses for the entire population including retirees totaled €3636 (€3392–€3881) over the period, which reflects the protection of retirees by fixed pension payouts that are agnostic to disease status. When looking at point-in-time estimates for each year, losses for non-retirees are significant and peak 5 years after diagnosis at €1094 (€981–€1207), after which they fall but remain significant up to 10 years after diagnosis. This peak loss equates to ~8% of the annual average budget for housing and utilities in Denmark, or half of the budget for restaurants and hotel accommodations17. Figure 1 shows year-specific CACE estimates of personal and household income loss for the non-retired population.
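To make the aggregation behind an ACE estimate concrete, the sketch below computes within-stratum income gaps (control mean minus exposed mean) and averages them with weights equal to the number of exposed individuals in each stratum. This is a simplified illustration under stated assumptions, not the authors' code: the record layout `(stratum, exposed, income)` and the function name are hypothetical, and confidence intervals are not computed.

```python
from collections import defaultdict
from statistics import mean


def average_effect_on_exposed(records):
    """Exposed-weighted mean of within-stratum income gaps.

    records: iterable of (stratum, exposed, income) tuples (hypothetical layout).
    Positive return values are income losses for the exposed group.
    Strata without any control are skipped, mirroring the loss of unmatched
    exposed individuals described in the text.
    """
    exposed = defaultdict(list)
    control = defaultdict(list)
    for stratum, is_exposed, income in records:
        (exposed if is_exposed else control)[stratum].append(income)

    total_loss = 0.0
    n_exposed = 0
    for stratum, incomes in exposed.items():
        if stratum not in control:
            continue  # no eligible control: stratum cannot contribute
        loss = mean(control[stratum]) - mean(incomes)
        total_loss += loss * len(incomes)
        n_exposed += len(incomes)

    if n_exposed == 0:
        raise ValueError("no stratum contains both exposed and control individuals")
    return total_loss / n_exposed
```

Restricting `records` to a single subgroup (for example, one labor market status) before calling the function gives the corresponding conditional estimate in the same way.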
Income loss is shown in both absolute terms (adjusted to 2023 EUR; first panel) and relative terms (as a proportion of control group income; second panel) over years since diagnosis. Data are presented as mean values with 95% confidence intervals, where intervals are calculated from the standard deviation of conditional mean income in each subgroup. Income type is indicated by point color (red = personal disposable income loss, blue = household equivalized disposable income loss). Significance at α = 0.05 is indicated by point shading (filled dot = significant; hollow dot = not significant) based on two-sided t-tests. Estimates are based on n = 30,802 individuals with disease and n = 4,317,385 matched controls for personal income loss, and n = 30,664 individuals with disease and n = 4,144,538 matched controls for household equivalized disposable income loss. The unit of study is the individual. All observations are biological replicates, derived from a national register-based cohort. Control groups were matched on age, sex, and baseline demographic, socio-economic, and health characteristics. Effects reflect income among survivors in each follow-up year. Source data are provided with this paper.
All-cause mortality rates in the breast cancer population are higher than in the control group: the mortality rate in the breast cancer population is 2% at 3 years and 8% at 10 years after index year, compared to 0% at 3 years and 1% at 10 years for the controls. Our secondary analysis accounting for mortality by setting years after death but before age 65 or end of follow-up to zero income produces larger economic gaps between the exposed and unexposed. With this adjustment, the total ACE of breast cancer on personal income is €34,547 (€34,107–€34,987) in the non-retired population, and total losses in the population are €1.034 billion across Denmark from 2000 to 2018. Mortality is highest in the 18–29 age group at 18% after 10 years, and income losses are €89,694 (€87,346–€92,042) when accounting for mortality in this age group, compared to €745 (−€1208–€2699) when censoring at death.
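The mortality adjustment described above can be illustrated for a single individual. The sketch below is one possible reading of the rule, with assumptions labeled in the comments: income in the year of death is treated as observed, age advances by one per follow-up year, and the function and parameter names are hypothetical.

```python
def mortality_adjusted_incomes(observed, age_at_index, death_offset,
                               follow_up_years=10, retirement_age=65):
    """Fill post-death years with zero income up to age 65 or end of follow-up.

    observed: {years since index: income} for years in which the person was alive.
    death_offset: years since index at which death occurred, or None if alive.
    Assumption: the year of death itself keeps its observed income.
    Years after death at or beyond retirement_age are left out (censored).
    """
    adjusted = {}
    for t in range(1, follow_up_years + 1):
        alive = death_offset is None or t <= death_offset
        if alive:
            if t in observed:
                adjusted[t] = observed[t]
        elif age_at_index + t < retirement_age:
            adjusted[t] = 0.0  # lost income attributed to premature death
    return adjusted
```

For a person aged 61 at index who dies in year 2, only year 3 is set to zero, because from year 4 onward the person would have reached age 65.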
Effects on household income are statistically significant but much smaller in magnitude than personal income effects, amounting to €1544 (€1233–€1856) in the non-retired population and €493 (€282–€705) for the whole population over 10 years. Losses in the non-retired population constitute 0.40% (0.32−0.46%) of median household annual income in the control group over that time period. Household effects are highest in the third year after diagnosis, after which losses reduce and eventually recover within the 10-year period under study.
Substantial differences in the levels of income loss exist within subpopulations. For personal income, individuals in school at the time of diagnosis have the highest economic burden at €22,163 (€18,540–€25,786) or 4.90% (4.09–5.69%) of total control income over 10 years, followed by the population in the workforce at €9207 (€8748–€9665) or 1.70% (1.64–1.81%) and sick or parental leave recipients at €9009 (€7359–€10,659) or 2.50% (2.08–3.01%) of total income over 10 years. Retirees including early retirees experience an ACE of −€1609 (−€1931–−€1286) in personal income over the period, indicating that retirees with breast cancer have a slightly higher income than peers. For household income, individuals in the workforce experienced the greatest loss over 10 years, at €2433 (€2047–€2819) or 0.5% (0.45–0.62%) of control income. Individuals who were in school during breast cancer diagnosis had a positive increase in household income of €5102 (€7955–€2249) over the 10 year period and individuals on leave receive an increase of €4642 (€6305–€2978) in household income. No other labor-related subgroups experience a statistically significant loss in household income. Supplementary Table 1 describes the labor market attachment of the non-retired population under study from the year of matching to 10 years after diagnosis, and Supplementary Dataset 1 displays personal and household income losses by labor market status in absolute and relative terms for all subgroups.
Within the non-retired population, additional stratifications highlight differences in outcomes across subgroups. Fig. 2 shows loss of personal income by age, educational attainment, income quartile, hospital utilization and CCI at cohort entry for the non-retired population. Personal income losses increase with higher levels of education and higher income quartiles in both absolute and relative terms. Individuals in the lowest quartile of personal income at cohort entry have a €2221 (€1578–€2865), or 0.90% (0.64–1.17%), increase in income compared to controls over 10 years. Individuals with postgraduate education experienced the highest absolute loss, at €20,072 (€17,735–€22,408), or 2.60% (2.30–2.91%) of income in that period. Both health measures had a nonmonotonic relationship with income losses: individuals with breast cancer and a CCI of 3 have total losses of €22,945 (€19,160–€26,731) or 6.50% (5.45–7.61%), while lower CCI scores have smaller losses, as low as €4245 (€2673–€5817) or 1.00% (0.63–1.37%) for a CCI of 2. Individuals who have had 6–10 inpatient encounters in the third and second year before diagnosis lost €9985 (€9083–€10,886) or 2.20% (2.00–2.40%), while those with no inpatient encounters lost €6408 (€5768–€7047) or 1.30% (1.17–1.43%). Conversely, those with a CCI greater than 4 or with 11 or more inpatient encounters experience no measured effect on income. Breast cancer patients aged 40–45 have the highest income losses of any age group at €11,732 (€10,585–€12,879) or 2.10% (1.86–2.26%), and income losses are of a similar magnitude at prime working ages between 30 and 59. Income loss among individuals aged 18–29 is not statistically significant, which reflects the small sample size, and those aged 60–64 have losses as low as €2685 (€1878–€3492) or 0.70% (0.49–0.92%) of control income.
Each subplot shows absolute income loss by subgroup, with color indicating relative loss (as a percentage of matched control income over the same period). Error bars represent 95% confidence intervals around CACE point estimates. Solid markers denote statistically significant estimates based on two-sided t-tests at α = 0.05; hollow markers are not significant. Sample sizes for each exposed subgroup are shown in parentheses on the y-axis. The unit of analysis is the individual. All replicates are biological, derived from a Danish national register-based cohort. Control groups are matched on age, sex, and baseline demographic, socio-economic, and health characteristics, and matching baseline is defined as the year before diagnosis. Subgroup variables include baseline labor market attachment, highest achieved educational attainment at baseline, baseline personal income quartile measured for each calendar year, age at baseline, Charlson Comorbidity Index (CCI) in the 1–6 years preceding diagnosis, and number of hospitalizations in the 2 years preceding diagnosis. Source data are provided with this paper.
The relationships between household income loss due to breast cancer and age, health, education, and baseline income are more complex, but the subgroup effects are smaller in magnitude and more frequently not statistically significant. Figures showing income loss by relationship status and household size, and the same results for household income loss and personal income loss adjusted for mortality, are included in Supplementary Fig. 1.
Event study difference-in-difference
Event study difference-in-difference (DiD) models compare personal income from 5 years before to 10 years following breast cancer diagnoses to a control group matched on all variables in the main results except baseline income quartile. Results of the DiD estimation align with matching estimation in terms of the magnitude of the attributable loss of income, the increase in income loss over time, and the patterns across affected subgroups. Figure 3 shows coefficients for the income losses in the exposed group in the years before and after diagnosis, and coefficients for exposure interactions with the sociodemographic and health covariates used in the primary model are presented in the Supplementary Dataset 2.
Estimates reflect annual cumulative effects, with coefficients representing income differences relative to the reference year (−1), which is omitted from the plot. Data are presented as mean differences in 2023 EUR, with error bars indicating 95% confidence intervals. Solid points denote statistically significant differences based on two-sided t-tests at α = 0.05, and hollow points indicate non-significant estimates. Estimates are based on n = 30,802 individuals with disease and n = 4,317,385 matched controls, where individuals are included as controls in years prior to exposure. The unit of analysis is the individual. All replicates are biological, derived from Danish national register data. Control individuals were matched on age, sex, and baseline demographic, socio-economic, and health characteristics. Source data are provided with this paper.
Coefficients for the treatment-index interaction are near zero and not significant until the year of diagnosis, indicating that the parallel trends assumption is mostly met for this model. Coefficients after the event index are statistically significant, and loss increases over time until the ninth year of exposure where individuals with breast cancer lose €3931 (€3734–€4388) cumulatively compared to the non-exposed group. The adjusted R-squared for this fixed effects model was 0.875 with 955,229 observations included in the model.
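As an illustration of how event study coefficients relate to the reference year, the sketch below computes, for each event time, the exposed-minus-control gap in mean income minus the same gap in the reference year (−1). This is an unadjusted two-group analogue, not the fixed effects model with covariate interactions estimated in the paper, and it returns year-specific rather than cumulative differences; the row layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def event_study_coefficients(rows, reference=-1):
    """Year-specific difference-in-difference estimates relative to a reference year.

    rows: iterable of (event_time, exposed, income) tuples (hypothetical layout).
    Every event time must contain at least one exposed and one control row.
    Negative coefficients mean lower income in the exposed group than expected
    from the gap observed in the reference year.
    """
    cells = defaultdict(list)
    for event_time, is_exposed, income in rows:
        cells[(event_time, bool(is_exposed))].append(income)
    cells = dict(cells)  # freeze: missing cells now raise KeyError

    def gap(k):
        return mean(cells[(k, True)]) - mean(cells[(k, False)])

    baseline = gap(reference)
    event_times = sorted({k for k, _ in cells})
    return {k: gap(k) - baseline for k in event_times if k != reference}
```

Coefficients close to zero for negative event times are what the parallel trends check in the text looks for; the reference year is omitted from the output, as in the figure.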
Model specification and robustness
As a secondary analysis, full matching results for personal income were compared to 3 alternative matching designs on the non-retired population: age and sex alone; age, sex and education; and all variables in the final results except hospital utilization and CCI. Results were also compared to matching on all final variables 5 years before the year of diagnosis.
All comparison models produced lower estimates of the total effect than the model which matched on all covariates one year before diagnosis. Matching on age alone produces a €4927 (€5736–€4119) increase in income with breast cancer diagnosis. Age and education matching produces a loss of €3761 (€3256–€4265) in the exposed group. Matching on all variables except CCI and hospital utilization and matching in an earlier year before diagnosis produce estimates more similar to final estimates: losses in the former amount to €6716 (€6265–€7167), and in the latter are €5199 (€4608–€5761). Annual ACE on personal income for each comparison model are shown in Fig. 4.
Each line represents a separate matched cohort using different covariate sets. The primary model and 5-year lookback model are matched on all variables described in the Methods section, including sociodemographic and clinical characteristics. Additional models are matched on reduced covariate sets: age only, age and education, and all non-health variables. Data are shown as mean annual income loss in 2023 EUR, with error bars indicating 95% confidence intervals for each point estimate. Sample sizes for each model refer to the number of treated individuals and are reported in the legend. The unit of analysis is the individual. All replicates are biological, drawn from Danish national register data. Control groups are matched without replacement from the same source population. Source data are provided with this paper.
Discussion
In this study we estimate income lost due to breast cancer in Denmark from 2000 to 2018. We use a cohort study design on Danish register data and match individuals with breast cancer to a control group on age, socioeconomic status, household composition, and two markers of health at the calendar year of cohort entry, and we measure personal and household income losses in the 10 years following diagnosis. Through this study we make two contributions to the literature on this topic: first, an accounting of longitudinal income losses due to breast cancer in the Danish setting and identification of vulnerable sub-populations, and second, an updated and rigorous methodology which we validate against other causal inference strategies.
Our primary results highlight effects of breast cancer on income in Denmark that are negative, persistent, and statistically significant, but smaller in magnitude than multinational studies of indirect costs5. Personal income losses for the workforce and short-term unemployed are consistent with other Danish register-based studies which employ similar methods, and greater than those which do not account for the income differences between the breast cancer population and the general population before diagnosis13,14,15,16. Income losses at the household level are approximately a quarter of the magnitude of personal income losses and less than half of one percent of the average non-retired household income of the control group over 10 years. Both outcomes represent a small proportion of the average household budget in Denmark, and both income loss measures recover within 10 years of diagnosis. Although these economic losses are modest, it is important to recognize the outsized effect that premature mortality has on the population with breast cancer, and it is likely that the most economically vulnerable are also those at highest risk of early death, skewing income loss results towards the survivors.
These results shed light on the ability of social and societal structures to mitigate the economic harm caused by breast cancer. Denmark’s sick leave policies provide economic protections for the employed, self-employed and unemployed equal to prior salary or unemployment benefit for up to 22 weeks within a 9 month period, which would have a tempering effect on the amount of short-term economic harm that breast cancer could cause8. Long-term disability pension and social assistance welfare offer long-term economic protections if full labor attachment is not possible after a cancer diagnosis9,10. In the years following diagnosis, individuals with breast cancer are more likely to be receiving income from sick leave, disability pension, and social assistance than their peers, indicating that these state functions are being used by those affected by breast cancer, which would consequently buffer the effects of lost wages. Nonetheless, individuals on fixed income before diagnosis, including those in school, those on leave and the unemployed population, still experience losses compared to individuals with the same economic situation but without a breast cancer diagnosis, indicating the potential for long-term accumulation of income effects which are not fully mitigated by the Danish welfare system. Further research is warranted into the trajectory of income loss and substitutions within these populations.
In addition to societal economic protections in Denmark, our estimates indicate household-level response to a breast cancer diagnosis where losses of personal income are supplemented by increases in income of other household members. At baseline, household income is lower for both the breast cancer and control populations than personal income, indicating that a substantial proportion of the study population is households with children living at home. Differences in household income between exposed and control wage earners are negligible in the 7th year after diagnosis onwards and the gap in their personal and household income starts to close as early as the 6th year after diagnosis, indicating that people who survive a breast cancer diagnosis are able to recover their economic standing over time. This recovery may be driven by retirement during the observation period where income would not vary by disease status, and by mortality in the breast cancer population so that the individuals who live 10 years after diagnosis are in remission or may be healthiest within the breast cancer population.
Conditional average effects due to exposure along matching variables highlight particular subgroups of individuals who are more vulnerable to economic burden. Individuals in school at the time of diagnosis are a small proportion of the total affected population, but they experience a very high loss of personal income which does not fully recover after 10 years. The strong positive effect of cancer on household income for these individuals may indicate that many move home with parents or other family, which may protect from financial stress but may create other social or work and school-related consequences which deserve further study. Additionally, individuals with higher hospital utilization before diagnosis and those with a higher CCI have more profound and persistent income effects than those in better health at baseline. These increasing losses reflect the financial challenges which can be brought on by multimorbidity and a gap in protection for those who are already in poor health. The experience of these sub-populations among others is not reflected in previous studies which measure economic effects on the total active workforce only.
An important objective of our analysis was to be comprehensive of all potential experiences of income trajectories affected by breast cancer. By forgoing exclusion criteria at onset, we represent the whole population in our analysis, and by matching across many potential confounders and investigating heterogeneous effects we can reflect the comparative magnitude of economic burden across sub-populations and identify specific subgroups of interest who experience disproportionately high burden. Breast cancer has a high mortality rate, and by accounting for loss of income due to excess mortality in our secondary analysis we estimate the economic burden that could be alleviated by improvements in breast cancer survival. The methodological approach we take allows for comparison to other disease burdens in Denmark and comparison to other countries, where more narrow approaches are less directly comparable. Additionally, the comprehensive matching approach enables within-country comparison of heterogeneous effects, which is a valuable addition to the existing literature that uses Danish register data to study the economic burden of cancer.
As with any observational cohort study, potential confounding and the comparability of exposed and control groups strongly determine the validity of results. Comparisons of our primary results to those using different combinations of matching variables demonstrate this dependency. Matching on age alone or on age and baseline education substantially underestimates economic impacts on the non-retired population. Including income quantile, household size, and relationship status as matching variables at cohort entry in addition to age and education is crucial to mitigate confounding produced by socio-demographic characteristics. Personal income losses in the non-retired population are 5% lower than our primary results when matching on all variables except hospitalization history and CCI. Our results do not control for long-term behavioral risk factors such as smoking which could be confounders, but the small contribution of utilization and comorbidity to measured effects suggests that the omitted variable bias driven by behavioral risk is small. Additionally, the results of our matching analysis are consistent with those from the event study difference-in-difference, suggesting limited bias resulting from other omitted potential confounders.
There are some important limitations to our analysis. First, we measure the economic burden of breast cancer over a span of nearly two decades, during which screening and treatment for breast cancer have advanced to reduce mortality and improve early detection. Our results do not account for changes in the underlying constitution of the breast cancer population over this time period, nor do they stratify by the type or stage of breast cancer at diagnosis due to limitations in the available data. Future work is necessary to understand how disease prognosis and treatment outcomes determine income loss patterns. Second, disaggregation between sources of income into wages and welfare benefits, though relevant to the broader topic of economic burden, was beyond the scope of this analysis due to data limitations. Consideration of all types of disposable income best reflects our perspective which is concerned with individual experience of income loss. However, estimation of direct wage loss and separate measurement of the mitigating effects of welfare payment would be highly relevant information for policymakers. Furthermore, income losses due to cancer do not capture other measures of economic well-being such as wealth or the value of property. For the retired population which receives a fixed income, wealth may be a more important measure of economic burden.
To conclude, our study offers insights into the long-term economic impact of breast cancer on personal and household income in Denmark. While financial losses exist across the population, they are mitigated by robust social safety nets such as Denmark’s sick leave and disability policies, and household-level financial adaptations. However, certain subgroups, including students and individuals with poor baseline health, experience a disproportionate economic burden that persists long after diagnosis. Our comprehensive methodology captures heterogeneous effects across the population and adds rigor to the literature on the economic consequences of breast cancer, and we suggest that similar approaches, which control for confounding and follow a longer period, should be adopted in other country-level cost of illness studies where data allow. Future research should explore additional dimensions of economic well-being, such as wealth, and the effects of evolving cancer treatments.
Methods
This study complies with all relevant ethical regulations governing the use of register data for research, which are overseen by the Danish Data Protection Agency. Study design and data usage were approved by the Research and Innovation Organization at the University of Southern Denmark (number 11.181), and data access was overseen by Statistics Denmark. Per Danish legislation, informed consent is not required for the use of register data for research.
Data
We perform a cohort study using comprehensive observational data from the Danish administrative registries, which capture 4,348,187 female Danish residents over our observation period. Breast cancer patients are identified by the Danish Breast Cancer Registry. Membership in this registry is validated and captures the entire female population with invasive breast tumors18. We include all cases of breast cancer regardless of stage and subtype at diagnosis, as the register data does not distinguish these cancer characteristics at diagnosis from those recorded as the cancer progresses. Male breast cancer cases were excluded as they constitute less than 3% of breast cancer cases in Denmark and are not included in the Danish Breast Cancer Registry. Gender is not available in the Danish registers. Economic burden is sourced from the Danish Income Statistics Registry, and we collect disposable personal income and disposable equivalized household income, which adjusts total household earnings for the household size19,20. Disposable income was chosen as an outcome measure to reflect the ultimate financial experience of individuals with breast cancer; stratifications by income source deserve their own body of work and were considered out of scope for this analysis. Both income measures are inflation-adjusted to 2023 Euros21,22. Additional demographic, health, and education registries are used for identification of controls and modeling of potential socio-demographic confounders. A full list of the registries used in this analysis is provided in Table 2 below19,23. Definitions of each matching variable and detailed descriptions of the contents of each register are included in Supplementary Dataset 3.
Matching strategy
In all analyses, the exposed breast cancer group is defined as all females in Denmark with an incident breast cancer diagnosis between 2000 and 2015 and no history of breast cancer for at least 5 years prior to the date of diagnosis. A 5-year washout window was chosen to keep the length of washout consistent for each year of data, and to align with clinical guidelines, which reduce screening for recurrence after 5 years. Personal disposable and household equivalized disposable income (referred to as personal and household income henceforth) are measured in each year after the diagnosis year for 10 years, or until censoring or the last year of data (2018). Disposable income sources in Denmark can include net salary after taxes, retirement pension, or state-sponsored financial supports including student allowances, sick and parental leave, unemployment, and disability payments.
For each cohort of individuals diagnosed in a given year, we match exactly to an unexposed control population on age group, hospital utilization, Charlson Comorbidity Index (CCI), level of highest completed education, labor market attachment, income quantile, relationship status, and household size (see Supplementary Dataset 1)24. Matching is done one year prior to the year of diagnosis for the exposed group for all variables except hospital utilization and CCI, which are matched 2 years before diagnosis. Healthcare-related variables are matched in earlier periods to prevent confounding if individuals exposed to breast cancer seek treatment for symptoms before diagnosis. A diagram of the study design for matching is shown in Fig. 5.
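The exact-matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analysis was done in R), and the covariate names, the record layout, and the helper `person` are hypothetical choices made only for illustration. The sketch builds a matching key from socio-demographic covariates measured one year before diagnosis and health covariates measured two years before, then keeps only strata that contain both exposed and control individuals.

```python
from collections import defaultdict

# Hypothetical covariate names; the real definitions are in the supplementary data.
SOCIO_KEYS = ("age_group", "education", "labor_status", "income_quantile",
              "relationship", "household_size")
HEALTH_KEYS = ("hospital_use", "cci")


def stratum(p, diagnosis_year):
    """Exact-matching key for one person in one diagnosis-year cohort."""
    socio = p["covariates"][diagnosis_year - 1]   # measured 1 year before diagnosis
    health = p["covariates"][diagnosis_year - 2]  # measured 2 years before diagnosis
    return (tuple(socio[k] for k in SOCIO_KEYS)
            + tuple(health[k] for k in HEALTH_KEYS))


def exact_match(exposed, unexposed, diagnosis_year):
    """Group ids by stratum and drop strata that lack either exposed or controls."""
    groups = defaultdict(lambda: {"exposed": [], "control": []})
    for p in exposed:
        groups[stratum(p, diagnosis_year)]["exposed"].append(p["id"])
    for p in unexposed:
        groups[stratum(p, diagnosis_year)]["control"].append(p["id"])
    return {s: g for s, g in groups.items() if g["exposed"] and g["control"]}


def person(pid, age_group, cci, year=2005):
    """Hypothetical toy record with identical covariates in both lookback years."""
    cov = {"age_group": age_group, "education": "bachelor",
           "labor_status": "employed", "income_quantile": 3,
           "relationship": "single", "household_size": 1,
           "hospital_use": 0, "cci": cci}
    return {"id": pid, "covariates": {year - 1: cov, year - 2: cov}}


matched = exact_match(
    exposed=[person("e1", "50-54", 0), person("e2", "60-64", 1)],
    unexposed=[person("c1", "50-54", 0), person("c2", "50-54", 0),
               person("c3", "40-44", 0)],
    diagnosis_year=2005,
)
```

In this toy example, `e1` is matched to `c1` and `c2`, while `e2` and `c3` fall in strata with no counterpart and are dropped, mirroring the exclusion of individuals who cannot be matched to an eligible control.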
Statistics and reproducibility
In this observational cohort study, we compare income trajectories of our exposed female breast cancer population to matched control groups for 10 years following exposure. Only individuals who could not be matched to an eligible control were excluded, and the full female population in Denmark was studied, so sample size determination was irrelevant. As this is an observational study, no randomization was performed.
Measurement of causal effects across matched subgroups and over time was done using the following procedure. First, for each year i after the match year and within each subgroup, mean income is computed for the exposed and control groups, and the difference between the two means estimates the point-in-time conditional average causal effect (\({\mathrm{CACE}}_{i}\)), as shown in Eq. (1).
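Written out from the definitions given in the text, Eq. (1) can take a form such as the following. The explicit subgroup index s on CACE and the exposed-minus-control sign convention (so that an income loss appears as a negative value) are assumptions made for this sketch.

\[
{\mathrm{CACE}}_{i,s}=\frac{1}{{n}_{e,s}}\sum_{j=1}^{{n}_{e,s}}{Y}_{e,i,j}-\frac{1}{{n}_{c,s}}\sum_{k=1}^{{n}_{c,s}}{Y}_{c,i,k}
\]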
Where \({Y}_{e,i,j}\) is the income of the exposed individual j in subgroup s at time i, \({Y}_{c,i,k}\) is the income of the control individual k in subgroup s at time i, and \({n}_{e,s}\) and \({n}_{c,s}\) are the number of exposed and control individuals in subgroup s, respectively. Next, the point-in-time average causal effect of the exposed (\({\mathrm{ACX}}_{i}\)) is estimated by taking a weighted average of \({\mathrm{CACE}}_{i}\), where weights are determined by the sample size of exposed individuals in each subgroup, shown in Eq. (2).
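Following the description in the text, Eq. (2) can be written as a weighted sum over subgroups. As an assumption of this sketch, the conditional effect for subgroup s in year i is denoted \({\mathrm{CACE}}_{i,s}\) so that the subgroup index is explicit.

\[
{\mathrm{ACX}}_{i}=\sum_{s=1}^{S}{w}_{s}\,{\mathrm{CACE}}_{i,s}
\]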
Where \({w}_{s}=\frac{{n}_{e,s}}{{n}_{e}}\) is the weight for subgroup s, determined by the proportion of exposed individuals in that subgroup relative to the total number of exposed individuals \({n}_{e}\), and S is the total number of subgroups. Finally, cumulative average causal effects (ACE) are estimated by adding \({\mathrm{ACX}}_{i}\) over years 1 through 10. Cumulative conditional treatment effects are also estimated by taking stratified weighted averages of \({\mathrm{CACE}}_{i}\) and then adding across years. By summing individual year effects, we avoid exclusion of individuals who leave the sample during the period of observation, and instead we calculate each measured effect in the years after matching using the entire population in the data at that point in time. For each estimate, we report 95% confidence intervals constructed from the variance of each measured effect.
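The three-step procedure above (subgroup difference in means, exposure-weighted average, sum over years) can be sketched in a few lines of Python. This is an illustration only: the function names, the toy incomes, and the exposed-minus-control sign convention are assumptions, and confidence intervals are omitted.

```python
from statistics import mean


def cace(exposed_incomes, control_incomes):
    """Difference in mean income between exposed and controls in one subgroup."""
    return mean(exposed_incomes) - mean(control_incomes)


def acx(strata):
    """Weighted average of subgroup effects for one follow-up year.

    strata: list of (exposed_incomes, control_incomes) pairs, one per subgroup.
    Weights are each subgroup's share of the exposed individuals observed
    in that year.
    """
    n_exposed = sum(len(e) for e, _ in strata)
    return sum(len(e) / n_exposed * cace(e, c) for e, c in strata)


def cumulative_ace(strata_by_year):
    """Sum the yearly effects over all follow-up years."""
    return sum(acx(strata) for strata in strata_by_year.values())


# Toy incomes (thousand EUR) for two subgroups over two follow-up years.
# One exposed individual in the first subgroup leaves the sample after year 1.
strata_by_year = {
    1: [([30, 32], [33, 35, 34]), ([20], [22, 22])],
    2: [([32], [34, 34]), ([21], [22])],
}

yearly_effects = {year: acx(s) for year, s in strata_by_year.items()}
total_effect = cumulative_ace(strata_by_year)
```

Because each yearly effect is computed from whoever is observed in that year, the individual who leaves after year 1 still contributes to the year 1 estimate, which is the property the summation approach is designed to preserve.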
Handling of mortality
Differential mortality rates over time between the control and exposed populations, and within exposed subpopulations, have implications for the interpretation of cumulative income loss. Survival in each year is required to observe impacts on income in that year, but this skews the population distribution over time toward those who survive.
In our primary analysis, we condition on survival, so income loss measured in each year is not adjusted for the mortality rate in each match group. As a secondary analysis, we illustrate the effect of premature mortality alongside income losses in survivors, to demonstrate the theoretical value lost to early death at the population level. To do so, we treat years after death and before the Danish retirement age of 65 as years of zero income. For any death in the exposed or control groups that occurs during the observation period and before age 65, we impute zero personal income for each subsequent year until 2018, the end of 10 years of follow-up, or the year in which the individual would have reached age 65, whichever comes first. Deaths from any cause in the control or exposed population are included. This was only possible for personal income loss: household income losses would require identification of all income earners who constitute the household, and the approach could not be applied to equivalized household income because this measure is adjusted for the size of the household in each year.
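The imputation rule for the secondary analysis can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name and record layout are hypothetical, and two details are assumptions made here, namely that the year of death keeps its observed income and that imputation stops before the calendar year in which the person would have turned 65.

```python
RETIREMENT_AGE = 65    # Danish retirement age used in the text
LAST_DATA_YEAR = 2018  # last year of register data
MAX_FOLLOW_UP = 10     # years of follow-up after the index year


def income_with_mortality(observed, index_year, birth_year, death_year=None):
    """Return {year: income} over follow-up, imputing 0 after an early death.

    observed: {year: personal income} for years in which the person is alive.
    Years after death are imputed as 0.0 only while the person would have
    been younger than RETIREMENT_AGE (assumption: age = year - birth_year).
    """
    last_year = min(index_year + MAX_FOLLOW_UP, LAST_DATA_YEAR)
    series = {}
    for year in range(index_year + 1, last_year + 1):
        if death_year is None or year <= death_year:
            if year in observed:
                series[year] = observed[year]
        elif year - birth_year < RETIREMENT_AGE:
            series[year] = 0.0
        # Post-death years at or beyond retirement age are left out entirely.
    return series


# Hypothetical person: matched in 2010, born 1950, died in 2012.
example = income_with_mortality(
    observed={2011: 30000, 2012: 12000},
    index_year=2010, birth_year=1950, death_year=2012,
)
```

Here the years 2013 and 2014 receive zero income (ages 63 and 64), while 2015 onward is excluded because the person would have reached age 65.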
Unlike our primary results, this secondary analysis does not reflect income loss as experienced by individuals. The income losses it reports are exploratory and are intended to illuminate the outsized impact of mortality.
Matching analysis comparisons
Many existing studies on this topic match on a much more limited set of variables. Conversely, many behavioral risk factors including physical inactivity and alcohol usage history may be important but are not available in Danish registers1. To compare our results to similar studies, we also test exact matching on age alone, and on age and education, on the non-retired population in our dataset. To evaluate the omitted variable bias that could be present without accounting for behavior, we run models with all original covariates except hospital utilization and CCI, with the assumption that results sensitive to available health information would also be subject to bias. We also test a 5-year lookback period for matching on all variables in case of changes in income due to illness before an oncologist-confirmed breast cancer diagnosis. For all comparison models, we estimate total income loss in the non-retired population and the income loss in each year after diagnosis.
Event study difference-in-difference
To ensure our results are robust to unobserved confounding, we also conduct a difference-in-difference (DiD) event study analysis. The event study DiD design controls for time-invariant omitted variables that are not accounted for by matching alone, which may include behavioral risk patterns, industry and job type in the working population, and demographic characteristics such as the age at first child for mothers. The event study design mitigates the critique from Goodman-Bacon25 of DiD studies with variation in treatment timing by using fixed effects for each year before and after treatment, thereby explicitly controlling for treatment timing.
We conduct this sub-analysis using data matched in the year before diagnosis on all covariates used in the primary analysis except baseline income quintile. Baseline income quintile before diagnosis is excluded as a matching variable so as to avoid controlling for the first difference in DiD estimation between the breast cancer and unexposed populations. We measure income from five years before breast cancer or match year to 10 years following the matching. Fixed effects are included for the interaction between exposure and the lag or lead to event, for the calendar year, and for each of the matching covariates. Equation (3) shows the two-way fixed effects model used for event study difference-in-difference.
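Based on the fixed effects listed above and the symbol definitions that follow, Eq. (3) can be written in a form such as the one below. The event-time index k, the coefficients \({\beta }_{k}\), and the omitted reference period \({k}_{0}\) are notational assumptions introduced for this sketch.

\[
{Y}_{jt}=\sum_{k\ne {k}_{0}}{\beta }_{k}\left({\mathrm{EventTime}}_{jt}^{k}\times {\mathrm{Treat}}_{t}\right)+{\phi }_{j}+{\lambda }_{t}+{\epsilon }_{jt}
\]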
Where \({Y}_{jt}\) is the average income for individuals in group j and exposure status t, \({\mathrm{EventTime}}_{jt}\) is a dummy indicator for the year before or after the event year, \({\mathrm{Treat}}_{t}\) is an indicator for exposure status, \({\phi }_{j}\) is a matrix of covariate fixed effects, \({\lambda }_{t}\) is a year fixed effect, and \({\epsilon }_{jt}\) is a random error term, clustered by year. Because of data size and memory limitations, the event study model is estimated on data tabulated at the subgroup level and weighted by the sample size of each subgroup. Subgroups are defined along matching dimensions (for example, females living alone in the central region, aged 30–35, employed with bachelor’s degrees, with no comorbidity and no hospitalizations). Income loss due to breast cancer for each year is reflected in the coefficient on the interaction term between \({\mathrm{EventTime}}_{jt}\) and \({\mathrm{Treat}}_{t}\). Modeling was done in R using fixed effects ordinary least squares estimation26. Clustered standard errors from the DiD estimation were used to construct 95% confidence intervals for each coefficient value.
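To make the interpretation of the interaction coefficients concrete, the following Python sketch computes a deliberately simplified version of the event study on tabulated, weighted data. It is not the model estimated in the paper: it drops the covariate and calendar-year fixed effects and instead uses a saturated specification with event-time and exposure main effects, in which each interaction coefficient equals the exposed-minus-control gap at that event time minus the same gap in a reference period. The row layout, the toy numbers, and the choice of event time -1 as reference are assumptions.

```python
from collections import defaultdict


def weighted_means(rows):
    """Sample-size-weighted mean income per (event_time, treated) cell.

    rows: iterable of (event_time, treated, mean_income, n), one per subgroup.
    """
    total = defaultdict(float)
    weight = defaultdict(float)
    for event_time, treated, income, n in rows:
        total[(event_time, treated)] += income * n
        weight[(event_time, treated)] += n
    return {cell: total[cell] / weight[cell] for cell in total}


def event_study(rows, reference=-1):
    """Exposed-minus-control gap at each event time, relative to the reference."""
    means = weighted_means(rows)

    def gap(k):
        return means[(k, True)] - means[(k, False)]

    event_times = sorted({k for k, _ in means})
    return {k: gap(k) - gap(reference) for k in event_times if k != reference}


# Toy tabulated data: two subgroups (n = 10 and n = 30), incomes in thousand EUR.
rows = [
    (-2, True, 30.0, 10), (-2, False, 31.0, 10),
    (-1, True, 31.0, 10), (-1, False, 32.0, 10),
    (1, True, 28.0, 10), (1, False, 33.0, 10),
    (-2, True, 20.0, 30), (-2, False, 20.0, 30),
    (-1, True, 21.0, 30), (-1, False, 21.0, 30),
    (1, True, 18.0, 30), (1, False, 22.0, 30),
]

coefficients = event_study(rows)
```

In the toy data the pre-period coefficient at event time -2 is zero, consistent with parallel pre-trends, and the coefficient at event time 1 is -4, the estimated income change attributable to exposure in that year under this simplified specification.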
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Input data used in this analysis are provided by Statistics Denmark and the Danish Breast Cancer Group and are protected by the Danish Data Protection Agency. The raw input data are not publicly available due to data privacy laws. However, researchers can apply for access to Danish register data for scientific purposes if they meet provider requirements, including affiliation to a Danish research group and an approved project proposal. More information on this process and requirements is available at https://www.dst.dk/en/TilSalg/data-til-forskning. Data from the results of this analysis, including the heterogeneous effects by each matching stratum, are available in Supplementary Dataset 1. The source data for all figures and tables are included with this paper in the Source Data files (Supplementary Dataset 4).
Code availability
All code used in the statistical analysis and production of tables and figures for this manuscript is available at https://github.com/ekjohnsonphd/breast_cancer_income_loss27. Data processing and analysis were done using R version 4.4.3. Required packages: arrow_16.1.0, lubridate_1.9.3, forcats_1.0.0, stringr_1.5.1, dplyr_1.1.4, purrr_1.0.2, readr_2.1.5, tidyr_1.3.1, tibble_3.2.1, ggplot2_3.5.1, tidyverse_2.0.0, data.table_1.15.4, MatchIt_4.7.1, cowplot_1.1.3.
References
Coles, C. E. et al. The Lancet Breast Cancer Commission. Lancet https://doi.org/10.1016/S0140-6736(24)00747-5 (2024).
Arnold, M. et al. Current and future burden of breast cancer: Global statistics for 2020 and 2040. Breast 66, 15–23 (2022).
Mohammadpour, S. et al. A systemmatic literature review on indirect costs of women with breast cancer. Cost. Eff. Resour. Alloc. 20, 68 (2022).
Chen, S. et al. Estimates and Projections of the Global Economic Cost of 29 Cancers in 204 Countries and Territories From 2020 to 2050. JAMA Oncol. 9, 465–472 (2023).
Huang, M. et al. Economic and Humanistic Burden of Triple-Negative Breast Cancer: A Systematic Literature Review. PharmacoEconomics 40, 519–558 (2022).
Stuart, E. A. Matching methods for causal inference: A review and a look forward. Stat. Sci. 25, 1–21 (2010).
Parikh, H., Rudin, C. & Volfovsky, A. MALTS: matching after learning to stretch. J. Mach. Learn. Res. 23, 10993 (2022).
Styrelsen for Arbejdsmarked og Rekruttering. Sygedagpenge. https://www.borger.dk/arbejde-dagpenge-ferie/Dagpenge-kontanthjaelp-og-sygedagpenge/sygedagpenge. (Accessed in November 2024).
Udbetaling Danmark. Førtidspension. https://www.borger.dk/pension-og-efterloen/Foertidspension-oversigt. (Accessed in November 2024).
Styrelsen for Arbejdsmarked og Rekruttering. Kontanthjælp - er du under 30 år og har du en uddannelse? https://www.borger.dk/arbejde-dagpenge-ferie/Dagpenge-kontanthjaelp-og-sygedagpenge/Kontanthjaelp/Kontanthjaelp-under-30-med-uddannelse. (Accessed in November 2024).
Styrelsen for Arbejdsmarked og Rekruttering. Arbejdsløshedsdagpenge. https://www.borger.dk/arbejde-dagpenge-ferie/Dagpenge-kontanthjaelp-og-sygedagpenge/Arbejdsloeshedsdagpenge. (Accessed in November 2024).
Knutsen, O. The Nordic Models in Political Science: Challenged, but Still Viable? (Fagbokforlaget, 2017).
Heinesen, E. & Kolodziejczyk, C. Effects of breast and colorectal cancer on labour market outcomes—Average effects and educational gradients. J. Health Econ. 32, 1028–1042 (2013).
Andersen, I., Kolodziejczyk, C., Thielen, K., Heinesen, E. & Diderichsen, F. The effect of breast cancer on personal income three years after diagnosis by cancer stage and education: a register-based cohort study among Danish females. BMC Public Health 15, 50 (2015).
Jensen, L. S. et al. The long-term financial consequences of breast cancer: a Danish registry-based cohort study. BMC Public Health 17, 853 (2017).
Spanggaard, M., Olsen, J., Jensen, K. F. & Anderson, M. Cost of illness of HER2-positive and metastatic and recurrent HER2-positive breast cancer - a Danish register-based study from 2005 to 2016. BMC Health Serv. Res. 22, 745 (2022).
Danmarks Statistik. Documentation of statistics: Household Budget Survey. https://www.dst.dk/en/Statistik/dokumentation/documentationofstatistics/household-budget-survey. (Accessed in April 2025).
Cronin-Fenton, D. P. et al. Validity of Danish Breast Cancer Group (DBCG) registry data used in the predictors of breast cancer recurrence (ProBeCaRe) premenopausal breast cancer cohort study. Acta Oncol. 56, 1155–1160 (2017).
Petersson, F., Baadsgaard, M. & Thygesen, L. C. Danish registers on personal labour market affiliation. Scand. J. Public Health 39, 95–98 (2011).
Ejlskov, L. & Plana-Ripoll, O. Income in epidemiological research: a guide to measurement and analytical treatment with a case study on mental disorders and mortality. J. Epidemiol. Community Health jech-2024-223206 https://doi.org/10.1136/jech-2024-223206 (2025).
Danmarks Statistik. Documentation of statistics: Consumer Price Index. https://www.dst.dk/en/Statistik/dokumentation/documentationofstatistics/consumer-price-index. (Accessed in November 2024).
Danmarks Statistik. Exchange rates. https://www.dst.dk/en/Statistik/emner/oekonomi/finansielle-markeder/valutakurser. (Accessed in November 2024).
Jensen, V. M. & Rasmussen, A. W. Danish education registers. Scand. J. Public Health 39, 91–94 (2011).
Charlson, M. E., Pompei, P., Ales, K. L. & MacKenzie, C. R. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J. Chronic Dis. 40, 373–383 (1987).
Goodman-Bacon, A. Difference-in-differences with variation in treatment timing. J. Econ. 225, 254–277 (2021).
Bergé, L. Efficient estimation of maximum likelihood models with multiple fixed-effects: the R package FENmlm. CREA Discussion Papers (CREA, 2018).
Johnson, E. K. ekjohnsonphd/breast_cancer_income_loss: Public release of analysis code and data, breast cancer and income loss. Zenodo https://doi.org/10.5281/ZENODO.16527653 (2025).
Acknowledgements
Funding for this work was provided by grants from EU Marie Skłodowska-Curie Actions (MC), Helsefonden and Danmarks Frie Forskningsfond (DFF), all awarded to A.Y.C. Salary support for A.Y.C. was covered by MC, salary support for E.K.J. was covered by Helsefonden, and purchase of data and salary support for L.S. was provided by DFF. All other authors received no financial support for their contributions to the project. Financial sponsors for this work had no role in study design, data analysis and interpretation, or drafting of the manuscript.
Author information
Contributions
A.Y.C. secured funding and conceived the study. All authors (E.K.J., H.P., K.R.O., A.Y.C., L.S.) designed the methodology and validated the results. E.K.J. conducted the analysis and interpretation and drafted the manuscript. All authors contributed to the revision of the manuscript and approved the final version. All authors agree to be accountable for all aspects of the work.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Peter Hall, who co-reviewed with Giovanni Tramonti, and the other anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Johnson, E.K., Parikh, H., Olsen, K.R. et al. Breast cancer and income loss in Denmark: heterogeneous outcomes and longitudinal effects. Nat Commun 16, 11576 (2025). https://doi.org/10.1038/s41467-025-66524-y