Main

Iron deficiency is common in patients with heart failure (HF) and associated with more severe symptoms, impaired quality of life (QoL) and exercise capacity, an increase in hospitalizations, particularly for HF, and a higher mortality1,2,3,4. Both American and European guidelines recommend intravenous (i.v.) iron therapy to improve symptoms and QoL in patients with HF and a reduced left ventricular ejection fraction (LVEF) (HFrEF) and iron deficiency5,6, but uncertainty persists about the effects of i.v. iron on hospitalizations for HF and mortality7,8,9,10,11,12.

Both the AFFIRM-AHF (a randomized, double-blind placebo-controlled trial comparing the effect of intravenous ferric carboxymaltose on hospitalizations and mortality in iron-deficient subjects admitted for acute heart failure) and IRONMAN (effectiveness of intravenous iron treatment versus standard care in patients with heart failure and iron deficiency) trials narrowly missed their primary endpoints of recurrent HF hospitalization and cardiovascular death7,10, but statistical significance was observed after applying pre-specified analyses to mitigate the effects of the COVID-19 pandemic. Using a pre-specified 99% confidence interval (CI), the HEART-FID (ferric carboxymaltose in heart failure with iron deficiency) trial did not show a significant effect of i.v. iron on the composite of mortality, recurrent HF hospitalization and 6-min walking distance, although the result was significant using a conventional 95% CI11.

Several meta-analyses have previously been reported13,14,15,16, concluding that i.v. iron probably reduces the risk of HF hospitalizations but has little effect on cardiovascular or all-cause mortality. Some analyses suggest that the benefits of i.v. iron might be greater in those with a transferrin saturation (TSAT) < 20% and that this criterion alone should be used to define iron deficiency15. Furthermore, a Bayesian meta-analysis suggested residual uncertainty about the effects of i.v. iron on the composite of recurrent HF hospitalization and cardiovascular mortality13.

The most recent trial, FAIR-HF2, provides substantial additional information that may help address these uncertainties12. Therefore, we performed an updated systematic review and meta-analysis to estimate the effect of i.v. iron on key clinical outcomes in patients with HF and iron deficiency overall, as well as in key subgroups.

Results

Search and study characteristics

The initial search identified 572 potentially relevant articles, of which 6 randomized trials met the inclusion criteria (Extended Data Fig. 1). These trials enrolled 7,175 patients, including 3,672 randomized to i.v. iron and 3,503 to control groups. The median duration of follow-up ranged from 6 months to 32 months. The mean or median participant age ranged from 67 years to 74 years and most were men (64%). At baseline, about half the patients were anemic and the percentage with a TSAT < 20% ranged from 40% (HEART-FID) to 83% (AFFIRM-AHF) (Table 1). Five of the six trials used ferric carboxymaltose as the i.v. iron formulation, whereas IRONMAN used ferric derisomaltose. Supplementary Table 1 summarizes the key characteristics of the populations in the included trials.

Table 1 Baseline characteristics of the included trials

All trials in this meta-analysis were multicenter and randomized. Although five trials were double blind, IRONMAN was open label; however, its primary endpoint (HF hospitalization and cardiovascular death) was assessed through blinded outcome adjudication to minimize bias. The included randomized controlled trials demonstrated good methodological quality with a low risk of bias in key domains such as randomization, allocation concealment and outcome assessment. Supplementary Fig. 1 provides an overview of the quality assessment across the included studies.

Primary endpoint

Compared with patients assigned to control groups, those assigned to i.v. iron had significantly lower rates for the primary composite endpoint by 12 months (RR = 0.72 (95% CI = 0.55–0.89), (posterior) tail probability (PB) = 0.007, I2 = 47%; Fig. 1) and at the complete length of follow-up (RR = 0.81 (95% CI = 0.63–0.97), PB = 0.022, I2 = 46%; Fig. 2). Sensitivity analysis using the Knapp–Hartung method yielded similar results for both 12 months (RR = 0.73 (95% CI = 0.60–0.89), P = 0.010) and the complete length of follow-up (RR = 0.81 (95% CI = 0.67–0.98), P = 0.035).
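The Knapp–Hartung sensitivity analysis can be illustrated with a minimal sketch. It assumes that per-trial log rate ratios and their standard errors are already available; the six values below are hypothetical placeholders, not the trial results, and all function names are illustrative. The between-trial variance is estimated with the Paule–Mandel method named in Methods, and the t critical value is passed in rather than computed because the Python standard library has no t-distribution quantile function.

```python
import math

def q_statistic(y, v, tau2):
    """Generalized Q statistic for a candidate between-trial variance tau2."""
    w = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))

def paule_mandel_tau2(y, v, n_iter=100):
    """Paule-Mandel estimate: the tau2 >= 0 at which Q(tau2) equals k - 1."""
    k = len(y)
    if q_statistic(y, v, 0.0) <= k - 1:
        return 0.0  # no excess heterogeneity
    lo, hi = 0.0, 1.0
    while q_statistic(y, v, hi) > k - 1:  # Q decreases as tau2 grows
        hi *= 2.0
    for _ in range(n_iter):  # bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if q_statistic(y, v, mid) > k - 1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def knapp_hartung(y, se, t_crit):
    """Pooled ratio with a Knapp-Hartung 95% CI.

    y: per-trial log ratios; se: their standard errors;
    t_crit: 0.975 quantile of the t distribution with k - 1 degrees of freedom.
    Returns (ratio, lower, upper, tau2).
    """
    v = [s * s for s in se]
    k = len(y)
    tau2 = paule_mandel_tau2(y, v)
    w = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Knapp-Hartung: rescale the usual variance 1/sum(w) by Q / (k - 1)
    scale = q_statistic(y, v, tau2) / (k - 1)
    se_mu = math.sqrt(scale / sum(w))
    return (math.exp(mu), math.exp(mu - t_crit * se_mu),
            math.exp(mu + t_crit * se_mu), tau2)

# Hypothetical inputs for six trials (NOT the data analyzed in this article)
hypothetical_log_rr = [math.log(r) for r in (0.50, 0.60, 0.75, 0.80, 0.95, 1.00)]
hypothetical_se = [0.30, 0.25, 0.10, 0.12, 0.08, 0.09]

# 2.5706 is the 0.975 t quantile for 5 degrees of freedom (6 trials)
print(knapp_hartung(hypothetical_log_rr, hypothetical_se, t_crit=2.5706))
```

This sketch uses the unmodified Knapp–Hartung variance; some software applies an additional ad hoc correction that prevents the interval from becoming narrower than the conventional random-effects interval, and whether that option was used here is not stated.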

Fig. 1: The effect of i.v. iron on the composite endpoint of total (first and recurrent) HF hospitalizations and cardiovascular mortality for the first 12 months of follow-up (PB = 0.007, I2 = 47%).

The Forest plot illustrates the impact of i.v. iron on the composite endpoint of total (first and recurrent) HF hospitalizations and cardiovascular mortality during the first 12 months of follow-up using a Bayesian random-effects meta-analysis. Data are presented as RRs with 95% CIs. Sensitivity analyses were conducted by omitting the FAIR-HF and CONFIRM-HF studies, through the application of alternative HN priors and using the Knapp–Hartung (KnHa) approach to random-effects meta-analysis. n or N, no. of events or no. of participants in the group, respectively. The blue color indicates analysis using six trials and the red color analysis using four trials.

Fig. 2: The effect of i.v. iron on the composite endpoint of total (first and recurrent) HF hospitalizations and cardiovascular mortality over the complete length of follow-up (PB = 0.022, I2 = 46%).

The Forest plot illustrates the impact of i.v. iron on the composite endpoint of total (first and recurrent) HF hospitalizations and cardiovascular mortality over the complete length of follow-up using a Bayesian random-effects meta-analysis. The data are presented as RRs with 95% CIs. Sensitivity analyses were conducted by omitting the FAIR-HF and CONFIRM-HF studies, through the application of alternative HN priors and using the KnHa approach to random-effects meta-analysis. The blue color indicates analysis using six trials and the red color analysis using four trials.

Key secondary endpoints

Recurrent HF hospitalizations

Compared with patients assigned to control groups, those assigned to i.v. iron had significantly lower rates for recurrent HF hospitalizations by 12 months (RR = 0.69 (95% CI = 0.48–0.88), PB = 0.009, I2 = 56%; Fig. 3) and at the complete length of follow-up (RR = 0.78 (95% CI = 0.55–0.98), PB = 0.028, I2 = 59%; Fig. 4). Sensitivity analysis using the Knapp–Hartung method yielded similar results for both 12 months (RR = 0.67 (95% CI = 0.49–0.91), P = 0.021) and complete length of follow-up (RR = 0.74 (95% CI = 0.52–1.06), P = 0.081).

Fig. 3: The effect of i.v. iron on recurrent HF hospitalizations for the first 12 months of follow-up (PB = 0.009, I2 = 56%).

The Forest plot illustrates the impact of i.v. iron on recurrent HF hospitalizations during the first 12 months of follow-up using a Bayesian random-effects meta-analysis. Data are presented as RRs with 95% CIs. Sensitivity analyses were conducted by omitting the FAIR-HF and CONFIRM-HF studies, through the application of alternative HN priors and using the KnHa approach to random-effects meta-analysis. The blue color indicates analysis using six trials and the red color analysis using four trials.

Fig. 4: The effect of i.v. iron on recurrent HF hospitalizations over the complete length of follow-up (PB = 0.028, I2 = 59%).

The Forest plot shows the effect of i.v. iron on recurrent HF hospitalizations over the complete length of follow-up using a Bayesian random-effects meta-analysis. Data are presented as RRs with 95% CIs. Sensitivity analyses were conducted by omitting the FAIR-HF and CONFIRM-HF studies, through the application of alternative HN priors and using the KnHa approach to random-effects meta-analysis. The blue color indicates analysis using six trials and the red color analysis using four trials.

All-cause and cardiovascular mortality

By 12 months, compared with patients assigned to control groups, those assigned to i.v. iron tended to have lower rates for cardiovascular (HR = 0.80 (95% CI = 0.61–1.03), PB = 0.071, I2 = 24%; Extended Data Fig. 2) and all-cause (HR = 0.82 (95% CI = 0.65–1.03), PB = 0.073, I2 = 25%; Extended Data Fig. 3) mortality. Sensitivity analysis using the Knapp–Hartung method yielded similar results (HR = 0.82 (95% CI = 0.65–1.02), P = 0.068) and (HR = 0.83 (95% CI = 0.69–1.01), P = 0.060), respectively. At the complete length of follow-up, these trends were attenuated (HR = 0.87 (95% CI = 0.73–1.04), PB = 0.096, I2 = 16%; Extended Data Fig. 4) and (HR = 0.92 (95% CI = 0.80–1.07), PB = 0.221, I2 = 16%; Extended Data Fig. 5), respectively. Sensitivity analysis using the Knapp–Hartung method again yielded similar results (HR = 0.87 (95% CI = 0.74–1.02), P = 0.070) and (HR = 0.92 (95% CI = 0.81–1.05), P = 0.162), respectively.

Safety endpoints

Those randomized to i.v. iron or a control group had a similar incidence of infection (odds ratio (OR) = 1.02 (95% CI = 0.66–1.59)) and serious adverse events (OR = 0.91 (95% CI = 0.70–1.15)) (Supplementary Figs. 2 and 3).

Subgroup analysis

No statistically significant treatment interactions were observed for the primary composite endpoint when patients were stratified according to age, ischemic versus nonischemic etiology, New York Heart Association (NYHA) class, estimated glomerular filtration rate (eGFR), hemoglobin (Hb), ferritin or TSAT (Table 2 and Supplementary Figs. 4–11). However, men appeared to obtain greater benefit than women (Supplementary Fig. 12). There was a significant interaction for sex (ratio of RRs (RRR) = 1.40 (95% CI = 1.05–1.86), PB = 0.025, I2 = 23%), with women on average showing no benefit (RR = 0.98 (95% CI = 0.75–1.26)).
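To make the ratio of RRs concrete, the following minimal sketch shows how a within-trial interaction estimate could be derived from two subgroup estimates. It assumes that the subgroups are independent, that each reported interval is symmetric on the log scale, and that the ratio is taken as the first subgroup over the second; the input numbers are hypothetical and the function names are illustrative. It is not the pooled Bayesian calculation reported here, which combines trial-level log(RRR) values.

```python
import math

Z95 = 1.959964  # 0.975 quantile of the standard normal distribution

def log_se_from_ci(lower, upper, z=Z95):
    """Standard error on the log scale recovered from a 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def ratio_of_ratios(rr_a, ci_a, rr_b, ci_b, z=Z95):
    """Ratio of two subgroup ratios (a over b) with an approximate 95% CI."""
    log_rrr = math.log(rr_a) - math.log(rr_b)
    # Variances add because the two subgroups are assumed independent
    se = math.sqrt(log_se_from_ci(*ci_a) ** 2 + log_se_from_ci(*ci_b) ** 2)
    return (math.exp(log_rrr),
            math.exp(log_rrr - z * se),
            math.exp(log_rrr + z * se))

# Hypothetical single-trial subgroup estimates (NOT values from this article)
rrr, lower, upper = ratio_of_ratios(1.00, (0.70, 1.43), 0.70, (0.55, 0.89))
print(f"RRR = {rrr:.2f} (95% CI = {lower:.2f}-{upper:.2f})")
```

An RRR above 1 in this convention means that the first subgroup benefits less than the second; an interval that excludes 1 corresponds to a nominally significant interaction.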

Table 2 Summary table of subgroup analysis (considering the complete length of follow-up)

Sensitivity analyses

Extended Data Tables 2 and 3 contrast the results based on the half-normal (HN) prior (with scale 0.5) with those from more optimistic (scale 0.1) and more conservative (scale 1.0) alternatives. The results were largely similar across all three priors. Extended Data Tables 4 and 5 show the results for the primary endpoint and key secondary endpoints after exclusion of the FAIR-HF and CONFIRM-HF trials, respectively; again, the results remained largely similar. The sensitivity analyses using a primary endpoint of time to first event for cardiovascular mortality and HF hospitalization are shown in Supplementary Figs. 13 and 14.
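What the three HN scales imply for the between-trial heterogeneity τ can be illustrated with a short calculation. A half-normal variable with scale s is the absolute value of a normal variable with mean 0 and standard deviation s, so its p-quantile is s times the standard normal quantile at (1 + p)/2. The function name is illustrative, and the interpretation of τ as a standard deviation on the log ratio scale is taken from the description of the model in Methods.

```python
from statistics import NormalDist

def half_normal_quantile(p, scale):
    """p-quantile of a half-normal distribution with the given scale."""
    return scale * NormalDist().inv_cdf((1.0 + p) / 2.0)

# Prior median and 95th percentile of tau for each scale used in the analyses
for scale in (0.1, 0.5, 1.0):
    median = half_normal_quantile(0.50, scale)
    upper = half_normal_quantile(0.95, scale)
    print(f"HN({scale}): prior median tau = {median:.3f}, 95th percentile = {upper:.3f}")
```

A smaller scale concentrates prior mass near τ = 0 and therefore assumes that trials are more alike, whereas a larger scale allows for more between-trial variation.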

Discussion

This meta-analysis, comprising >7,000 patients, suggests that i.v. iron reduces the composite endpoint of total (first and recurrent) HF hospitalizations and cardiovascular mortality in patients with HF, LVEF < 50% and iron deficiency. The event rate reduction was 28% at 12 months and 19% for all available follow-ups. Both components of the primary endpoint contributed to these outcomes. The directionally positive results for all-cause mortality at 12 months and for all follow-ups document overall safety of i.v. iron therapy in patients with HF.
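The percentage reductions quoted above follow directly from the pooled ratios reported in Results (RR = 0.72 at 12 months and RR = 0.81 over the complete follow-up). The short sketch below shows the conversion; the function name is illustrative.

```python
def percent_reduction(ratio):
    """Relative reduction in event rate, in percent, implied by a ratio."""
    return (1.0 - ratio) * 100.0

# Pooled RRs for the primary composite endpoint, as reported in Results
for label, rr in (("12 months", 0.72), ("complete follow-up", 0.81)):
    print(f"{label}: RR = {rr:.2f} -> {percent_reduction(rr):.0f}% reduction")
```

Applying the same conversion to the bounds of a 95% CI gives the range of reductions compatible with the data; for example, bounds of 0.55 and 0.89 correspond to reductions of 45% and 11%.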

Benefit was observed most clearly during the first year of follow-up, a finding that may be explained, at least in part, by disruptions caused by the COVID-19 pandemic. In addition, we speculate that this reflects the higher doses of i.v. iron given early in each trial, which led to complete correction of iron deficiency. Later in the trials (that is, after >12 months of follow-up, which was performed in IRONMAN, HEART-FID and FAIR-HF2), doses of i.v. iron were substantially lower and adherence to therapy tended to be much lower than intended. The average dose of i.v. iron in the first 12 months was approximately 2,000 mg, whereas, in years 2 and 3 of these trials, it was only 300–900 mg per year. It is important to highlight that the initial dose of i.v. iron varied across trials. For most trials, most of the i.v. iron was administered during the first 4–6 weeks, with few patients receiving doses thereafter (Extended Data Table 1). This might also explain the larger treatment effect observed with i.v. iron during the first year after randomization. Maintaining iron repletion throughout follow-up by further doses of i.v. iron, which was often impossible during COVID lockdown periods, might have prevented the attenuation of longer-term benefits of i.v. iron14,17. This deserves further exploration in future clinical trials.

Subgroup analyses, with the exception of sex, revealed no significant differences in the effect of i.v. iron on the primary endpoint, including those with a baseline TSAT < 20%. The effect of i.v. iron appeared to be greater in men. Compared with previous meta-analyses, an additional substantial trial (FAIR-HF2, with 1,105 patients) was included. Furthermore, individual-patient data were available from five trials, enabling application of identical subgroup definitions; it also allowed for harmonized analysis methods using the approach of the IRONMAN trial as an example and then applying both Bayesian and conservative frequentist approaches to determine treatment effects overall and in subgroups. The present analysis is, therefore, more robust than previous meta-analyses13,14,15,16.

Several trials have shown that i.v. iron improves symptoms and QoL in patients with HFrEF and iron deficiency9,18. This meta-analysis provides evidence that these improvements in well-being, which are of paramount importance to patients, are reflected in a reduction in HF hospitalizations, which not only cause substantial distress for patients but also place significant financial and logistical burdens on healthcare systems19.

Some reports have suggested that a TSAT < 20%, rather than serum ferritin, may be a better way to define iron deficiency and identify patients who derive greater benefit from i.v. iron11,20,21. Indeed, higher serum ferritin concentrations are associated with worse outcomes in patients with HF, probably because serum ferritin increases with inflammation, which may disguise the presence of iron deficiency3. It remains uncertain which blood tests most accurately reflect iron deficiency. Indeed, it is likely that some patients without iron deficiency were included in the trials and this may have diluted the benefits of i.v. iron supplementation. Importantly, FAIR-HF2 was the first trial to pre-specify an analysis of the effects of i.v. iron on the subgroup of patients with a TSAT < 20% as part of its primary outcome. Although patients with a TSAT < 20% had, overall, a higher rate of events, the effect of i.v. iron in relative terms was similar for those with a TSAT above or below 20%. This meta-analysis also failed to show a statistically significant interaction between the effects of i.v. iron on the primary outcome and TSAT. However, patients with a TSAT < 20% did have a worse prognosis and so, even if the relative benefits for those with a TSAT above or below 20% are similar, the absolute benefit will be greater for patients with a TSAT < 20%. These results help inform the debate on whether TSAT < 20% should be the sole criterion for identifying iron deficiency in patients with HF.
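The distinction between relative and absolute benefit can be made concrete with a small worked example. The control-group event rates below are hypothetical and chosen only to show the arithmetic; the ratio of 0.81 is the pooled estimate for the complete follow-up, applied to both subgroups purely for illustration.

```python
def absolute_rate_reduction(control_rate, rate_ratio):
    """Events avoided per 100 patient-years for a given control rate and ratio."""
    return control_rate * (1.0 - rate_ratio)

RATE_RATIO = 0.81  # pooled RR over the complete follow-up, as reported in Results

# Hypothetical control-group rates (events per 100 patient-years); NOT trial data
hypothetical_control_rates = {"TSAT < 20%": 40.0, "TSAT >= 20%": 20.0}

for subgroup, rate in hypothetical_control_rates.items():
    avoided = absolute_rate_reduction(rate, RATE_RATIO)
    print(f"{subgroup}: {avoided:.1f} events avoided per 100 patient-years")
```

With an identical relative effect, the subgroup whose baseline event rate is twice as high avoids twice as many events in absolute terms.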

Previous analyses have suggested that patients with a nonischemic cause for HF might not benefit from i.v. iron22. We could not confirm this, nor did we observe any difference in effects based on age, NYHA class, eGFR, hemoglobin or ferritin. It is interesting that we did observe a statistically significant subgroup interaction according to sex, with women possibly deriving less benefit. This might be a chance finding or confounded by differences in patient characteristics such as age, underlying ischemic heart disease (IHD) or iron deficiency markers, and was not observed consistently across trials.

There are also concerns about the safety of giving high amounts of i.v. iron, but the recent FAIR-HF2 trial found that higher cumulative dosing with i.v. iron was safe and well tolerated. The current meta-analysis identified no safety concerns with infections or other serious adverse events, for which rates were similar in the control and i.v. iron groups. This is in contrast to reports from the IRONMAN trial, which showed trends toward fewer infection-related events with i.v. ferric derisomaltose (a pre-specified endpoint)23,24.

Some limitations to this analysis should be considered. The absence of individual participant data from the IRONMAN trial limited the ability to adjust for covariates for additional subgroup analyses. Also, there was heterogeneity among trials in terms of i.v. iron formulations, dose used, whether patients were enrolled in or out of hospital and national differences in characteristics or healthcare services that might affect hospitalization rates. However, showing similar effects across diverse populations might also be considered a strength of this analysis. Last, two of the six trials included (FAIR-HF and CONFIRM-HF) did not have a clinical outcome as the primary endpoint and the findings from these trials may be uncertain, given the wide CIs.

In conclusion, the totality of evidence suggests that treating iron deficiency in patients with HFrEF with i.v. iron significantly reduces the composite outcome of recurrent HF hospitalizations and cardiovascular mortality, which reflected reductions in both components, but particularly HF hospitalizations. Treatment effects were greatest in the first year after randomization and consistent across various subgroups, including baseline TSAT. Further research is needed to confirm whether women obtain less benefit from i.v. iron and, if so, why.

Methods

This meta-analysis adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations25. Ethical committee approval was not required because all analyses were based on existing data. The protocol was registered in PROSPERO before data extraction and analysis (registration no. CRD42025635165).

Data sources and search strategy

A comprehensive search of MEDLINE and Scopus was conducted, without language restrictions, from the inception of these databases through the first week of January 2025, by two independent investigators (M.S.K. and K.M.T.). The detailed search strategy is provided in Supplementary Table 2. To ensure that no relevant publications were overlooked, the search was supplemented with a review of ClinicalTrials.gov and references in recent reviews and meta-analyses. All retrieved articles were imported into Endnote X7 (Clarivate Analytics) to identify and remove duplicates. Titles and abstracts were initially screened, followed by a full-text review to confirm eligibility. The two independent reviewers (M.S.K. and K.M.T.) evaluated the studies, with any disagreements resolved through discussion with a third reviewer (S.D.A.).

Inclusion criteria

Randomized trials comparing i.v. iron with placebo or standard or usual care in adults with HF, iron deficiency and a left ventricular ejection fraction (LVEF) ≤50% reporting HF hospitalizations and mortality that enrolled ≥200 patients and lasted ≥24 weeks were included.

Data extraction and risk-of-bias assessment

Relevant data were extracted into an Excel spreadsheet. Risk of bias was evaluated by two authors (M.S.K. and K.M.T.) using the Cochrane risk-of-bias tool26, focusing on random sequence generation, allocation concealment, blinding of participants or personnel and outcomes, completeness of outcome data and selective reporting. Each trial was classified as having a low, high or unclear risk of bias for each domain.

Outcomes and subgroups

The primary endpoint was the composite of recurrent HF hospitalizations (total events) or cardiovascular death (1) within 12 months of randomization and (2) during the entire follow-up. A composite of recurrent HF hospitalizations or cardiovascular mortality was chosen as the primary endpoint because it was the primary endpoint for AFFIRM-AHF, IRONMAN and FAIR-HF2, a key secondary endpoint in HEART-FID, as well as for several previous meta-analyses. It is also worth noting that the same or similar primary endpoint has been used in many other large HF trials that have shaped international practice and guidelines. Key secondary endpoints for this analysis included total HF hospitalizations, cardiovascular mortality and all-cause mortality (1) within 12 months of randomization and (2) during the entire follow-up period. Safety endpoints included serious adverse events or hospitalizations resulting from infections within 12 months of randomization and over the entire follow-up period.

We obtained patient-level data for FAIR-HF, FAIR-HF2, CONFIRM-HF, AFFIRM-AHF and HEART-FID trials and applied the analysis methods and subgroup definitions of the IRONMAN trial, for which only trial-level data are available at this time. The subgroup analyses for the primary endpoint focused on sex, age (<69.4 versus ≥69.4 years), etiology of HF (ischemic versus nonischemic), TSAT (<20% versus ≥20%), eGFR (calculated using the Chronic Kidney Disease Epidemiology Collaboration equation, ≤60 versus >60 ml min−1 1.73 m−2), hemoglobin (<11.8 versus ≥11.8 g dl−1), ferritin (<35 versus ≥35 μg l−1) and NYHA class (II versus III + IV). The cut-offs for these subgroups were taken from the IRONMAN subgroup analyses. Outcomes at 12 months for the IRONMAN trial were extracted from the IRONMAN publication’s Supplementary Material Table S3 (ref. 7).

Statistical analysis

Using the same statistical methods as the IRONMAN trial, the FAIR-HF, CONFIRM-HF, AFFIRM-AHF, HEART-FID and FAIR-HF2 trials were re-analyzed using the Lin–Wei–Yang–Ying model for (1) the composite outcome of total (first and recurrent) HF hospitalizations and cardiovascular death and (2) total HF hospitalizations alone27. The primary analyses in IRONMAN had been adjusted for recruitment context (hospital admission or outpatient). As each of the other trials was conducted in only one of these contexts, their analyses were not adjusted for recruitment context. However, the re-analyses were adjusted for region because these trials were conducted internationally, whereas IRONMAN was conducted in the United Kingdom only. Time-to-event analyses utilized Cox’s proportional hazards regressions extracted from the publications.

Random-effects meta-analyses were conducted on aggregated data using the normal–normal hierarchical model (NNHM) within a Bayesian framework28. This approach, in contrast to frequentist meta-analysis, treats both data and model parameters as random variables, incorporates prior distributions, accounts for uncertainty in estimating between-trial heterogeneity and allows sensitivity analyses by adjusting distributional assumptions and incorporating prior knowledge. Measures of effect included HRs for time-to-event outcomes (for example, cardiovascular and all-cause mortality) and RRs for recurrent events (for example, total HF hospitalizations with or without cardiovascular death). A weakly informative prior for between-trial heterogeneity (τ), specifically an HN prior with a scale of 0.5, was applied29, whereas uninformative priors were used for treatment and interaction effects. Sensitivity analyses using alternative HN prior scales of 0.1 and 1.0 were also conducted to assess the robustness of the results. We also conducted a sensitivity analysis by excluding the FAIR-HF and CONFIRM-HF trials because they were relatively smaller trials focusing on exercise capacity and symptoms. Results were summarized by marginal posterior medians of the log(RR), log(HR), log(RRR) and the between-trial heterogeneity τ. Between-trial heterogeneity was visualized in Forest plots. Bayesian meta-analyses were conducted using the R package bayesmeta30. As supporting analyses, frequentist analyses were also performed, using the Knapp–Hartung approach to random-effects meta-analysis with the Paule–Mandel estimator for the between-trial heterogeneity31,32. P values were reported for the frequentist meta-analyses. The closest equivalent to a P value that can be computed from a Bayesian analysis is the corresponding PB, which was also reported. All analyses were performed using R (v.4.4 or higher) or SAS (v.9.4 or higher).
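The structure of the NNHM can be sketched with a simple grid approximation. This is not the bayesmeta implementation and is written in Python rather than R; it assumes a flat prior on the overall effect, a half-normal prior on τ truncated at a hypothetical upper limit of 3.0, equal-tailed credible intervals, and a tail probability defined as the one-sided posterior probability that the log ratio is at least 0. The input values are hypothetical, not the trial results, and all function names are illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def nnhm_grid(y, se, tau_scale=0.5, tau_max=3.0, n_grid=1000):
    """Posterior of the overall effect as a mixture over a grid of tau values.

    Returns a list of (weight, conditional mean, conditional sd) tuples.
    """
    v = [s * s for s in se]
    step = tau_max / n_grid
    rows = []
    for i in range(n_grid):
        tau = (i + 0.5) * step
        w = [1.0 / (vi + tau * tau) for vi in v]
        sw = sum(w)
        mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sw
        resid = sum(wi * (yi - mu_hat) ** 2 for wi, yi in zip(w, y))
        # Log marginal likelihood of tau, with the overall effect integrated out
        log_lik = 0.5 * sum(math.log(wi) for wi in w) - 0.5 * math.log(sw) - 0.5 * resid
        log_prior = -0.5 * (tau / tau_scale) ** 2  # half-normal, up to a constant
        rows.append((log_lik + log_prior, mu_hat, math.sqrt(1.0 / sw)))
    top = max(r[0] for r in rows)
    weights = [math.exp(r[0] - top) for r in rows]
    total = sum(weights)
    return [(wt / total, r[1], r[2]) for wt, r in zip(weights, rows)]

def posterior_cdf(x, mix):
    """Marginal posterior CDF of the overall effect (log scale)."""
    return sum(wt * normal_cdf((x - m) / s) for wt, m, s in mix)

def posterior_quantile(p, mix, lo=-20.0, hi=20.0):
    """Quantile of the marginal posterior found by bisection."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, mix) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def summarize(y, se, tau_scale=0.5):
    """Posterior median ratio, equal-tailed 95% interval and tail probability."""
    mix = nnhm_grid(y, se, tau_scale)
    median = posterior_quantile(0.5, mix)
    lower = posterior_quantile(0.025, mix)
    upper = posterior_quantile(0.975, mix)
    tail = 1.0 - posterior_cdf(0.0, mix)  # assumed definition: P(log ratio >= 0)
    return math.exp(median), math.exp(lower), math.exp(upper), tail

# Hypothetical inputs for six trials (NOT the data analyzed in this article)
hypothetical_log_rr = [math.log(r) for r in (0.50, 0.60, 0.75, 0.80, 0.95, 1.00)]
hypothetical_se = [0.30, 0.25, 0.10, 0.12, 0.08, 0.09]

for scale in (0.1, 0.5, 1.0):
    rr, lower, upper, tail = summarize(hypothetical_log_rr, hypothetical_se, scale)
    print(f"HN({scale}): RR = {rr:.2f} (95% CI = {lower:.2f}-{upper:.2f}), tail = {tail:.3f}")
```

Conditional on τ, the overall effect has a normal posterior centered on the inverse-variance weighted mean, so the marginal posterior is a weighted mixture of normals. Changing the prior scale shifts weight between small and large values of τ, which is why the credible interval widens as the scale increases.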

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.