Introduction

Chronic heart failure (CHF), a complex clinical syndrome, arises from myriad etiologies that precipitate pathological alterations in cardiac structure and function. These alterations compromise ventricular systolic and/or diastolic function, culminating predominantly in symptoms such as dyspnea, fatigue, and fluid retention, manifesting as pulmonary and systemic circulatory congestion and peripheral edema1. CHF is recognized as the most severe and advanced manifestation of cardiovascular diseases2. It afflicts approximately 26 million adults worldwide, with the United States alone reporting about 5.7 million cases. Projections suggest that by 2030 the number of cases in the United States will exceed 8 million3. Marked by elevated rates of hospitalization and mortality, diminished quality of life, and considerable economic impacts, CHF poses a significant public health challenge both in China and globally4.

While current pharmacological interventions form the foundation of CHF management, focusing predominantly on acute care and on maintenance therapy during stable periods, these strategies often neglect secondary prevention and the long-term prognosis of CHF patients. A wealth of research indicates that cardiac rehabilitation (CR), with its emphasis on structured exercise programs, can markedly improve exercise tolerance, enhance quality of life, alleviate depressive symptoms, and reduce the likelihood of rehospitalization, thus significantly improving clinical outcomes5,6.

CR is not a singular intervention, but a comprehensive, tiered strategy for patients with cardiovascular diseases (CVD). It is a complex intervention that involves the collaborative efforts of multidisciplinary teams, addressing physiological, psychological, and social dimensions. The goal is to improve the overall quality of life for patients, particularly after the acute phases of their conditions. The essential elements of CR include comprehensive health assessments, tailored exercise regimens, pharmacological management, dietary counseling, psychological support, smoking cessation initiatives, management of concomitant risk factors, and educational activities to enhance patient adherence to therapeutic plans7. CR is a crucial component of non-pharmacological management for CHF, as recommended by both American and European guidelines for CHF patients8, and supported by the American College of Cardiology Foundation/American Heart Association Task Force and the European Society of Cardiology9. CHF patients should initiate CR in Phase I to optimize recovery and improve long-term outcomes.

CR programs are offered in various formats, including center-based, home-based, tele-based, and combined models, all proven effective in reducing cardiovascular mortality, morbidity, and hospitalization rates10,11,12. Nevertheless, previous meta-analyses have indicated that center-based (hospital-based) CR programs generally exhibit lower effectiveness and adherence than home-based, tele-based, and combined models13,14,15. While some studies have evaluated the effectiveness of different CR delivery models and provided valuable insights in specific contexts, the relative efficacy of these models across patient populations, healthcare settings, and intervention protocols remains unclear, and the available findings are heterogeneous. Therefore, further comprehensive comparisons and multidimensional analyses are crucial to clarify the relative effectiveness of these delivery methods. To address this gap, we conducted a network meta-analysis (NMA) aimed at comprehensively assessing and ranking the effectiveness of center-based, home-based, tele-based, and combined CR delivery models, with a focus on determining the most effective exercise modalities for patients with CHF.

Methods

This study rigorously adheres to the PRISMA-NMA guidelines for reporting NMA. The comprehensive study protocol, including detailed research strategies, has been registered on the PROSPERO platform (registration number CRD42024517039). The detailed PRISMA checklist is provided as Appendix 1.

Study types

The study encompasses randomized controlled trials (RCTs) on exercise-based CR that were published in English. These trials, irrespective of blinding or allocation concealment, included complete patient information.

Types of participants

The analysis included RCTs involving adults (≥ 18 years) with CHF, comparing different forms of exercise-based CR. Patients were categorized into three heart failure subtypes: reduced ejection fraction (HFrEF), preserved ejection fraction (HFpEF), and mid-range ejection fraction (HFmrEF). Consequently, the study population comprised adults diagnosed with one of these three subtypes of heart failure.

Types of interventions

In this NMA, various types of exercise-based CR are examined, either against each other or against standard care as part of the treatment. Exercise-based CR is categorized into center-based, home-based, tele-based, and combined interventions. Center-based CR (CBCR) is defined as CR conducted in a hospital or a similarly equipped facility. Home-based CR (HBCR) refers to interventions at the patient’s home or other non-hospital settings (e.g., community centers) using traditional follow-up methods such as telephone calls or regular visits. Tele-based CR (CTR) is performed at the patient’s home or outside hospital settings and is monitored and guided by health professionals using telemedicine technologies. Combined CR (HCR) refers to a combination of either (1) center-based and home-based CR or (2) center-based and tele-based CR. Patients receiving standard care are expected to maintain daily activities without CR interventions. The exercise interventions within center-based, home-based, tele-based, and combined CR included aerobic exercise (AE), high-intensity interval training (HIIT), and a combination of aerobic and resistance training (AE + RE), which were compared in this analysis.

Types of outcome

Primary outcomes include peak oxygen uptake (Peak VO2), left ventricular ejection fraction (LVEF), left ventricular end-diastolic diameter (LVEDD), and the 6-min walk test (6MWT). Secondary outcomes include the Minnesota Living with Heart Failure Questionnaire (MLHFQ) score and rehospitalization rates.

Exclusion criteria

Our study excludes non-RCTs such as conference abstracts, reviews (including systematic reviews), editorials, observational studies, animal studies, and pediatric trials. Studies deviating from the principles of randomization or lacking robust experimental design and those comparing the effectiveness of CR across different training periods are also excluded. Finally, studies with outcome measures inappropriate for this NMA are omitted.

Search strategy

We systematically searched the following databases: PubMed, Embase, Cochrane Central Register of Controlled Trials, and Web of Science. The search covered records from the inception of each database until August 2024. The reference lists of included studies were also screened manually to supplement the search. The strategy adhered strictly to the PICOS principles, using keywords such as "heart failure," "cardiac failure," "heart decompensation," "cardiac rehabilitation," "exercises," "physical activity," "RCT," and "randomized controlled trial." These terms were combined using the Boolean operators "AND" and "OR." The search strategy is outlined in Appendix 2.

Data extraction and screening

Duplicate studies were removed using EndNote, and two researchers initially screened potentially relevant citations based on titles and abstracts. Full texts were then evaluated based on inclusion and exclusion criteria. In cases of uncertainty, another researcher assessed the citations, and consensus determined inclusion. A detailed reading of the included studies facilitated data extraction into a predefined form, covering: (1) the first author’s name, publication year, methods, and country; (2) baseline characteristics of the patient groups (e.g., age); (3) mode of CR delivery; (4) sample size; (5) outcome measures. Data extraction was independently performed by authors who read each article in full.

Bias risk and quality assessment

We defined low overall risk of bias as low risk in the domains of randomization, blinding of outcome assessors, completeness of outcome data, and selective reporting without other domains at high risk. Unclear overall risk of bias is defined as having at least one unclear risk of bias in the domains of randomization, blinding, completeness of outcome data, or selective reporting, without high risk in any other domains. A high overall risk of bias is defined as having a high risk in at least one of these domains: randomization, blinding of outcome assessors, completeness of outcome data, or selective reporting. This stringent definition of bias risk was applied to ensure the credibility and trustworthiness of our research findings.
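The overall-risk rule described above can be written as a small decision function. The Python sketch below is illustrative only: the domain keys and the function name `overall_risk` are hypothetical labels rather than part of any published tool, and it assumes that a high-risk rating in any domain makes the study high risk overall, which is one reading of the definitions above.

```python
from typing import Dict

# Hypothetical keys for the four domains that drive the overall judgement.
KEY_DOMAINS = (
    "randomization",
    "outcome_assessor_blinding",
    "incomplete_outcome_data",
    "selective_reporting",
)


def overall_risk(judgements: Dict[str, str]) -> str:
    """Return 'low', 'unclear', or 'high' from per-domain judgements.

    Each value must be 'low', 'unclear', or 'high'. Domains that are not
    in KEY_DOMAINS only affect the result when they are rated 'high'
    (assumption: such a study is treated as high risk overall).
    """
    key = [judgements[d] for d in KEY_DOMAINS]
    others = [v for d, v in judgements.items() if d not in KEY_DOMAINS]
    if "high" in key or "high" in others:
        return "high"
    if "unclear" in key:
        return "unclear"
    return "low"
```

Under this reading, a study rated low in all four key domains and unclear in another domain is still classified as low overall.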

According to the Cochrane Handbook (version 5.1.0)16, the quality of the included studies was assessed using the risk of bias tool for randomized controlled trials. Two researchers independently conducted the evaluations, with any disagreements resolved by a third reviewer. The evaluation covered key aspects such as the generation of random sequences, implementation of allocation concealment, blinding of participants, researchers, and outcome assessors, completeness of outcome data, selective reporting, and other biases. Each category was assessed as "low risk of bias," "unclear," or "high risk of bias." The assessment was graded on three levels: all entries assessed as "low risk of bias" (Grade A) indicated a lower likelihood of bias; any entries assessed as "high risk of bias" (Grade C) suggested a higher likelihood of bias; if only some criteria were met, the risk of potential bias was considered moderate (Grade B).
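The three-level grading can be sketched in a few lines. This is a minimal illustration, assuming each study's judgements are supplied as a list of the strings "low", "unclear", and "high"; the function name `quality_grade` is hypothetical.

```python
from typing import List


def quality_grade(judgements: List[str]) -> str:
    """Map per-item judgements to Grade A, B, or C.

    Any 'high' item gives Grade C, all 'low' items give Grade A,
    and every other combination gives Grade B.
    """
    if any(j == "high" for j in judgements):
        return "C"
    if all(j == "low" for j in judgements):
        return "A"
    return "B"
```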

Statistical analysis

Statistical analyses were performed in StataSE 17.0 to conduct the network meta-analyses and to generate network graphs, forest plots, funnel plots, league tables, and Surface Under the Cumulative Ranking Curve (SUCRA) values. The continuous outcome measures, namely peak oxygen uptake, left ventricular ejection fraction, left ventricular end-diastolic diameter (LVEDD), 6-min walk distance, and the Minnesota Living with Heart Failure Questionnaire score, were analyzed as continuous variables, with post-intervention mean differences and standard deviations extracted. Rehospitalization rates were analyzed as dichotomous variables.
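To make the two kinds of effect size concrete, the following Python sketch computes a mean difference for a continuous outcome and a log risk ratio for a dichotomous outcome, each with its standard error. It is an illustration using standard textbook formulas, not the Stata routines used in this study. The function names are hypothetical, and the choice of the risk ratio as the dichotomous effect measure is an assumption.

```python
import math
from typing import Tuple


def mean_difference(m1: float, sd1: float, n1: int,
                    m2: float, sd2: float, n2: int) -> Tuple[float, float]:
    """Mean difference (group 1 minus group 2) and its standard error."""
    md = m1 - m2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return md, se


def log_risk_ratio(events1: int, n1: int,
                   events2: int, n2: int) -> Tuple[float, float]:
    """Log risk ratio (group 1 versus group 2) and its standard error.

    Assumes at least one event in each group (no continuity correction).
    """
    log_rr = math.log((events1 / n1) / (events2 / n2))
    se = math.sqrt(1 / events1 - 1 / n1 + 1 / events2 - 1 / n2)
    return log_rr, se


def ci95(estimate: float, se: float) -> Tuple[float, float]:
    """Normal-approximation 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se
```

A mean difference is judged against a null value of 0, whereas a risk ratio, after exponentiating the log estimate and its interval limits, is judged against a null value of 1.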

In direct meta-analyses comparing pairs of interventions, heterogeneity was assessed using the I2 statistic. An I2 value ≤ 50% indicated no significant statistical heterogeneity, justifying a fixed-effect model. Conversely, an I2 value > 50% indicated significant heterogeneity, necessitating a random-effects model. For network meta-analysis, when the evidence network for an outcome contains a closed loop, it is imperative to test for inconsistency within that loop using node-splitting methods. This test compares direct and indirect estimates; a P-value < 0.05 suggests inconsistency between direct and indirect comparisons, warranting an inconsistency model, whereas a P-value > 0.05 supports using a consistency model.
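The heterogeneity rule can be illustrated with a short Python sketch that computes Cochran's Q and I2 from study-level effects and standard errors, then applies the 50% threshold. It uses the standard inverse-variance formulas and is not the Stata implementation. The function names are hypothetical.

```python
from typing import List, Tuple


def cochran_q_i2(effects: List[float], ses: List[float]) -> Tuple[float, float]:
    """Return Cochran's Q and I2 (as a percentage) for one comparison."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def choose_model(i2: float, threshold: float = 50.0) -> str:
    """Apply the rule in the text: fixed-effect if I2 <= 50%, else random-effects."""
    return "fixed" if i2 <= threshold else "random"
```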

The loop-specific approach evaluated inconsistency within loops, comparing direct and indirect estimates for each comparison. Node-splitting techniques were used to decompose the overall estimate at each node into its direct and indirect components. Sensitivity analyses were conducted for the primary and other available secondary outcomes: (1) including only studies with a low overall risk of bias; (2) excluding studies with a high overall risk of bias. The network meta-analysis was conducted using the network package, and SUCRA graphs were produced to determine the relative efficacy of the various types of exercise-based CR across the different outcome measures. Funnel plots were then used to detect publication bias within the included literature and to assess potential bias in the results.
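The idea of comparing direct and indirect evidence within a loop can be sketched for the simplest case, a triangle of three treatments A, B, and C. The Python example below is a simplified illustration in the spirit of an adjusted indirect comparison, not the loop-specific or node-splitting routines in Stata. The function names are hypothetical, and the estimates are assumed to be independent and approximately normal.

```python
import math
from typing import Tuple


def indirect_estimate(d_ca: float, se_ca: float,
                      d_cb: float, se_cb: float) -> Tuple[float, float]:
    """Indirect effect of B versus A through a common comparator C.

    d_ca and d_cb are the effects of A and B, each relative to C.
    """
    return d_cb - d_ca, math.sqrt(se_ca ** 2 + se_cb ** 2)


def inconsistency_test(d_direct: float, se_direct: float,
                       d_indirect: float, se_indirect: float
                       ) -> Tuple[float, float, float]:
    """Return the inconsistency factor, its z statistic, and a two-sided p-value."""
    diff = d_direct - d_indirect
    se = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return diff, z, p
```

A p-value above 0.05 from such a test would be read, as in the text, as no evidence of disagreement between the direct and indirect estimates.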

Results

Study selection

A total of 9,522 studies were identified and screened. After removing duplicates, 5,886 records were screened based on titles and abstracts, and 3,666 full-text articles were retained. A full-text review excluded 1,890 articles that did not meet the inclusion criteria, such as case reports, reviews, commentaries, animal studies, or poorly designed randomized controlled trials. An additional 328 studies were excluded for having non-conforming interventions, 468 for inadequate study design, 444 for lacking relevant outcomes, and 406 for not meeting intervention criteria. Ultimately, 33 studies were included. The detailed literature selection process is shown in Appendix 4 Fig. 4.1.

Study characteristics

Eight studies were conducted in the United States and five in China. Additional studies originated from Poland, Australia, Germany, the United Kingdom, Italy, Belgium, Canada, Switzerland, Finland, Greece, Turkey, Iran, Norway, and Japan. Two studies did not report the ages of participants. A total of 26 studies were single-center trials, and 7 were multicenter trials. The analysis included two studies on patients with HFmrEF, 29 studies on HFrEF, and two studies on HFpEF. The total sample size across all included studies was 5,900 participants, divided into 2,976 in the experimental groups and 2,924 in the control groups. All studies were dual-arm, with comparable baseline data of the patients. The basic characteristics of the included studies are presented in Appendix 3 Table 3.1.

Risk of bias of included studies

The duration of the individual CR programs varied: center-based programs averaged 3.375 months; home-based programs averaged 4.54 months; tele-based programs averaged 2.89 months; and combined programs averaged 5 months. The overall risk of bias for the included RCTs is detailed in Appendix 3 Table 3.2. Of the 33 included RCTs, 2 reported random sequence generation using random number tables17,18 and 28 reported using general randomization methods19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47; all of these were considered at low risk of bias. One study was identified as using non-randomized group assignments26,30,32,42,45,46 and was categorized as high risk due to the lack of randomization. Most interventions were non-blinded due to the practical requirements of exercise participation and the need to instruct patients, although a few studies implemented blinding26,30,32,38,44,46,47 (Appendix 4 Fig. 4.2 and 4.3). We analyzed heterogeneity at different exercise intensities (high, medium, low) and frequencies (< 3 times per week, 3–5 times per week, > 5 times per week), which ranged from low to moderate. For details, please refer to Appendix 5 Table 5.1.

Evaluation of consistency and heterogeneity

The I2 statistic did not identify high heterogeneity in any part of the network; most comparisons showed low or low-to-moderate heterogeneity (Appendix 5 Table 5.2).

Synthesis of results

Network plot

This study included a total of 33 studies. Of these, 23 studies, all two-arm trials, reported 6MWT and formed two closed loops. Nine studies, all two-arm trials, reported LVEF and formed one closed loop. Twenty studies, all two-arm trials, reported Peak VO2 and formed one closed loop. An evidence network diagram was constructed for each outcome, as detailed in Appendix 4 Fig. 4.4 (A-F). In these network diagrams, blue circles represent different interventions, with their size indicating the sample size included for each intervention. The lines connecting them represent studies that directly compared two interventions, and the thickness of these lines corresponds to the number of studies involving that comparison.

Loop inconsistency test

The 6MWT network contained two closed loops, and the LVEF and Peak VO2 networks each contained one. The inconsistency tests yielded p-values greater than 0.05 for all loops, indicating good consistency within the closed loops. Thus, a consistency model was deemed appropriate for conducting the network meta-analyses (Appendix 3 Table 3.3).

Network meta-analysis

MLHFQ

Among the 12 studies comparing five different interventions in the MLHFQ, all were dual-arm trials. The evidence network is shown in Appendix 4 Fig. 4.4-A. Among all the interventions, HCR(AE + RE) demonstrated the greatest improvement in MLHFQ scores; for its comparisons, the 95% CIs did not include 1, indicating statistically significant differences. The remaining pairwise comparisons had 95% CIs that included 1, suggesting no significant differences (see Appendix 4 Fig. 4.5-A). The MLHFQ is a negative outcome measure, meaning that a lower score reflects better quality of life; the SUCRA values reported here are oriented so that a higher SUCRA indicates a more effective intervention. The ranking of interventions from most to least effective is as follows: HCR(AE + RE) (SUCRA = 100%) > HBCR(AE) (SUCRA = 71.4%) > CBCR(HIIT) (SUCRA = 70.9%) > CBCR(AE) (SUCRA = 47.1%) > CTR(AE) (SUCRA = 44%) > HBCR(AE + RE) (SUCRA = 37.6%) > CBCR(AE + RE) (SUCRA = 16.3%) > UC (SUCRA = 12.7%), as illustrated in Appendix 4 Fig. 4.6-A.
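For readers unfamiliar with SUCRA, the following Python sketch shows how a SUCRA value is conventionally derived from one treatment's rank probabilities: it is the average of the cumulative probabilities of being ranked in the top 1, top 2, and so on, up to the second-to-last rank. This is a generic illustration and not the Stata output used here. The function name `sucra` is hypothetical, and rank 1 is assumed to mean the best treatment.

```python
from typing import List


def sucra(rank_probs: List[float]) -> float:
    """SUCRA for one treatment, returned as a proportion between 0 and 1.

    rank_probs[j] is the probability that the treatment holds rank j + 1,
    where rank 1 is the best. The probabilities should sum to 1.
    """
    n_treatments = len(rank_probs)
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities over ranks 1 .. n_treatments - 1.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (n_treatments - 1)
```

With this definition, a treatment that is certain to rank first has a SUCRA of 1 (100%) and one that is certain to rank last has a SUCRA of 0.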

6MWT

A total of 23 studies compared eleven interventions in terms of differences in the 6MWT, all of which were dual-arm trials. The evidence network is illustrated in Appendix 4 Fig. 4.4-B. Compared to usual care, HCR(AE + RE) [MD = 50.29, 95% CI (0.91, 100.09)] (SUCRA = 100%) was more effective in improving 6MWT performance; the 95% CI excluded 0, indicating a statistically significant difference. The remaining pairwise comparisons had 95% CIs that included the null value, suggesting no significant differences (see Appendix 4 Fig. 4.5-B, 4.6-B).

LVEF

Among the 26 studies, differences in LVEF were compared across five different intervention strategies, as shown in the evidence network diagram in Appendix 4 Fig. 4.4 -C, all of which were dual-arm studies. CBCR (HIIT) was the most effective intervention for improving LVEF among all treatments (SUCRA = 100%). None of the pairwise comparisons revealed statistically significant differences, as all 95%CIs included 1, as depicted in Appendix 4 Fig. 4.5-C,4.6-C.

LVEDD

Among the 3 studies, differences in LVEDD were compared across four different intervention strategies, as shown in the evidence network diagram in Appendix 4 Fig. 4.4 -D. All studies were dual-arm. For the pairwise comparisons, the 95% confidence intervals include 1, suggesting no statistically significant differences. These comparisons are depicted in Appendix 4 Fig. 4.5-D,4.6-D.

Peak VO2

Among the 20 studies, differences in Peak VO2 across five different intervention strategies were compared, as shown in the evidence network diagram in Appendix 4 Fig. 4.4 -E. All studies were dual-arm. CBCR(AE + RE) [RR = 2.46, 95% CI (1.03,5.89)] and HBCR(AE) [RR = 1.89, 95% CI (1.10,3.28)] both outperformed usual care, while CBCR(AE) demonstrated the best outcomes [RR = 3.64, 95% CI (1.66,7.95)]. The 95% confidence intervals do not include 1, indicating statistical significance. For the remaining pairwise comparisons, the 95% confidence intervals include 1, suggesting no statistically significant differences. These results are depicted in Appendix 4 Fig. 4.5-E. Peak VO2 is a positive outcome indicator, so a higher SUCRA score signifies better intervention efficacy. The ranking of interventions from most to least effective is as follows: CBCR(AE) (SUCRA = 80.6%) > CBCR(AE + RE) (SUCRA = 60.3%) > HBCR(AE) (SUCRA = 46.4%) > UC (SUCRA = 10.3%). This order is displayed in Appendix 4 Fig. 4.6-E.

Readmission rates

Among the 7 studies, differences in readmission rates across six different intervention strategies were compared, as shown in the evidence network diagram in Appendix 4 Fig. 4.4-F. All studies were dual-arm. HCR(AE + RE) was the most effective intervention for reducing readmission rates among all treatments (SUCRA = 96%). For its comparisons, the 95% confidence intervals do not include 1, indicating statistical significance. For the remaining pairwise comparisons, the 95% confidence intervals include 1, suggesting no statistically significant differences. These results are depicted in Appendix 4 Fig. 4.5-F, 4.6-F.

Publication bias assessment

A funnel plot was generated using StataSE 17.0 software, with the horizontal axis representing the effect size of each study and the vertical axis representing the standard error. The funnel plot is a commonly used tool for detecting bias among studies included in a meta-analysis. Each point on the plot represents a pairwise comparison from one of the studies. For most outcomes, the points were distributed symmetrically around the zero line, but a few fell outside the funnel (except in Fig. 4.7-D and 4.7-F), indicating the presence of some publication bias. Some points lie close to the x-axis, suggesting studies with small sample sizes (Appendix 4 Fig. 4.7).
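What it means for a point to fall outside the funnel can be sketched as follows. The Python example assumes that effects have already been centered on the pooled effect for their comparison, so the funnel is centered at zero, and that the funnel boundaries are the usual pseudo 95% limits of plus or minus 1.96 standard errors. The function name and the example values are hypothetical and do not come from the included studies.

```python
from typing import List


def outside_funnel(centered_effects: List[float], ses: List[float],
                   z: float = 1.96) -> List[bool]:
    """Flag points lying outside the pseudo 95% limits of a funnel plot.

    centered_effects are study effects minus the pooled effect of their
    comparison, so the funnel is centered on zero.
    """
    return [abs(y) > z * se for y, se in zip(centered_effects, ses)]


# Hypothetical values for illustration only.
flags = outside_funnel([0.5, 3.0, -2.5], [1.0, 1.0, 1.0])
```

Visual inspection of a funnel plot is subjective, so a count of flagged points is only a rough aid to interpretation.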

Meta-regressions

We also used meta-regression analysis to assess the potential impact of baseline factors on the primary outcomes. HFrEF, HFpEF, and HFmrEF were evaluated, with results showing no significant influence on the main outcomes (Appendix 6).

Discussion

Our study classifies exercise-based CR into different approaches based on location and exercise type. CR interventions are categorized as CBCR, HBCR, CTR, or HCR, and further divided by exercise modality: AE, AE + RE, and HIIT. This classification aims to determine the most effective treatment options for patients.

In this network meta-analysis, we evaluated 11 exercise modalities across 33 randomized controlled trials involving 5,900 participants with CHF. The outcomes assessed included MLHFQ, 6MWT, LVEF, LVEDD, Peak VO2, and readmission rates. The analysis showed that HCR(AE + RE) significantly improved MLHFQ scores, enhanced 6MWT performance, and reduced readmission rates, all supported by high-quality evidence. CBCR(HIIT) was the most effective in improving LVEF and overall cardiac function, while CBCR(AE) significantly enhanced Peak VO2. Although the efficacy of these CR delivery modes varied, all interventions demonstrated benefits, aligning with guidelines recommending exercise-based rehabilitation as a safe and effective treatment for heart failure, improving exercise capacity, cardiopulmonary function, and quality of life48.

HCR(AE + RE) emerged as a particularly effective intervention, improving MLHFQ scores and 6MWT performance and reducing readmissions. Tailored combined CR programs offer a flexible and comprehensive approach to CHF care, overcoming limitations of hospital-based CR such as space constraints, travel difficulties, staff shortages, and high costs. Participants benefit from receiving professional guidance at the hospital while maintaining the flexibility to continue CR at home, increasing both participation and completion rates. Studies suggest that participants prefer this combined approach due to its adaptability49. Furthermore, combined CR is cost-effective, accounting for only 38% of the cost of hospital-based programs50. Previous studies have shown that combined CR significantly reduces risk factors, improves physical fitness, and facilitates a return to work, making it an ideal option for patients with HFpEF or HFmrEF51,52,53.

Improving quality of life (QoL) is a key goal in CHF treatment, as it is a strong predictor of reduced hospitalization and mortality54. QoL reflects the broader impact of CHF on physical activity, psychological health, and social interactions. Using the MLHFQ, our study found that combined CR significantly reduced MLHFQ scores and readmission rates, thereby improving QoL. HCR(AE + RE) further enhanced 6MWT performance, consistent with prior findings55.

CBCR (HIIT) was the most effective intervention for improving LVEF and cardiac function. High-intensity interval training enhances skeletal muscle mitochondrial content and metabolic function, boosts cardiac output, and strengthens heart function56. CBCR, particularly in hospital settings, maximizes improvements in LVEF, enhances exercise capacity, and alleviates heart failure symptoms. Its safety and reliable supervision make it an ideal choice for patients with HFrEF.

CBCR(AE) also showed significant improvements in Peak VO2. Aerobic exercise increases exercise tolerance and improves Peak VO2 in CHF patients, a finding confirmed by our network meta-analysis. The mechanism likely involves CR improving endothelial function and myocardial blood flow, increasing oxygen delivery to the heart, and enhancing exercise capacity in CHF patients57.

Strengths and limitations

This study is the first to use NMA to assess the impact of different exercise-based CR delivery methods—center-based, home-based, tele-based, combined CR, and standard care—on cardiac function, exercise capacity, and QoL in heart failure patients. We stratified the study population into three heart failure subtypes: HFrEF, HFpEF, and HFmrEF. Additionally, we compared various exercise modalities within each CR delivery method, including aerobic exercise (AE), high-intensity interval training (HIIT), and aerobic exercise combined with resistance training (AE + RE). Our analysis included studies from 16 countries, ensuring a wide geographic representation and consistency in baseline characteristics. The inclusion criteria focused on interventions strictly following established CR delivery models, excluding control groups with additional exercise interventions beyond standard care. Studies with incomplete outcome data were excluded to improve the reliability of our findings.

This NMA provides a comprehensive review of all RCTs related to exercise-based CR delivery methods. We applied rigorous quality assessments to ensure the strength of our conclusions. However, the study has certain limitations. Reporting on readmission rates was limited, with only six RCTs providing data on this outcome. This smaller sample size restricted our ability to fully evaluate the effects of different CR delivery methods on readmission rates.

Conclusion

Our NMA identified HCR(AE + RE) as the most effective intervention for improving MLHFQ scores, increasing 6MWT performance, and reducing readmission rates. CBCR(HIIT) proved most effective in enhancing LVEF and improving cardiac function in heart failure patients, while CBCR(AE) significantly increased Peak VO2. Despite these findings, data on readmission rates and adverse cardiovascular events are limited. Future research should prioritize large-scale, multicenter, high-quality randomized controlled trials to build a stronger evidence base for exercise-based CR delivery methods.