Introduction

The prevalence of heart failure (HF) with preserved ejection fraction (HFpEF) is increasing1,2,3,4. HFpEF accounts for approximately 50% of all patients with HF, with > 4% of the general population aged ≥ 60 years identified as having HFpEF3,5,6. However, diagnosing HFpEF remains challenging. There is no consensus on how to define HFpEF, with wide variation in diagnostic criteria across various society guidelines and clinical trials5,7,8,9,10. This lack of uniformity in disease definition may stem partly from an incomplete understanding of the pathobiology and phenotypic heterogeneity of HFpEF11. Invasive or exercise hemodynamic assessments have emerged as the gold standard for identifying HFpEF. However, their high cost, procedural risk, and technical complexity make them impractical for routine evaluation and preclude their use in large clinical trials7,12,13.

The HFA-PEFF and H2FPEF scores are two recently proposed, score-based algorithms designed to aid in the diagnosis of HFpEF14,15. The H2FPEF score was derived from patients with unexplained dyspnea who were referred for invasive hemodynamic exercise testing and includes four clinical and two echocardiographic items14. More recently, the Heart Failure Association of the European Society of Cardiology (ESC) published a separate diagnostic algorithm, the HFA-PEFF score, which contains three domains: functional, morphological, and biomarker15. Additionally, the 2021 ESC HF guideline diagnoses HFpEF by incorporating symptoms and signs of HF, echocardiographic parameters, and natriuretic peptides (NP); this approach does not represent a new diagnostic score but rather a simplified approach8.

Several recent studies have evaluated the diagnostic performance of the HFA-PEFF and H2FPEF scores in previous HFpEF trials and registries, demonstrating the generalizability and validity of these scores16,17,18,19,20,21,22. However, some studies found limited sensitivities of these two scores for the diagnosis of HFpEF17,21 and significant discrepancies in classifying patients with suspected HFpEF23. Additionally, discordance between the ESC HF guideline algorithm and HFA-PEFF and H2FPEF scores in diagnosing HFpEF has been reported24. Therefore, we aimed to evaluate HFpEF prevalence using the three different algorithms in patients with unexplained dyspnea and to compare their concordance in diagnosing HFpEF (Fig. 1).

Fig. 1

Patient enrollment algorithm. According to the 2021 ESC HF guideline, patients who had symptoms of HF and LVEF ≥ 50%, along with both elevated NT-proBNP and echocardiographic markers of LV diastolic dysfunction or raised LV filling pressure, were classified as having confirmed HFpEF. Patients meeting one of these criteria were classified as having suspected HFpEF, while those meeting neither were classified as having no HFpEF. Based on the HFA-PEFF or H2FPEF scores, patients with low, intermediate, or high scores were classified as having no HFpEF, suspected HFpEF, or confirmed HFpEF, respectively. *Symptoms of HF: dyspnea on exertion, paroxysmal nocturnal dyspnea, or orthopnea. Significant valvular heart disease: greater than mild stenosis, greater than moderate regurgitation, or valve surgery. Primary cardiomyopathy: hypertrophic cardiomyopathy, restrictive cardiomyopathy, or arrhythmogenic right ventricular cardiomyopathy. CPET cardiopulmonary exercise testing, ESC European Society of Cardiology, HF heart failure, HFpEF heart failure with preserved ejection fraction, LV left ventricular, LVEF left ventricular ejection fraction, NT-proBNP N-terminal pro-B type natriuretic peptide.

Results

Baseline characteristics according to three HFpEF diagnostic algorithms

Among the 992 patients with unexplained dyspnea suspected of having HFpEF, the mean age was 60.9 years, and 63.7% were female (Table 1). The overall prevalence rates of hypertension, diabetes mellitus, and atrial fibrillation (AF) were 72.6%, 21.6%, and 6.6%, respectively. The mean body mass index was 24.6 kg/m2 and mean systolic blood pressure was 122.6 mmHg. The median N-terminal pro-B-type natriuretic peptide (NT-proBNP) level was 71 pg/mL and mean number of antihypertensive medications was 1.3.

Table 1 Baseline characteristics of patients according to three HFpEF diagnostic algorithms.

According to the 2021 ESC HF guideline, 724 (73.0%) patients had an echocardiographic marker of left ventricular (LV) diastolic dysfunction or raised LV filling pressure, and 307 (30.9%) had elevated NT-proBNP levels (Fig. 2A). Thus, 283 (28.5%), 465 (46.9%), and 244 (24.6%) patients were classified as having confirmed HFpEF, suspected HFpEF, and no HFpEF, respectively, according to the 2021 ESC HF guideline (Fig. 2B). According to the HFA-PEFF and H2FPEF scores, 209 (21.1%), 565 (57.0%), and 218 (22.0%) patients and 47 (4.7%), 574 (57.9%), and 371 (37.4%) patients were classified as having confirmed HFpEF, suspected HFpEF, and no HFpEF, respectively. The baseline characteristics of patients with confirmed HFpEF using the three different diagnostic algorithms are shown in Table 1. Hypertension and AF prevalence, number of antihypertensive medications, NT-proBNP, left atrial volume index (LAVI), e’ velocity, E/e’, peak tricuspid regurgitation (TR) velocity, and pulmonary artery systolic pressure (PASP) were significantly different among the three confirmed HFpEF groups according to each diagnostic algorithm, with the highest values observed in those classified as confirmed HFpEF according to the H2FPEF score.

Fig. 2

Prevalence of HFpEF according to the 2021 ESC HF guideline, HFA-PEFF score, and H2FPEF score in patients with unexplained dyspnea. (A) Prevalence of elevated NT-proBNP and echocardiographic marker of LV diastolic dysfunction/raised LV filling pressure according to 2021 ESC HF guideline, (B) prevalence of HFpEF according to three categories. AF atrial fibrillation, ESC European Society of Cardiology, HF heart failure, HFpEF heart failure with preserved ejection fraction, LV left ventricular, SR sinus rhythm.

Reclassification of HFpEF according to 2021 ESC HF guideline, HFA-PEFF, and H2FPEF scores

The reclassification of patients into likelihood categories (low, intermediate, and high) from the HFA-PEFF score to the H2FPEF score, or vice versa, is presented in Fig. 3 and Table 2. Among the 209 patients with confirmed HFpEF according to the HFA-PEFF score, only 35 (16.7%) were reclassified as having confirmed HFpEF according to the H2FPEF score, whereas 158 (75.6%) and 16 (7.7%) were reclassified as having suspected and no HFpEF, respectively. Conversely, among the 47 patients with confirmed HFpEF according to the H2FPEF score, 35 (74.5%) and 12 (25.5%) were recategorized as having confirmed and suspected HFpEF according to the HFA-PEFF score, respectively. Among the 565 patients with suspected HFpEF based on the HFA-PEFF score, 175 (31.0%) were reclassified as having no HFpEF based on the H2FPEF score. Among the 371 patients with no HFpEF according to the H2FPEF score, 191 (51.5%) were categorized as having confirmed or suspected HFpEF according to the HFA-PEFF score. In summary, 399 (40.2%) patients were classified into different likelihood categories, depending on the score used.
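The counts quoted in this paragraph, together with the marginal totals reported for each score, are sufficient to rebuild the full 3 × 3 cross-tabulation of the two scores. The Python sketch below does so and recomputes the overall reclassification rate. The four cells not stated explicitly in the text (180, 38, 0, and 378) are derived here from the reported row and column totals rather than copied from Table 2, and all variable names are illustrative.

```python
# Rows: HFA-PEFF category; columns: H2FPEF category.
# Order in both dimensions: no HFpEF, suspected HFpEF, confirmed HFpEF.
table = [
    [180, 38, 0],    # HFA-PEFF no HFpEF (n = 218); cells derived from marginals
    [175, 378, 12],  # HFA-PEFF suspected (n = 565); 378 derived from marginals
    [16, 158, 35],   # HFA-PEFF confirmed (n = 209); all three quoted in the text
]

total = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]                         # HFA-PEFF marginals
col_totals = [sum(row[j] for row in table) for j in range(3)]    # H2FPEF marginals

# Patients on the diagonal fall into the same category under both scores.
concordant = sum(table[i][i] for i in range(3))
reclassified = total - concordant
rate = 100 * reclassified / total

print("HFA-PEFF totals:", row_totals)
print("H2FPEF totals:", col_totals)
print(f"{reclassified} of {total} reclassified ({rate:.1f}%)")
```

The recomputed marginals (218, 565, and 209 for HFA-PEFF; 371, 574, and 47 for H2FPEF) and the 399 discordant patients (40.2%) match the figures reported in the text.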

Fig. 3

Reclassification of patients into likelihood categories (low, intermediate, and high) from the HFA-PEFF to the H2FPEF score (A) and vice versa (B). White, low likelihood (no HFpEF); light gray, intermediate likelihood (suspected HFpEF); dark gray, high likelihood of HFpEF (confirmed HFpEF), estimated by each score. HFpEF heart failure with preserved ejection fraction.

Table 2 Reclassification table of HFpEF prevalence using 2021 ESC HF guideline and the HFA-PEFF and H2FPEF scores.

Table 2 shows the reclassification and concordance of the HFpEF prevalence using the three diagnostic algorithms. The concordance between the HFA-PEFF and H2FPEF scores, when comparing the three HFpEF categories, was moderate (Cohen’s weighted kappa = 0.47). The concordance between the 2021 ESC HF guideline and HFA-PEFF scores was good (Cohen’s weighted kappa = 0.77), while the concordance between the 2021 ESC HF guideline and H2FPEF scores was moderate (Cohen’s weighted kappa = 0.45).

The distribution of H2FPEF scores according to different body mass index (BMI) cut-off values is shown in Fig. S1. Among the 992 patients, 72 (7.3%), 234 (23.6%), and 420 (42.3%) had a BMI of ≥ 30, ≥ 27, or ≥ 25 kg/m2, respectively. With lower BMI cut-offs, the prevalence of confirmed HFpEF and suspected HFpEF according to the H2FPEF score showed a slightly increasing trend, while the prevalence of no HFpEF showed a decreasing trend. The concordance between the H2FPEF and HFA-PEFF scores and between the H2FPEF score and the 2021 ESC HF guideline at BMI cut-offs of 27 and 25 kg/m2 was moderate (Cohen’s weighted kappa = 0.49 and 0.47 for 27 kg/m2 and 0.48 and 0.44 for 25 kg/m2, respectively), indicating that concordance did not significantly change with different BMI cut-offs.

Concordance of three HFpEF diagnostic algorithms

The prevalence and concordance of confirmed HFpEF in patients with unexplained dyspnea, according to the three HFpEF diagnostic algorithms, are shown in Fig. 4A. Among the 992 patients with unexplained dyspnea, the prevalence of confirmed HFpEF was 28.5% (n = 283) based on the 2021 ESC HF guideline, 21.1% (n = 209) based on the HFA-PEFF score, and 4.7% (n = 47) based on the H2FPEF score. The concordance of confirmed HFpEF was highest between the 2021 ESC HF guideline and HFA-PEFF score, reflected by a Cohen’s unweighted kappa coefficient of 0.80 (good) and a reclassification rate of 7.3%. In contrast, the concordance between the 2021 ESC HF guideline and H2FPEF score, and between the HFA-PEFF score and H2FPEF score, was poor and fair, with Cohen’s unweighted kappa coefficients of 0.15 and 0.21, respectively. The overlap of participants diagnosed with confirmed HFpEF according to all three algorithms was limited to 3.5%.

Fig. 4

Prevalence and concordance of HFpEF according to the three HFpEF algorithms in patients with unexplained dyspnea. (A) Of the 992 patients with unexplained dyspnea, the prevalence of confirmed HFpEF ranged from 28.5% (n = 283, 2021 ESC HF guideline) to 21.1% (n = 209, HFA-PEFF) and 4.7% (n = 47, H2FPEF). (B) The prevalence of combined confirmed HFpEF and suspected HFpEF ranged from 78.0% (n = 774, HFA-PEFF) to 75.4% (n = 749, 2021 ESC HF guideline) and 62.6% (n = 621, H2FPEF). ESC European Society of Cardiology, HF heart failure, HFpEF heart failure with preserved ejection fraction.

The prevalence of combined confirmed and suspected HFpEF was 78.0% (n = 774) for the HFA-PEFF score, 75.4% (n = 749) for the 2021 ESC HF guideline, and 62.6% (n = 621) for the H2FPEF score (Fig. 4B). The reclassification of HFpEF after merging the confirmed and suspected HFpEF categories into a single “possible HFpEF” category is shown in Table S1. The concordance between the 2021 ESC HF guideline and HFA-PEFF score, the 2021 ESC HF guideline and H2FPEF score, and the HFA-PEFF and H2FPEF scores was moderate, with Cohen’s unweighted kappa values of 0.59, 0.51, and 0.46, respectively.

Discussion

This study applied and compared three current algorithms for diagnosing HFpEF in patients with unexplained dyspnea. First, we found that the prevalence of HFpEF varies significantly depending on the diagnostic algorithm applied, with the prevalence of confirmed HFpEF being 28.5% with the 2021 ESC HF guideline, 21.1% with the HFA-PEFF score, and 4.7% with the H2FPEF score. While 40.2% of the patients were classified into different likelihood categories when comparing the HFA-PEFF and H2FPEF scores, only 3.5% of patients had confirmed HFpEF consistently across all three algorithms. Second, each algorithm identified different HFpEF subgroups, characterized by differing comorbidities and echocardiographic parameters related to diastolic dysfunction.

Notably, recent studies demonstrated that sodium-glucose cotransporter 2 inhibitors reduce cardiovascular events in patients with HFpEF25,26. Therefore, early and accurate diagnosis of HFpEF may help reduce its medical and economic burden by allowing the timely initiation of proper medications. However, diagnosing HFpEF remains challenging. Some patients are overdiagnosed with HFpEF, as various conditions share similar symptoms, particularly in patients with many comorbidities11. In other cases, HFpEF may be missed27. Moreover, no clear consensus exists on how HFpEF should be defined, and invasive or exercise-based hemodynamic assessments are difficult to use as routine evaluations in real clinical practice7,11. Thus, score-based algorithms, such as the HFA-PEFF and H2FPEF scores, have been proposed to aid in HFpEF diagnosis. These tools represent a shift from the classical binary diagnostic approach to one based on the likelihood estimation of HFpEF.

However, significant discrepancies have been reported in the classification of patients with suspected HFpEF when using the two novel scores and the 2021 ESC HF guideline23,24. In our study, the prevalence and characteristics of confirmed HFpEF were significantly different across the three diagnostic algorithms, with prevalence ranging from 4.7% to 28.5%. Although the overall proportion of confirmed HFpEF among patients with unexplained dyspnea was similar to that in a previous study, there were slight differences regarding which scores had the highest prevalence of HFpEF24. Additionally, 40.2% of patients were placed in different likelihood categories of HFpEF depending on whether the HFA-PEFF or H2FPEF score was used. Moreover, the overlap of confirmed HFpEF cases across the three algorithms was minimal, at 3.5%. This highlights how different diagnostic scores can result in different patient groups being classified as having HFpEF, being excluded from HFpEF, or being referred for further testing. This variability limits the clinical applicability of these diagnostic algorithms and reflects the continuing diagnostic uncertainty surrounding HFpEF.

The composition and elements of each diagnostic algorithm differ significantly. The H2FPEF score includes four clinical comorbidities and two echocardiographic items, whereas the HFA-PEFF score and 2021 ESC HF guideline focus more on echocardiographic structural and functional parameters, along with NP levels. Thus, AF, hypertension, and the number of antihypertensive medications emerged as the main factors driving the discrepancy between the algorithms, as they were key items in the H2FPEF score. In contrast, AF tends to raise the thresholds for HFpEF in the HFA-PEFF score and 2021 ESC HF guideline. In addition, patients with confirmed HFpEF according to the H2FPEF score had the highest NP levels and more advanced echocardiographic markers of diastolic dysfunction, presumably because the H2FPEF score applied the most stringent criteria. Finally, the HFA-PEFF score and 2021 ESC HF guideline showed relatively high concordance in our analysis, possibly due to an overlap in the items containing echocardiographic parameters.

Currently, the two newer diagnostic scores for HFpEF are not widely adopted in real-world clinical practice due to the large number of variables they involve, and they are not the primary recommendation in the 2021 ESC HF guideline due to their complexity8. Additionally, both scores have the limitation of classifying a large proportion of patients as intermediate likelihood, requiring further diastolic stress testing. Aizpurua et al. reported that 36% of patients with known HFpEF fell into the intermediate likelihood category according to the HFA-PEFF score, implying that over 50% of patients with unexplained dyspnea would have to undergo further testing before HFpEF could be diagnosed or excluded18. To date, HFpEF diagnosis is challenging and fraught with obstacles and inconsistencies. Future studies will help improve these diagnostic algorithms to increase their accuracy and facilitate their use in real-world clinical practice.

Although patients with HFpEF in Western countries are predominantly obese or overweight, obesity is less common in Asian patients28,29. In particular, the DELIVER study, which evaluated the effect of dapagliflozin in HF with mildly reduced or preserved ejection fraction, reported that Asian and non-Asian patients had mean BMIs of 25.1 and 31.0 kg/m2, respectively30. In our study, the mean BMI of patients with confirmed HFpEF according to the 2021 ESC HF guideline, HFA-PEFF score, or H2FPEF score was approximately 25 kg/m2, which is consistent with previous data. Although the H2FPEF score uses a BMI cut-off of 30 kg/m2 14, the cohort from which this score was developed primarily included patients from North America. Although previous studies have evaluated the diagnostic performance and validity of the HFA-PEFF and H2FPEF scores in Asian populations20,21,22, the H2FPEF score may not accurately reflect the characteristics of Asian HFpEF patients with relatively low BMI. Here, the use of different BMI cut-offs slightly changed HFpEF prevalence but did not significantly alter the concordance between HFpEF diagnostic algorithms. Further studies on HFpEF and BMI cut-off values in Asian populations are warranted.

Limitations

This study had some limitations. First, as this study was a retrospective analysis of a single-center cohort composed almost entirely of Korean individuals, the generalizability of our findings is limited. Additionally, the patients in this study were limited to those who underwent echocardiography, cardiopulmonary exercise testing (CPET), and NP measurements, which limits the interpretation and significance of our findings. Nevertheless, by including patients who underwent CPET, we believe our analysis more accurately represented patients with suspected HFpEF. Second, we lacked some echocardiographic data, including LV global longitudinal strain (GLS) and lateral e’, which are components of the HFA-PEFF score. Therefore, we could not use these variables in our calculations, possibly resulting in a lower HFpEF prevalence. However, in an HFA-PEFF validation study, the inclusion of GLS as a criterion resulted in only minimal reclassification of individuals18. Third, 79 patients (8.0%) had missing data for TR velocity and PASP, and these patients were assumed to have normal values for both. This assumption may have resulted in an underestimation of HFpEF prevalence. Fourth, we were unable to present data on confirmatory tests for HFpEF, such as diastolic stress echocardiography or invasive catheterization, as these procedures were rarely performed at our institution. Consequently, we could not definitively confirm whether the patients truly had HFpEF. Instead, we estimated the likelihood using diagnostic scores and categorized the patients into three groups. Therefore, the diagnostic accuracy of the three algorithms in identifying true HFpEF could not be assessed, and some cases may have been misclassified. Fifth, in our analysis, patients were classified into confirmed HFpEF, suspected HFpEF, and no HFpEF according to the 2021 ESC HF guideline. This classification does not strictly align with the 2021 ESC HF guideline, which states that an elevated NP is not mandatory for the diagnosis of HFpEF. However, for simplicity in analysis, patients with both an elevated NP and an echocardiographic marker of LV diastolic dysfunction or raised LV filling pressure were classified as having confirmed HFpEF in this study. Sixth, NP levels can be influenced by various comorbidities (e.g., anemia and impaired renal function), medications, and the timing of measurement in relation to treatment. As this study was not prospectively designed, NP could not be measured in a standardized manner. However, we included patients who had NT-proBNP, transthoracic echocardiography, and CPET data available within a 3-month interval. In addition, we were unable to comprehensively assess or categorize the nature and severity of patients’ HF symptoms. Therefore, we could not analyze potential differences in HF symptom profiles across the various HFpEF classification criteria, which represents another limitation of the study.

Despite these limitations, this study has certain strengths, including the large number of participants suspected of having HFpEF with available data on echocardiography and NP levels. Therefore, our study provides important insights into the concordance of three diagnostic algorithms for HFpEF in patients with unexplained dyspnea.

Conclusions

Our study provides new data on the application of the 2021 ESC HF guideline, the HFA-PEFF score, and the H2FPEF score for diagnosing HFpEF. The prevalence of HFpEF in patients with unexplained dyspnea varied significantly depending on the algorithm applied. The differences in HFpEF definitions across the guideline and the scoring systems have led to considerable differences in the stratification of HFpEF.

Methods

Study population

We retrospectively analyzed 2005 patients with one or more symptoms of HF who underwent echocardiography, CPET, and NT-proBNP measurement between January 2012 and April 2021 at a single tertiary center (Fig. 1). Symptoms of HF included dyspnea upon exertion, paroxysmal nocturnal dyspnea, or orthopnea. We excluded patients with a left ventricular ejection fraction (LVEF) < 50% or an apparent cause of HF symptoms other than HFpEF, such as significant valvular heart disease, primary cardiomyopathy, pericardial disease, pulmonary artery hypertension, pulmonary thromboembolism, previous myocardial infarction, congenital heart disease, high-output HF, bradyarrhythmias requiring a pacemaker, or evident pulmonary disease31. Significant valvular heart disease was defined as valvular stenosis greater than mild or valvular regurgitation greater than moderate. Finally, 992 participants with unexplained dyspnea suspected of having HFpEF were included in this study.

This study was approved by the Institutional Review Board of the Yonsei University Health System (4-2021-0605) and was conducted in accordance with the tenets of the Declaration of Helsinki. The requirement for informed consent was waived by the Institutional Review Board of the Yonsei University Health System because of the retrospective nature of the study, and personal identifying information was removed after cohort creation.

2021 ESC HF guideline and HFA-PEFF and H2FPEF scores

The HFA-PEFF score contains minor and major criteria across three domains: functional (septal or lateral e′ velocity, E/e′ ratio, TR velocity or PASP, and LV GLS), morphological (rhythm-specific LAVI, sex-specific measures of LV mass index (LVMI), LV wall thickness, and relative wall thickness), and biomarker (rhythm-specific NP) domains15. Within each domain, a major criterion scores 2 points and a minor criterion scores 1 point. Each domain contributes a maximum of 2 points: 2 points if any major criterion is met, or 1 point if no major criterion but any minor criterion is met. Due to a lack of echocardiographic data on lateral e′ and GLS, these parameters were not included in the calculation of the HFA-PEFF score for the current study.
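The per-domain scoring rule described above can be sketched in Python as follows. This is a minimal illustration rather than the study's implementation: the function names are hypothetical, and the sketch takes as input whether any major or minor criterion was met in each domain, because the individual criterion thresholds are defined in the HFA-PEFF consensus document15 and are not reproduced here.

```python
def domain_points(major_met: bool, minor_met: bool) -> int:
    """Points for one HFA-PEFF domain.

    2 if any major criterion is met, otherwise 1 if any minor criterion
    is met, otherwise 0. A domain never contributes more than 2 points.
    """
    if major_met:
        return 2
    return 1 if minor_met else 0


def hfa_peff_total(functional, morphological, biomarker) -> int:
    """Sum the three domains; each argument is a (major_met, minor_met) pair."""
    return sum(
        domain_points(major, minor)
        for major, minor in (functional, morphological, biomarker)
    )


# Hypothetical patient: a major functional criterion, a minor morphological
# criterion, and no biomarker criterion.
example_total = hfa_peff_total((True, False), (False, True), (False, False))
print(example_total)
```

Because each domain is capped at 2 points, the total ranges from 0 to 6.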

The H2FPEF score is the sum of four clinical variables (BMI > 30 kg/m2 [2 points], treatment with ≥ 2 antihypertensive medicines [1 point], AF [3 points], and age > 60 years [1 point]) and two echocardiographic items (E/e′ ratio > 9 [1 point] and PASP > 35 mmHg [1 point])14.
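As an illustration, the point assignments listed above translate directly into a short Python function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the optional `bmi_cutoff` argument is added only to mirror the supplementary analyses with lower BMI cut-offs, and the strict inequality for BMI follows the main-analysis definition given here.

```python
def h2fpef_score(bmi, n_antihypertensives, atrial_fibrillation, age,
                 e_over_e_prime, pasp, bmi_cutoff=30.0):
    """H2FPEF score (0 to 9) from the six items listed in the text.

    bmi in kg/m2, age in years, pasp in mmHg; bmi_cutoff is an
    illustrative parameter (30 kg/m2 in the main analysis).
    """
    score = 0
    score += 2 if bmi > bmi_cutoff else 0           # obesity
    score += 1 if n_antihypertensives >= 2 else 0   # >= 2 antihypertensive medicines
    score += 3 if atrial_fibrillation else 0        # atrial fibrillation
    score += 1 if age > 60 else 0                   # age > 60 years
    score += 1 if e_over_e_prime > 9 else 0         # E/e' ratio > 9
    score += 1 if pasp > 35 else 0                  # PASP > 35 mmHg
    return score


# Two hypothetical patients (values chosen for illustration only).
all_items_met = h2fpef_score(31.0, 2, True, 65, 10.0, 40.0)
age_only = h2fpef_score(24.6, 1, False, 61, 8.0, 30.0)
print(all_items_met, age_only)
```

Passing `bmi_cutoff=25.0` for the second hypothetical patient leaves the score unchanged, because a BMI of 24.6 kg/m2 is still below that cut-off.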

The HFA-PEFF and H2FPEF scores estimate the likelihood of HFpEF and categorize patients suspected of having HFpEF into low, intermediate, or high likelihood of HFpEF14,15,31. Low scores (HFA-PEFF or H2FPEF ≤ 1) are considered to exclude HFpEF. Intermediate scores (H2FPEF score of 2–5 or HFA-PEFF score of 2–4) indicate an intermediate likelihood of HFpEF, in which case further diastolic stress testing is needed to confirm HFpEF. High scores (H2FPEF ≥ 6 or HFA-PEFF ≥ 5) are considered highly suggestive or diagnostic of HFpEF. Here, patients with high, intermediate, or low scores according to the HFA-PEFF or H2FPEF scores were classified as having confirmed HFpEF, suspected HFpEF, or no HFpEF, respectively, for ease of analysis.
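The mapping from score to category can be written compactly. The Python sketch below encodes the cut-offs stated above; the function name and the string labels used for the two scales are illustrative choices, not part of either scoring system.

```python
# Lowest score counted as "high likelihood" for each scale, as stated in the text.
HIGH_CUTOFF = {"H2FPEF": 6, "HFA-PEFF": 5}


def likelihood_category(score: int, scale: str) -> str:
    """Map an H2FPEF or HFA-PEFF score to the three study categories."""
    if score <= 1:
        return "no HFpEF"            # low likelihood
    if score >= HIGH_CUTOFF[scale]:
        return "confirmed HFpEF"     # high likelihood
    return "suspected HFpEF"         # intermediate likelihood


# The same score of 5 falls into different categories on the two scales.
print(likelihood_category(5, "H2FPEF"), "|", likelihood_category(5, "HFA-PEFF"))
```

A score of 5 is intermediate on the H2FPEF scale but high on the HFA-PEFF scale, which is one simple way in which the two scores can place the same patient in different categories.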

According to the 2021 ESC HF guideline8, the diagnosis of HFpEF requires: (1) symptoms and signs of HF, (2) an LVEF ≥ 50%, and (3) objective evidence of cardiac structural or functional abnormalities consistent with the presence of LV diastolic dysfunction or raised LV filling pressures, including raised NPs. For simplicity in analysis, patients with symptoms of HF and LVEF ≥ 50%, along with both elevated NT-proBNP levels and echocardiographic markers of diastolic dysfunction, were classified as having confirmed HFpEF in this study. The NT-proBNP level and echocardiographic parameter thresholds specified in the guideline were applied8. Patients meeting only one of these criteria (NT-proBNP levels or echocardiographic parameters) were classified as having suspected HFpEF, while those meeting neither were classified as not having HFpEF.
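The simplified three-level rule used in this study for the 2021 ESC HF guideline can be sketched as follows. The sketch assumes that the patient already has symptoms of HF and an LVEF ≥ 50% (the entry criteria of the cohort) and that the two boolean inputs have been determined with the guideline's own thresholds, which are not reproduced here; the function name is hypothetical.

```python
def esc_2021_category(elevated_nt_probnp: bool, echo_marker: bool) -> str:
    """Study-specific categorization under the 2021 ESC HF guideline.

    Both criteria met -> confirmed HFpEF; exactly one -> suspected HFpEF;
    neither -> no HFpEF. Assumes HF symptoms and LVEF >= 50% are present.
    """
    criteria_met = int(elevated_nt_probnp) + int(echo_marker)
    return ("no HFpEF", "suspected HFpEF", "confirmed HFpEF")[criteria_met]


print(esc_2021_category(True, True))
print(esc_2021_category(False, True))
print(esc_2021_category(False, False))
```

Note that this rule is the analytic simplification described above and is stricter than the guideline itself for the confirmed category.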

Echocardiographic parameters

All echocardiographic studies were performed using a standard ultrasound machine with a 2.5-MHz probe. Standard images were obtained in the parasternal and apical views using two-dimensional, Doppler, and M-mode imaging. Echocardiographic parameters were measured according to the recommendations of the American Society of Echocardiography and the European Association of Cardiovascular Imaging32,33.

LVEF was measured using the modified Quinones method or calculated using the modified Simpson’s method, employing apical four- and two-chamber views in patients with regional wall motion abnormalities34,35. LVMI was calculated based on left ventricular diameter and body surface area. The LAVI was measured using the prolate ellipsoid method or biplane-modified Simpson method and adjusted for body surface area. Mitral inflow velocities were recorded using pulsed-wave Doppler from the apical window, with a 1- to 3-mm sample volume placed between the tips of the mitral leaflets during diastole. The following variables were obtained: peak velocity of early (E) and late (A) diastolic filling. Mitral annular velocity was measured by tissue Doppler imaging using the pulsed-wave Doppler mode. Early diastolic (e’) and systolic (S’) velocities of the mitral annulus were measured from the apical four-chamber view, with a 2- to 5-mm sample volume placed at the septal corner of the mitral annulus. The ratio of the early diastolic mitral inflow velocity to the early diastolic mitral annular velocity (E/e’) was calculated. TR jet velocity was also obtained, if measurable, using continuous-wave Doppler to estimate PASP. PASP was calculated using the modified Bernoulli equation as 4 × (peak TR velocity)² plus the estimated right atrial pressure. Right atrial pressure was estimated by measuring the diameter of the inferior vena cava and its response to inspiration. Among the 992 patients, 79 (8.0%) had missing data for TR velocity and PASP and were presumed to have PASP within the normal range.
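The PASP calculation described above can be expressed as a one-line function. In this sketch, the peak TR velocity is assumed to be in m/s, so that the Bernoulli term is in mmHg, and the right atrial pressure used in the example is a hypothetical value, since the text does not list the pressure assigned to each inferior vena cava finding.

```python
def pasp_mmhg(peak_tr_velocity_m_s: float, right_atrial_pressure_mmhg: float) -> float:
    """PASP = 4 x (peak TR velocity)^2 + estimated right atrial pressure."""
    return 4.0 * peak_tr_velocity_m_s ** 2 + right_atrial_pressure_mmhg


# Hypothetical example: peak TR velocity of 3.0 m/s and an estimated
# right atrial pressure of 5 mmHg (illustrative value only).
example_pasp = pasp_mmhg(3.0, 5.0)
print(example_pasp)
```

For these illustrative inputs the estimate is 41 mmHg, which would exceed the PASP > 35 mmHg threshold used as an H2FPEF item.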

Data collection

We retrospectively collected data on participants’ baseline characteristics, including demographic data, medical history, current medications, laboratory data, and echocardiographic parameters through a review of medical records. Hypertension was defined as a blood pressure of ≥ 140/90 mmHg or use of antihypertensive medication. The estimated glomerular filtration rate (eGFR) was calculated using the Chronic Kidney Disease Epidemiology Collaboration equation36. Chronic kidney disease was defined as an eGFR < 60 mL/min/1.73 m2. Serum NT-proBNP was measured by an electrochemiluminescence immunoassay using a Cobas e601 analyzer (Roche, Basel, Switzerland).

The World Health Organization defines overweight as a BMI ≥ 25 kg/m2 and obesity as a BMI ≥ 30 kg/m2 37. However, Asian populations, including Koreans, have an increased risk of diabetes and cardiovascular disease even at lower BMI levels; therefore, different criteria for obesity are used for Asians, with a BMI of ≥ 25 kg/m2 being classified as obesity38,39. When calculating the H2FPEF score, we used a BMI criterion of > 30 kg/m2 for the main analysis, consistent with a previous study14, and also performed supplementary analyses using BMI cut-offs of 27 and 25 kg/m2. We evaluated the change in the H2FPEF score and concordance between HFpEF diagnostic algorithms according to these BMI cut-off changes.

Statistical analyses

Categorical variables are reported as frequencies (percentages) and were compared using Pearson’s chi-squared or Fisher’s exact tests. Continuous variables were tested for normality using the Shapiro–Wilk normality test. Continuous variables with normal distribution are expressed as means ± standard deviation and were compared using Student’s t-test. Non-normally distributed continuous variables are presented as medians with interquartile ranges and were compared using the Mann–Whitney U test. One-way analysis of variance or the Kruskal-Wallis test was used to compare continuous variables across the three groups.

We used Cohen’s kappa coefficients to assess concordance among the three classifications40. Concordance was defined as poor (0–0.20), fair (0.21–0.40), moderate (0.41–0.60), good (0.61–0.80), and optimal (0.81–1.00). When patients were categorized into three groups (confirmed HFpEF, suspected HFpEF, and no HFpEF), Cohen’s weighted kappa statistic was used. For the categorization of patients into two groups (HFpEF and non-HFpEF), Cohen’s unweighted kappa statistic was used. All tests were two-tailed, and a P value < 0.05 was considered statistically significant.
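To make the concordance measures concrete, the Python sketch below computes weighted and unweighted Cohen's kappa from a square agreement table and applies the concordance bands defined above. Several points are assumptions rather than statements of the study's method: the weighting scheme is not specified in the text, so quadratic weights are used here for illustration; the 3 × 3 table for the HFA-PEFF versus H2FPEF comparison is reconstructed from the counts and marginal totals reported in the Results rather than copied from Table 2; and all function names are hypothetical.

```python
def cohen_kappa(table, weights="unweighted"):
    """Cohen's kappa for a square agreement table of counts.

    weights: "unweighted", "linear", or "quadratic" disagreement weights.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "unweighted":
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * table[i][j] for i, j in cells) / n
    expected = sum(disagreement(i, j) * row_totals[i] * col_totals[j]
                   for i, j in cells) / n ** 2
    return 1.0 - observed / expected


def collapse(table, split):
    """Merge ordered categories into two groups: index < split vs. index >= split."""
    out = [[0, 0], [0, 0]]
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            out[int(i >= split)][int(j >= split)] += count
    return out


def concordance_label(kappa):
    """Apply the concordance bands from the text to kappa rounded to two decimals."""
    value = round(kappa, 2)
    for upper, label in ((0.20, "poor"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "good")):
        if value <= upper:
            return label
    return "optimal"


# Rows: HFA-PEFF; columns: H2FPEF; order: no, suspected, confirmed HFpEF.
# Reconstructed from the counts and marginal totals reported in the Results.
hfa_vs_h2 = [[180, 38, 0], [175, 378, 12], [16, 158, 35]]

kappa_three = cohen_kappa(hfa_vs_h2, weights="quadratic")   # assumed weighting
kappa_confirmed = cohen_kappa(collapse(hfa_vs_h2, 2))       # confirmed vs. not
kappa_possible = cohen_kappa(collapse(hfa_vs_h2, 1))        # possible vs. no HFpEF

for name, value in (("three categories", kappa_three),
                    ("confirmed HFpEF", kappa_confirmed),
                    ("possible HFpEF", kappa_possible)):
    print(f"{name}: kappa = {value:.2f} ({concordance_label(value)})")
```

With these inputs, the sketch gives a weighted kappa of about 0.47 for the three-category comparison and unweighted kappas of about 0.21 (confirmed versus not confirmed) and 0.46 (possible versus no HFpEF), in line with the values reported for the HFA-PEFF and H2FPEF comparison.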