Introduction

Heart failure is a clinical syndrome characterized by impaired cardiac pumping and/or filling capacity; alternatively, it is defined as an abnormality in cardiac structure or function that results in inadequate cardiac output or in compensatory neurohormonal activation leading to increased left ventricular filling pressure1. Heart failure affects more than 64 million people globally2. Despite significant advances in the management of heart failure over the past few decades, cardiovascular diseases, including heart failure, remain the leading cause of mortality worldwide3. The reported mortality rate of heart failure following hospitalization is 10.4% at 30 days, 22% at 1 year, and 42.3% at 5 years4. Currently, there are no pragmatic strategies to screen for heart failure. Tests such as echocardiography and radionuclide ventriculography provide valuable insights into the patient’s cardiac condition, but they are impractical as screening methods because of cost and limited availability5. Although plasma levels of N-terminal fragment of pro-brain natriuretic peptide (NT-proBNP) or brain natriuretic peptide (BNP) are promising biomarkers for the detection of heart failure, they may be influenced by various factors, including age, gender, body composition, left ventricular hypertrophy, tachycardia, right ventricular overload, hypoxemia, and renal function6. Prediction and early detection of heart failure help determine personalized medical and surgical interventions for patients, which is crucial to mitigating its impact on quality of life, survival rates, and healthcare expenditures7.

Electrocardiogram (ECG) and phonocardiogram (PCG) signals have been used to screen for cardiac abnormalities for decades8. However, screening for heart failure can be challenging because of the low diagnostic yield of a single ECG or single PCG for detecting an often-fleeting arrhythmia9. A recent study by Chakir et al. demonstrated that integrating ECG and PCG leads yields superior performance for assessing heart conditions compared with relying solely on PCG recordings10. Recent advances in applying artificial intelligence (AI) to ECG analysis have been shown to increase the accuracy of heart failure detection and to support personalized treatment for individual patients11. Lately, a novel technology that detects the ECG and PCG synchronously, called AI-phono-electrocardiogram (AI-PECG), has been applied in several institutions. Whether AI-PECG can improve the detection accuracy of heart failure still requires exploration. In addition, hemodynamic parameters, which can be collected with a non-invasive radial artery blood pressure monitoring device, were reported to be useful surrogate indicators of cardiovascular function12. We hypothesized that the combination of AI-PECG and radial artery indicators might help improve the screening accuracy of heart failure. In this study, we aimed to construct a novel scoring system for heart failure screening based on indicators derived from AI-PECG and radial artery monitoring.

Methods

Study design and population

In this cross-sectional study, participants were patients at first medical contact at the First People’s Hospital of Shangqiu city from February 1 to December 30, 2023. The inclusion criteria were (1) age ≥ 18 years, (2) AI-PECG and radial hemodynamics monitored at the first visit, (3) no previous history of heart failure, (4) complete data on key parameters (ECG, heart sound, radial artery), and (5) a clear diagnosis. Patients who were aged < 18 years, had a clear history of heart failure (confirmed by electronic medical record), or were missing key information such as ECG, heart sound, or radial artery parameters were excluded. The patients’ demographic characteristics were extracted from electronic health records. The features of cardiac electrical and mechanical activity were extracted from the AI-PECG system. Hemodynamic indicators were obtained from non-invasive radial artery blood pressure monitoring devices. The demographic characteristics and the cardiac electrical, cardiac mechanical, and radial artery features were linked by the unique code of the electronic health record.

Primary outcome

In this study, the endpoint of interest was a diagnosis of heart failure, confirmed from the patients’ electronic medical records. Because this study is retrospective, diagnosis information relied on the final diagnosis of heart failure recorded by clinicians during treatment. The study endpoint was assessed by retrospective analysis of electronic medical records between February 1, 2023, and December 30, 2023. Patients who presented for the first time during this period and underwent ECG, phonocardiogram, and radial artery measurements at the initial medical contact were further scheduled for echocardiography and blood biomarker testing if the initial screening results were abnormal. During the study review period, all patients diagnosed with heart failure by a clinician based on follow-up gold standard tests were considered positive events for this study.

Cohort definition and feature selection

All samples were randomly divided into a training set and a testing set at a ratio of 7:3 using the R function “createDataPartition” to ensure that outcome events were distributed randomly between the two data sets (Supplementary Table 1). Data in the training set were used for feature selection and scoring system construction. The testing set was used to validate the performance of the constructed scoring system for heart failure diagnosis. In total, 28 features (3 demographic features, 11 cardiac electrical and mechanical activity features, and 14 radial artery features) were included: age, gender, body mass index (BMI), electromechanical activation time (EMAT), left ventricular systolic time (LVST), mitral valve closure-tricuspid valve closure (M1-T1), aortic valve closure-pulmonic valve closure (A2-P2), third heart sound (S3), fourth heart sound (S4), Axes, P-wave duration (PD), PR interval duration (PRD), QRS duration (QRSD), QT duration (QTD), total voltage of the R wave of V5 + the S wave of V1 (RV5 + SV1), systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), pulse/heart rate, continuous cardiac output (CCO), continuous cardiac index (CCI), stroke volume (SV), stroke volume index (SVI), systemic vascular resistance (SVR), systemic vascular resistance index (SVRI), stroke volume variation (SVV), pulse pressure variation (PPV), cardiac power output (CPO), cardiac power index (CPI), and maximum rate of left ventricular pressure rise (dP/dt-max). Continuous features were converted into binary features using the Youden Index of the univariate logistic model (Supplementary Fig. 1 and Supplementary Table 2). With the Type I error set to 0.05, variables with P < 0.05 in both univariate and multivariate logistic regression analyses of all 28 features were identified as independent risk factors.
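The split and dichotomization steps described above can be illustrated with a minimal Python sketch. The study itself used R, so this is not the authors' code: the function names `stratified_split` and `youden_cutoff`, the seed, and the toy values in the usage note are assumptions for illustration. The sketch also assumes that higher feature values indicate heart failure and searches the cutoff on the raw feature rather than on fitted univariate logistic probabilities (for a single predictor the two rankings coincide up to direction).

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/test, sampling train_frac within each outcome class."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        k = round(train_frac * len(idx))  # number of this class sent to training
        train += idx[:k]
        test += idx[k:]
    return sorted(train), sorted(test)


def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1
    for the binary rule `value >= cutoff` (labels: 1 = event, 0 = no event)."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_j, best_c = -1.0, None
    for c in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= c and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < c and y == 0)
        j = tp / pos + tn / neg - 1
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_j, best_c = j, c
    return best_c, best_j
```

For example, with the toy values `[8, 9, 10, 11, 12, 13, 14, 15]` and labels `[0, 0, 0, 0, 1, 0, 1, 1]`, `youden_cutoff` selects 12 as the threshold, and the feature would then be coded 1 when the value is at least 12 and 0 otherwise.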

Statistical analysis

Normally distributed continuous features were described as mean and standard deviation (Mean ± SD), and the t test was used for comparison between groups. Non-normally distributed features were described as median and quartiles [M (Q₁, Q₃)], and the Mann–Whitney test was used for inter-group comparisons. Categorical features were described as number and percentage of cases [n (%)], and the Chi-square test was used to compare differences between groups. Stepwise regression was used to select variables for the final scoring system from those significant in univariate logistic regression. Each selected feature was assigned a score by rounding its stepwise regression coefficient to the nearest of 0.5, 1, 1.5, 2, or 2.5, and the total risk score was obtained by summing the scores of the features. The process of scoring system construction is shown in Fig. 1. All statistical analyses were conducted using R version 4.3.1.
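The conversion of regression coefficients into points can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the helper names `coefficient_to_points` and `total_score` are hypothetical, the coefficients in the usage note are invented placeholders rather than values from Table 2, and the sketch rounds to the nearest multiple of 0.5 without enforcing the 0.5 to 2.5 range mentioned in the text.

```python
import math


def coefficient_to_points(beta, step=0.5):
    """Round a regression coefficient to the nearest multiple of `step`
    (halves round up), giving the points assigned to that feature."""
    return math.floor(beta / step + 0.5) * step


def total_score(binary_features, points):
    """Sum the points of every feature that is present (coded 1/True)."""
    return sum(points[name] for name, present in binary_features.items() if present)
```

As a usage example with placeholder coefficients, `coefficient_to_points(0.62)` gives 0.5 and `coefficient_to_points(1.30)` gives 1.5; a patient positive for two features worth 1.0 and 1.5 points would receive a total score of 2.5.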

Fig. 1 The process of the construction of the scoring system.

Results

Characteristics of participants

A total of 813 participants with a mean age of 70 years were included in this study, of which 409 (50.31%) were male. 127 individuals were diagnosed with heart failure, and their median EMAT was higher than that of those without heart failure (13.00 vs. 11.00, P < 0.001). The median QRSD in heart failure individuals was higher than that of the non-heart failure individuals (90.00 vs. 80.00, P < 0.001). Additionally, PCG parameters including M1-T1, A2-P2, and S4, ECG parameters including RV5 + SV1, and hemodynamic parameters including CPO were all significantly different between these two groups (all P < 0.05) (Table 1).

Table 1 Basic characteristics of the patients.

Risk score features screening

According to the stepwise regression results in the training cohort, the score features comprised EMAT, M1-T1, A2-P2, S4, RV5 + SV1, SBP, SVV, CPI, and dP/dt-max. In the univariate regression analysis, 13 features (EMAT, M1-T1, A2-P2, S4, QRSD, RV5 + SV1, SBP, HR, SVI, SVV, CPO, CPI, and dP/dt-max) were significantly associated with heart failure. After these were entered into the multivariable logistic regression model, EMAT, M1-T1, A2-P2, S4, RV5 + SV1, SBP, SVV, CPI, and dP/dt-max were identified as independent risk factors for heart failure (Table 2).

Table 2 The score of the features.

Risk score construction and validation

In the training set, the area under the receiver operating characteristic curve (AUC) of the risk score for predicting heart failure was 0.802 (95%CI: 0.744–0.859). The sensitivity was 0.892 (95%CI: 0.864–0.919), and the positive predictive value (PPV) was 0.924 (95%CI: 0.900–0.948). In the testing set, the AUC of the risk score was 0.762 (95%CI: 0.669–0.855), with a sensitivity of 0.917 (95%CI: 0.880–0.955) and a PPV of 0.913 (95%CI: 0.875–0.951) (Table 3 and Fig. 2). After stratification by gender, age, and hypertension, the OR and AUC of the score were all higher in patients with heart failure than in those without heart failure, but the opposite was observed among patients with coronary heart disease (Fig. 3). The risk score was further developed into an online tool via DynNom, which is freely accessible online.
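The validation metrics reported above can be computed from scores and outcomes as in the following minimal Python sketch. It is an illustration under stated assumptions, not the authors' R analysis: the function names `auc` and `sensitivity_ppv` are hypothetical, the threshold is supplied by the caller, and confidence intervals are not computed. Here PPV means positive predictive value, not pulse pressure variation.

```python
def auc(scores, labels):
    """AUC as the probability that a random event scores higher than a random
    non-event, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_ppv(scores, labels, threshold):
    """Sensitivity and positive predictive value for the rule `score >= threshold`."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return tp / (tp + fn), tp / (tp + fp)
```

With the toy scores `[1.0, 2.0, 3.5, 4.0, 0.5, 3.5]` and outcomes `[0, 0, 1, 1, 0, 0]`, the AUC is 0.9375, and a threshold of 3.5 gives a sensitivity of 1.0 and a PPV of 2/3.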

Table 3 The predictive value of the scoring system for heart failure.
Fig. 2 The ROC curve of the scoring system for the screening of heart failure.

Fig. 3 The association between score and risk of heart failure stratified by gender, age, hypertension and coronary heart disease.

The association between the risk score and the risk of heart failure

Participants with heart failure had a higher score than those without. The risk of heart failure increased non-linearly with the score, and the OR of heart failure rose rapidly when the score was > 3.5 points (Fig. 4). The risk of heart failure rose gradually with increasing score but escalated notably in individuals with scores exceeding 3–4 points (Fig. 5). In the crude model, a higher score was a risk factor for heart failure in both the training set (OR = 4.003, 95%CI: 2.935–5.458, P < 0.001) and the testing set (OR = 3.262, 95%CI: 2.080–5.115, P < 0.001). After adjusting for confounding factors, including age, gender, BMI, coronary heart disease, and hypertension, the association between a higher score and an increased risk of heart failure remained significant in both the training set (OR = 4.461, 95%CI: 3.209–6.200, P < 0.001) and the testing set (OR = 3.854, 95%CI: 2.364–6.281, P < 0.001) (Table 4).
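The odds ratios above relate to logistic regression coefficients through OR = exp(β). The following Python sketch shows that arithmetic, assuming a Wald-type 95% interval with z = 1.96; the paper does not state how its intervals were computed, and the function names are hypothetical.

```python
import math


def or_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_odds_from_or(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient and an approximate SE from a reported OR and its CI,
    using the width of the interval on the log scale."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se
```

Applied to the crude training-set estimate (OR = 4.003, 95%CI: 2.935–5.458), `log_odds_from_or` gives a coefficient of about 1.39 and a standard error of about 0.16 per one-point increase in the score, and `or_with_ci` maps these back to approximately the reported interval.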

Fig. 4 The correlation between score and heart failure.

Fig. 5 The association between the score and the risk of heart failure.

Table 4 Association between the score and risk of heart failure.

Discussion

The present study established a risk score based on ECG, PCG, and radial artery features. Our findings demonstrated that the risk score, which included parameters such as EMAT, M1-T1, A2-P2, S4, RV5 + SV1, and SBP, was significantly associated with the risk of heart failure. This risk score showed good predictive value for heart failure in both the total population and different subgroups. Our findings indicate that the risk score can serve as a valuable tool for the early identification of patients at risk for heart failure. By facilitating the timely implementation of interventions, this tool has the potential to significantly enhance patient prognosis. We also developed a scoring system via DynNom, which is freely accessible online (https://reliablemedtec.shinyapps.io/reliablemedtec/).

In this study, we found that features from ECG, PCG, and radial artery, including RV5 + SV1, EMAT, M1-T1, A2-P2, S4, and SBP, performed well for detecting heart failure, which was consistent with previous studies13,14,15,16,17. Notably, RV5 + SV1 has been identified as a significant independent predictor for the development of heart failure18, with a particularly strong association observed in patients with atrial fibrillation19. Furthermore, EMAT, as detected by PCG, performs better than LVEF or QRS duration in predicting heart failure20. The metrics M1-T1 and A2-P2, corresponding to the S1 and S2 heart sounds, respectively, have also been validated in prior studies as effective indicators for heart failure detection, further supporting our results21. Additionally, the fourth heart sound was linked to left atrial function in the study conducted by Ochi et al., which suggested its relevance in heart failure assessment22. Hemodynamic indexes, including SBP, have been identified as valuable and reproducible peripheral criteria of LV systolic performance in heart failure23. Prior studies have shown that there is a continuous positive association between SBP and HF risk in the elderly for levels of SBP as low as < 115 mmHg24. These reports are consistent with the results of our study, suggesting that biomarkers associated with hemodynamic indexes might be useful for heart failure screening. Our results support the integration of ECG, PCG, and hemodynamic features into heart failure screening protocols, potentially enhancing early diagnosis and management strategies for at-risk populations.

Machine learning methods have been widely used in previous studies to detect heart failure25. Applying machine learning techniques to datasets obtained from electrocardiography, vector cardiography, echocardiography, and electronic health records has significant potential to enhance the detection and treatment of heart failure26. Machine learning methods have also been applied to the detection of cardiac abnormalities based on ECG and PCG signals, demonstrating outstanding performance10. However, clinicians appear hesitant to adopt these models in daily practice, possibly because the ECG and PCG signals were not acquired synchronously and the detection process was complex27,28. AI-PECG is a novel technology that acquires and analyzes ECG and PCG synchronously, providing more comprehensive information and more accurate disease screening. In our study, we developed a novel scoring system, based on data from AI-PECG and a non-invasive radial artery blood pressure monitoring device, that performed well in the validation cohort and may be a useful method of screening patients for heart failure. Clinicians can calculate the score from the included features to obtain a participant’s risk of heart failure. Whereas previous studies based on neural network models focused on detecting prevalent systolic dysfunction or heart failure, our study proposes a role for AI-ECG models in predicting heart failure risk. Previous studies have shown that a positive AI-ECG screen for left ventricular systolic dysfunction (LVSD) is associated with future LVSD risk in those with preserved ejection fraction at baseline. However, these studies included a highly selected group of patients with clinical indications for continuous echocardiography, and predictive performance was evaluated in the same sample from which the model was developed29,30. Also, although serum-tested biomarkers, such as NT-proBNP, are independently associated with the risk of new-onset heart failure, their application is limited by the need for blood drawing and laboratory testing, and they are not suitable for early screening of large populations, especially in developing countries31. In contrast, our study included all adults without any prior left ventricular dysfunction or HF, representing a less selective group. Besides, this scoring system builds on existing screening tools and can be integrated into the clinical routine of general practitioners, eliminating the need for additional time-consuming data input. This convenient and inexpensive tool has the potential to be widely used at the grassroots level and could help to alleviate the underdiagnosis of heart failure in clinical practice and provide a reference for timely interventions in patients with heart failure.

In our current study, we also observed good performance of the risk score across different subgroups, despite some minor differences. The AUC in the testing set was lower in females than in males. This finding is consistent with a study by Haukilahti et al., which indicated that ECG-based risk biomarkers for heart failure might differ between women and men32. It is worth noting that the OR and AUC of the score were lower in participants with coronary heart disease than in those without. This could be because ECG, PCG, and hemodynamic parameters are similar in patients with coronary heart disease and patients with heart failure, which interferes with the detection of heart failure. In the future, a separate scoring system might be developed for individuals with coronary heart disease to improve the accuracy of heart failure screening.

The strength of this study is that it was the first to construct a risk score system based on features of ECG, PCG, and hemodynamic parameters. The risk score showed good predictive value for heart failure, providing a convenient tool. Our study has several limitations. Firstly, the data were from a single center, and although the predictive value of the scoring system was verified in the testing set, external validation was not performed. Secondly, echocardiographic parameters and blood test data were not available because of data limitations, and future research should verify the performance of the model with multi-modal data. Thirdly, AI-PECG is not yet widely used in institutions, and individuals without technical knowledge of AI-PECG may not be able to use the risk score system. Caution should be exercised when using the scoring system, and further well-designed studies are needed to verify its performance.

Conclusions

In our study, the risk score including features of RV5 + SV1, EMAT, M1-T1, A2-P2, S4, and SBP for heart failure screening was developed using data from AI-PECG and radial artery, which was significantly associated with risk of heart failure. We also developed a scoring system that showed good performance for heart failure detection, which might offer a novel tool for quick identification of patients with heart failure.