Introduction

More people die each year from cardiovascular disease than from any other cause. It has been estimated that by 2040 roughly 27 million people will die from heart complications1. Sudden cardiac death (SCD), primarily triggered by malignant ventricular arrhythmias (VAs) induced by acute myocardial infarction (AMI), including sustained ventricular tachycardia (VT) and ventricular fibrillation (VF), remains a major unsolved public health challenge2. In addition, while atrial fibrillation (AF) is linked with heart failure, the association between maintenance of sinus rhythm (SR) and clinical outcomes in heart failure with preserved ejection fraction (HFpEF) is unknown and can be examined in the context of cardiac arrhythmia3. Several high-end smart wearable devices have recently been designed that enable end users to track, record, and evaluate the electrocardiogram (ECG) with artificial intelligence (AI) systems and that warn of abnormal heart rhythms, or arrhythmias, a potential problem noted in many SCD cases4. In clinical practice, antiepileptic drugs (AEDs) with the lowest risk of affecting cardiac electrophysiology are recommended for individuals presenting with seizures or arrhythmias. Conventional AEDs should be avoided because of their arrhythmogenic properties and enzyme-inducing effects, which may complicate concomitant treatment with antiarrhythmic drugs (AADs)5. However, arrhythmias may be missed during ECG signal capture, which can make classification erroneous. Hence, both pre-processing (that is, denoising of ECG signals) and classification accuracy need improvement.

Nowadays, one promising mechanism is the self-attention-based system. Self-attention-based systems are advanced deep learning mechanisms that capture and characterize the effects of varying elements6. For example, for a patient with a sleep disorder, it is necessary to analyze the patient’s sleep pattern, which can be done automatically by a deep learning mechanism that compares time-series signals. In the same way, self-attention-based artificial intelligence is well suited to analyzing patients’ ECG signals. Also, through reinforcement learning7, IoT devices can be used to support the security of self-attention systems.

Recently, attention mechanisms have been gaining traction because they select the important parts of the input by assigning them larger weights. In this context, self-attention has proved to be a powerful tool for a wide range of prediction tasks. By establishing predictive relationships among targets in the input data, self-attention can provide detailed insights into specific objects. This capability reflects the versatility of self-attention systems across applications.

The self-attention system has recently gained popularity due to its versatility and wide range of applications. It has proven effective in many tasks, including speech recognition8, speech emotion recognition9, phoneme identification10, and neural machine translation11. This diverse range of applications underscores the adaptability of self-attention systems, making them a valuable asset in medical research and beyond.

Further, deep learning classification methods may be effective for pre-processing echocardiography images into particular views and image orientations, for example by identifying specific heart structures or viewing the heart from multiple projections, to maximize automation and save the human interpreter’s time.

A recent application is image segmentation12, which is a necessary pre-processing step before data are fed to the attention mechanism. Similarly, the effect of arrhythmia can be assessed by denoising ECG signals at the pre-processing stage and then classifying single heartbeats based on QRS-complex and R-R peak identification. With artificial intelligence and machine learning computational models, detection and classification can be achieved with higher accuracy.

Irrespective of the type of application, several considerations need to be understood before AI is used in healthcare. First, ethical concerns have been raised about using artificial intelligence in the care of the aging population and of patients. AI practice is generally understood to comprise two independent branches: the digital and the physical. The digital side is most effectively described using mathematical methods13. Artificial intelligence-powered medical systems are rapidly becoming relevant alternatives for clinical practice. Early diagnosis of atrial fibrillation was among the first applications of AI. AliveCor received FDA approval in 2014 for its mobile application Kardia, which enables smartphone-based ECG monitoring and detection of atrial fibrillation14. Hence, a self-attention AE system is necessary for system automation and seamless analysis.

Self-attention AE is substantially changing our strategy for patient management in various areas of cardiovascular disease. Furthermore, self-attention AE can access digital medical records, screen patients, and employ natural language processing to recognize individuals with particular phenotypes, such as those at high risk of developing cardiovascular diseases or those who may benefit from a specific treatment. Hence, study and research protocols need to be implemented to demonstrate the power of the self-attention mechanism15.

The healthcare sector has long recognized the potential of machine learning, specifically in the early phases of heart disease, where it can support effective patient decisions16. A one-dimensional self-attention CNN with wavelet transform (WT)17 is a good option, and pre-processing and classification can be made faster with the proposed autoencoder system. Several challenges in the domain of cardiac arrhythmias that were overlooked in previous studies are given in the next section.

Challenges

Core challenges in cardiac arrhythmia pre-processing and classification are summarized here:

  1. Cardiac arrhythmia symptoms are not easily recognizable using manual ECG signal analysis and observation methods. Hence, the ECG signal analysis process needs to be automated, starting with denoising of the ECG signals. It is also necessary to improve Kalman filter performance to reduce power line interference in the ECG signal.

  2. The properties of ECG signals depict the heartbeat rhythm. So, to classify cardiac arrhythmia conditions, analysis of common elements of the ECG waveform, such as the QRS complex and R-R interval, with a self-attention mechanism is needed.

  3. The morphology of ECG signals differs with physical activity, so it keeps varying with every heartbeat cycle during activities such as walking, running, or working on a task. This makes cumulative heartbeat analysis harder to separate, and cumulative ECG signal analysis may lead to false prediction of cardiac arrhythmia. It is therefore necessary to classify ECG signals beat by beat, using the QRS complex and R-R interval at steady state.

  4. On the technological front, the paper-roll ECG report gives only rough observations of heart function and includes noise and artifacts. In IoT healthcare products, power line interference (PLI) is a widely noted major problem. PLI changes the ECG signal morphology and adds noise to the original waveform, which leads to erroneous interpretation. Hence, PLI must be removed to obtain a clear ECG signal interpretation.

  5. Researchers have developed numerous ECG signal analysis models over the past decade; however, the accuracy of classification and prediction of cardiac arrhythmia still needs to increase. A self-attention system is needed to analyze the components of a single-heartbeat ECG signal so that the state of the heartbeat can be detected automatically during any patient activity. Wearable healthcare devices must be coordinated with ML/DL frameworks for fully automated cardiac arrhythmia analysis.

Research Contributions

While numerous studies have addressed ECG analysis, integrating the self-attention mechanism and deep learning model development with ECG signal pre-processing remains largely unexplored, although it is important for precise analysis of ECG signal abnormalities. In the proposed work, we introduce a self-attention-based AE algorithm that departs from current research, with the following key contributions:

  • As a pre-processing step, we implement a modified Kalman filter (MKF) to denoise the ECG signal, which leads to clear visibility of the waveform components (P, Q, R, S, T).

  • Following pre-processing, we implement a deep learning model with the proposed novel self-attention-AE algorithm.

  • We then classify the ECG signals for cardiac arrhythmia prediction.

  • Finally, we compare the performance of the proposed modified Kalman filter and self-attention-AE algorithm with existing models.

The proposed self-attention-AE algorithm will execute pre-processing by denoising ECG signals using a novel modified Kalman filter, which can provide more clarity to the QRS-complex and R-R peak cycle. Further, ECG signal classification is accomplished to identify cardiac arrhythmia. The classification result predicts the various arrhythmic conditions, which is helpful for early medical treatment to avoid possible health risks. Most importantly, medical experts can understand the state of the single heartbeat using the QRS complex and R-R peak analysis of the ECG waveform, which otherwise acts unpredictably due to various patient conditions. This potential for early detection and understanding cardiac conditions should inspire optimism in the medical community.

This paper is structured to present the application of artificial intelligence in the healthcare domain, particularly in the context of ECG signal analysis. Section 2 offers a literature review; Section 3 presents the proposed methodology and mathematical model; Section 4 provides the results and discussion, including a comparative analysis; Section 5 discusses the study’s limitations; and Section 6 offers conclusions and directions for future research.

Related works

The healthcare sector is harnessing various AI, ML, and DL algorithms for ECG analysis. This technological leap is not just about innovation but about saving lives. By enabling preventive health measures and facilitating early treatment decisions for patients with cardiac arrhythmia, these automated systems are proving to be life-saving and surpass the traditional analysis approach.

The heart muscle requires a dedicated electrical activation system to produce the contraction and relaxation needed to pump blood to the various parts of the human body. Recently, two main approaches have been used for the automatic analysis of heart disorders based on ECG signals. These are QRS complex classification, a process that involves identifying and categorizing the QRS complex, a key feature of the ECG signal, and the evaluation of longer segments of the ECG signal18.

Furthermore, the Pan-Tompkins algorithm, a significant tool in ECG signal analysis, has been widely used to detect R-peaks in corrupted ECG signals19. This algorithm, known for its robustness and accuracy, plays a significant role in identifying the peaks of the QRS complex, a key indicator of heart health. Similarly, adaptive filtering and template matching have enhanced anomaly detection performance. For denoising, the authors of20 applied the continuous wavelet transform with a selective scale to detect the R-peak points, whereas the discrete wavelet transform was used in21 to denoise ECG signals.
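The core idea behind Pan-Tompkins-style R-peak detection (differentiate, square, integrate over a moving window, then threshold) can be illustrated with a short sketch. This is a simplified illustration and not the full algorithm of19: it omits the band-pass stage and the adaptive thresholds, and the function name, window length, refractory period, threshold ratio, and synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np


def detect_r_peaks(ecg, fs_hz, window_s=0.15, refractory_s=0.25, threshold_ratio=0.5):
    """Return sample indices of R-peaks (simplified Pan-Tompkins-style sketch)."""
    ecg = np.asarray(ecg, dtype=float)
    # The derivative emphasizes steep QRS slopes; squaring makes it positive.
    energy = np.diff(ecg, prepend=ecg[0]) ** 2
    # Moving-window integration merges the energy into one bump per QRS complex.
    win = max(1, int(round(window_s * fs_hz)))
    integrated = np.convolve(energy, np.ones(win) / win, mode="same")
    # Fixed threshold (assumption); the original algorithm adapts it over time.
    above = integrated > threshold_ratio * integrated.max()
    refractory = int(round(refractory_s * fs_hz))
    peaks, k, n = [], 0, len(ecg)
    while k < n:
        if above[k]:
            end = k
            while end < n and above[end]:
                end += 1
            # Take the largest ECG sample inside the bump as the R-peak.
            peaks.append(k + int(np.argmax(ecg[k:end])))
            k = end + refractory  # skip the refractory period
        else:
            k += 1
    return peaks


# Synthetic test signal (hypothetical): one narrow pulse per second at 360 Hz.
fs = 360.0
t = np.arange(int(10 * fs)) / fs
beat_times = np.arange(0.5, 10.0, 1.0)
ecg = sum(np.exp(-0.5 * ((t - bt) / 0.01) ** 2) for bt in beat_times)
r_peaks = detect_r_peaks(ecg, fs)
```

On real recordings the fixed threshold would need to be replaced by an adaptive one, and a band-pass filter would normally precede the derivative step.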

Technological significance

ECG is a crucial non-invasive analysis approach for recognizing numerous heart conditions. AI-driven models enable electrophysiologists to rapidly determine the origins of arrhythmias and enhance mapping and ablation procedures. However, challenges remain regarding dataset size, generalizability, and validation. Careful attention to validation and translation, addressing potential biases, and fostering inter-institutional collaboration is crucial to ensure AI’s safe and effective deployment in cardiac electrophysiology22,23.

Machine learning, a subtype of AI, is significantly impacting healthcare. AI offers hope in regions grappling with overburdened healthcare systems and a shortage of skilled medical professionals. Its potential to analyze diverse data, uncover associations, provide interpretations, and identify patterns can significantly enhance classification models’ integrity, performance, predictability, and precision for various health conditions24,25,26.

The author carried out DL model training and validation on the MIT-BIH database, a widely used benchmark in cardiac electrophysiology, by applying pre-trained ResNet-50 and AlexNet models, which were then fine-tuned to achieve optimum classification outcomes. The model’s overall performance was analyzed with a confusion matrix using F-measure, recall, precision, and accuracy27. Another author suggested an LSTM-based framework to boost the effectiveness of ECG signal classification. The technique allows ECG signals from the MIT-BIH arrhythmia database to be input directly into the model without intricate pre-processing and analysis, and it achieved an accuracy of 98.57%28.

The author demonstrated the promising potential of GAN and LSTM in heart disease prognosis. The GAN model, used to augment the data with additional synthetic samples, significantly outperformed individual models when combined with LSTM in an ensemble. This ensemble model achieved an accuracy of 0.992 and an F1 score of 0.897, which supports the potential of these models in cardiac electrophysiology.

The author addressed the challenges of overlapping R-peaks, low ECG intensity, and organized and normal noises, for which common signal extraction options such as adaptive filters29, independent component analysis, and scientific decomposition often fail to produce an acceptable ECG. While some methods can generate appropriate QRS complexes, they often overlook other essential components of the ECG. Based on the 1D CycleGAN30, the proposed methodology can reconstruct the overlapping ECG impulses while preserving the expected morphology for comprehensive pre-processing31. This approach is likely to interest researchers and professionals in the field.

Sparse coding, a powerful tool in signal processing, involves reconstructing an input using a sparse combination of signals. In31, the author employed the Iterative Shrinkage Algorithm with a Kalman filter32,33,34. This approach, executed as an unfolded deep CNN, is effective for any signal with a sparse representation. The experimental results on noise elimination and target mobility support the effectiveness of the method. Furthermore, the potential for upgrading the unfolded deep CNN with the Kalman filter and K-fold recurrent testing adds to the method’s robustness in evaluating the state of cardiac arrhythmia.

In35, the authors developed a strategy to assist e-healthcare platforms in cardiac signal compression for efficient data exchange. The principal component analysis (PCA) compression technique reduces data redundancy, which improves data transfer. The authors applied the Savitzky-Golay (S-Golay) filter36,37,38 to pre-process the original signals. This filter, known for its ability to preserve the signal’s shape, maintains the accuracy of the cardiac signal. The wavelet transform39 is then used for thresholding, further refining the details of the cardiac signal.

Self attention techniques

According to recent AI, ML, and DL evaluations, self-attention analysis is needed to improve classification accuracy and prediction performance. The self-attention approach is illustrated in Fig. 1, which shows image generation by an adversarial network and the transformations that occur when the self-attention module is placed at different points of the network. To support the training procedure, the author of that work proposed spectral normalization together with a network that performs the discrimination task, and investigated the effect of the learning rate on model performance. Nowadays, young people are increasingly prone to cardiac arrest, so there is a need for an early detection system that analyzes warning signals. AI-based early detection is an enhanced method of ECG analysis that helps to identify sudden cardiac arrest (SCA) in people with no previously known heart condition. The self-attention mechanism in deep learning has already shown its potential: it helps to prioritize and analyze the important feature dependencies in ECG waveforms, which improves performance and reduces the manual effort and time needed to interpret such complex, error-prone input. Evidence for the success of this approach includes a study at Cedars-Sinai Medical Center in which ECG patterns were used to predict SCA risk and AI algorithms outperformed existing non-AI approaches. Neural network techniques such as CNNs and RNNs with self-attention layers have advanced arrhythmia detection, achieving high sensitivity and accuracy across diverse datasets (Frontiers in Cardiovascular Medicine, Springer)40,41.

Fig. 1

Representation of self-attention mechanism42.
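To make the weighting idea concrete, the following is a minimal sketch of generic scaled dot-product self-attention applied to a short sequence of feature vectors (for instance, samples or segments of one heartbeat). It is not the adversarial-network module of Fig. 1 nor the Self-attention-AE model of this paper; the projection matrices, dimensions, and random inputs are hypothetical and serve only to show how each time step is re-expressed as a weighted sum of all time steps.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention.

    x has shape (T, d_in). Returns the attended output of shape (T, d_v)
    and the attention weights of shape (T, T), one row per query step.
    """
    q = x @ w_q                      # queries
    k = x @ w_k                      # keys
    v = x @ w_v                      # values
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
T, d_in, d_k = 8, 4, 3
x = rng.normal(size=(T, d_in))
w_q, w_k, w_v = (rng.normal(size=(d_in, d_k)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` sums to one, so a row can be read as how strongly that time step attends to every other time step in the sequence.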

The authors formulated a generative adversarial network (GAN)42 based deep learning method called HeartNet to deal with the data deficit issue. The suggested approach comprises a CNN with a multi-head attention layer. The primary concern of insufficient data labels is resolved by adversarial data synthesis, which employs a GAN to obtain supplemental training samples. The authors evaluated the technique on the MIT-BIH dataset and attained an accuracy of 99.67%43.

The author suggested an attention mechanism-enabled bi-directional long short-term memory (ABLSTM)44 model to learn and predict ECG-based problems. The model extracts features from long- and short-term signal data and uses the adaptive learning of attention mechanisms on localized features to extract intricate features of ECG signals efficiently and carry out diagnosis. The author also pre-processed the ECG signals. The model proved its efficiency with an accuracy of 96.6%.

Another self-attention-based work uses a U-Net architecture with residual connections, in which the network adopts a U-shaped symmetrical structure, enhances the convolution module, and uses skip connections. The stability of the network is maintained through the judicious use of cross-entropy and Dice loss44.

To overcome the lack of the extensive analysis data required for pre-examination, the author(s) presented a Cardiac RT-NN model that combines a self-attention-based CNN with LSTM components to categorize numerous cardiac problems effectively using tailored ECG structures. The real-time diagnosis capability was confirmed by the authors on a Raspberry Pi 4 and showed higher efficiency45.

The author recommended a novel dual-stage segmentation approach that combines a CNN with a post-processing step. The first stage uses a CNN boosted by a gated self-attention system to segment the principal heart components and two main vessels. The recommended approach shows improvements of 1.02%, 1.04%, and 1.41% in Dice coefficient, Intersection over Union, and Hausdorff distance, respectively, for heart element segmentation46.

The significance of signal quality and precise diagnosis47 has been highlighted by recent developments in ECG analysis48 for cardiac arrhythmia prediction47. Research has emphasized the importance of feature extraction49 and noise reduction50 methods in improving the identification of cardiac arrhythmias51. It has been shown that combining machine learning algorithms with ECG data may lead to more accurate and timely detection52, which will help with the early diagnosis and treatment of cardiac problems53,54. These developments highlight how crucial high-quality ECG analysis is becoming for clinical decision-making55,56,57.

The proposed self-attention-based autoencoder strategy for better arrhythmia classification is in line with the authors’58 emphasis on the benefits of integrating dynamic ECG and serum indicators for early AMI diagnosis59. Similarly, the authors of60 suggest an effective hybrid strategy for ECG denoising, which is essential for improving the performance of machine learning models in arrhythmia identification. Furthermore, basic knowledge that may support ECG-based models is provided by insights into cardiac illnesses and treatment strategies, such as those presented by61 and62,63. Recent studies also investigate the integration of multimodal diagnostic approaches for more thorough treatment of cardiac disease, in addition to improvements in ECG data processing64,65. Together with the developments in microscopy by65,66, these investigations provide a more comprehensive framework for incorporating cutting-edge diagnostic methods into the treatment of cardiac disease. Using these state-of-the-art techniques, medical practitioners may better comprehend heart issues67, allowing for earlier intervention and more successful treatment regimens. Combining imaging technologies, molecular diagnostics68, and ECG analysis offers a viable path toward transforming the diagnosis and treatment of cardiac disorders69, eventually leading to better patient outcomes.

Methodology

The proposed methodology is designed, developed, and executed sequentially. First, ECG data acquisition is carried out, followed by the pre-processing stage, in which several tasks are performed, such as denoising to reduce power line interference using a newly developed modified Kalman filter (MKF). The proposed methodology workflow is represented in Fig. 2.

Fig. 2

Proposed Methodology Workflow.

Next, the denoised signal data is fed to the proposed algorithm for classification, and the classification stage is followed by prediction of cardiac arrhythmia.

Proposed model architecture and learning

The proposed research aims to boost the model’s accuracy in predicting cardiovascular diseases. Hence, we used pre-processed ECG time series signal data, on which the model must be trained sequentially. In this data, sequential elements are interrelated through intricate semantic and syntactic rules. So, we applied an RNN, which is trained to map a sequential data input to a specific sequential data output.

Further, the system is tested with the proposed autoencoder using the new Self-attention-AE algorithm. We extracted the ECG time series signals from the MIT-BIH Arrhythmia dataset for testing and analysis of the proposed system, and pre-processed ECG signals of patients with cardiac dysfunctions were selected for random testing. This is the most popular dataset for ECG analysis; it was introduced by MIT and Beth Israel Hospital in 1980. Each recording contains approximately 30 minutes of annotated two-lead ECG from patients with and without arrhythmia, for use in detection and classification. Expert cardiologists manually labeled the beats, and the annotations include timestamps, beat types, and rhythm changes. The signals are sampled at 360 Hz with 11-bit resolution over a 10 mV range, which captures fine details of the ECG waveform and is sufficient for ECG analysis. The 48 recordings come from 47 different subjects, 25 men aged 32 to 89 years and 22 women aged 23 to 89 years, which keeps the age range diverse. The signals cover about 100,000 annotated heartbeats of types such as normal sinus rhythm, premature ventricular contractions (PVCs), supraventricular premature beats, and atrial fibrillation (AFib), with a wide range of arrhythmias such as ventricular ectopy and supraventricular ectopy. The dataset includes artifacts, for example muscle noise, electrode motion, and electrical noise; hence proper denoising is required to improve signal quality. Despite being over four decades old, its robust design and comprehensive annotations make it valuable for research applications. It is used with ML and DL techniques for training, validation, and benchmarking in arrhythmia classification, signal quality assessment, real-time ECG monitoring, and signal processing.
The dataset is well suited to testing filtering and denoising methods, for example wavelet transforms and Kalman filters, and feature extraction such as QRS detection and P-wave analysis. Systems developed on this dataset are deployed in ECG devices and healthcare systems and can be validated with data from wearable devices.
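As a quick sanity check on the acquisition figures quoted above, the short calculation below derives the nominal quantization step, the number of samples per channel in one recording, and the highest representable frequency. It assumes that the 10 mV range maps uniformly onto the 11-bit scale and that a recording lasts exactly 30 minutes; the constant names are illustrative.

```python
# Figures stated in the text for the MIT-BIH Arrhythmia recordings.
SAMPLING_RATE_HZ = 360
ADC_BITS = 11
ADC_RANGE_MV = 10.0
RECORD_MINUTES = 30  # approximate duration per recording (assumed exact here)

adc_levels = 2 ** ADC_BITS                     # number of quantization levels
step_uv = ADC_RANGE_MV * 1000.0 / adc_levels   # nominal step size in microvolts
samples_per_channel = SAMPLING_RATE_HZ * RECORD_MINUTES * 60
nyquist_hz = SAMPLING_RATE_HZ / 2              # highest frequency without aliasing

print(f"{adc_levels} levels, {step_uv:.2f} uV per step")
# prints "2048 levels, 4.88 uV per step"
print(f"{samples_per_channel} samples per channel, Nyquist {nyquist_hz:.0f} Hz")
# prints "648000 samples per channel, Nyquist 180 Hz"
```

A Nyquist limit of 180 Hz means that both 50 Hz and 60 Hz power line interference fall well inside the sampled band, which is why they must be removed by filtering rather than by the sampling itself.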

To conduct a detailed time series analysis of the ECG signals from the MIT-BIH Arrhythmia dataset, the following steps are followed; the explanation given with each step contributes to a deeper understanding of the data.

1. Pre-processing of ECG signals. Denoising: ECG signals often contain noise such as power line interference, motion artifacts, and baseline wander. Filters are used to remove it (for example, high-pass filters for baseline correction and low-pass filters for high-frequency noise); wavelet denoising or Savitzky-Golay filtering may also be effective. Resampling: recordings with inconsistent sampling rates are resampled to a uniform rate (typically 360 Hz for MIT-BIH). Normalization: signal amplitudes are scaled to a uniform range across the dataset.

2. Time-domain analysis. R-peak detection: algorithms such as Pan-Tompkins or wavelet transforms are used to detect R-peaks, which are crucial for analyzing heart rhythms. Heart rate variability (HRV), an important measure of the variation in heart rate, is computed from the RR intervals (the intervals between successive R-peaks) using metrics such as the mean RR interval, the root mean square of successive differences (RMSSD), and the standard deviation. Wave segmentation: after these statistics are calculated, the individual ECG components, such as the P-wave, QRS complex, and T-wave, are identified using derivative-based or template-matching methods.

3. Feature extraction. Statistical and morphological features are extracted, such as time-domain features (RR intervals, QRS duration, PR interval, and QT interval) and amplitude features (peak amplitudes of the P, QRS, and T waves). Some frequency-domain features are also investigated, such as the power in different frequency bands. The ECG signals are then transformed into feature vectors for further analysis, such as classification or clustering.
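The HRV statistics named in step 2 can be computed directly from R-peak positions. The sketch below assumes that R-peaks are already available as sample indices at 360 Hz; the function names and the example peak positions are hypothetical, and the standard deviation is taken as the sample standard deviation of the RR intervals.

```python
import math

FS_HZ = 360  # sampling rate stated for the MIT-BIH recordings


def rr_intervals_ms(r_peak_samples, fs_hz=FS_HZ):
    """Convert successive R-peak sample indices into RR intervals in milliseconds."""
    return [(b - a) * 1000.0 / fs_hz
            for a, b in zip(r_peak_samples, r_peak_samples[1:])]


def hrv_metrics(rr_ms):
    """Mean RR, standard deviation of RR, RMSSD, and mean heart rate."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr_ms) / n
    # Sample standard deviation of the RR intervals.
    sd_rr = math.sqrt(sum((r - mean_rr) ** 2 for r in rr_ms) / (n - 1))
    # Root mean square of successive differences between adjacent RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {
        "mean_rr_ms": mean_rr,
        "sd_rr_ms": sd_rr,
        "rmssd_ms": rmssd,
        "mean_hr_bpm": 60000.0 / mean_rr,
    }


# Hypothetical R-peak positions (sample indices), for illustration only.
r_peaks = [0, 360, 738, 1098, 1440]
rr = rr_intervals_ms(r_peaks)   # [1000.0, 1050.0, 1000.0, 950.0]
metrics = hrv_metrics(rr)
```

For these example peaks the mean RR interval is 1000 ms (60 beats per minute) and the RMSSD is 50 ms; such values can be appended to the feature vector described in step 3.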

The component architecture of the proposed model is shown in Fig. 3, where ‘Q’ is the quantizer to achieve encoding and decoding.

Fig. 3

Proposed Model Component architecture.

Newer data indicate that early rhythm control therapy, begun soon after the diagnosis of AF, can improve cardiovascular disease (CVD) outcomes and mortality in patients with AF and CVD risk factors. These results will change the strategy and practice of rhythm management. Rhythm management should incorporate determined efforts at rhythm control in patients with new AF; however, it also includes rate control to prepare patients for recurrences of AF. Hence, it is necessary to enhance classification performance by developing a new algorithm.

In addition, the QRS complex and the R-R interval are important features among the ECG signal components. They reflect the heart’s electrical activity during ventricular contraction, and their timing and pattern offer valuable details about the heart’s current state. The usual ECG waveform is presented in Fig. 4.

Fig. 4

Representation of the ECG signal interval.

QRS-complex identification (and thus R-R interval identification) provides the basis for almost every ECG cycle analysis strategy. In addition, earlier research confirms that in automatic ECG classification, a DL approach that focuses on the structure of the ECG signal components is crucial for particular cardiac conditions. To extract the R-R interval from the ECG, R-peaks must be identified correctly by applying QRS-complex recognition. The accuracy of existing classifiers also needs to be improved. Hence, a new algorithm named the ‘Self-attention-AE’ algorithm with the Adam optimizer was developed in the proposed research. The proposed algorithm automatically fine-tunes hyperparameters until the accuracy improves. The Adam optimizer is used with a densely connected neural network in which every layer is connected to every other layer and every layer output is also treated as a feedback input. For example, with three layers, the output of layer 1 is fed to layer 2 as input and is also fed back to be summed with the outputs of the two subsequent layers. At this point, the Adam optimizer can evaluate all hyperparameters for optimum results to improve the overall effectiveness of the proposed ‘Self-attention-AE’ model.

Fig. 5

Proposed Self-attention-AE Model.

The MIT-BIH dataset is the most widely accepted dataset worldwide, so we trained and tested the proposed model on it. It consists of 48 two-channel ambulatory ECG recordings. A total of 15 beat annotations representing various arrhythmias are assigned to the P, Q, R, S, and T peaks of the ECG heartbeats. The data are pre-processed using the proposed method and divided into two sets for training and testing, and random division is applied to the entire dataset. The pseudo-code for executing the proposed self-attention-AE model is shown in Fig. 5.

The proposed system combines LSTM with the Self-attention-AE algorithm to enhance the accuracy of cardiovascular disease prediction; the autoencoder encodes and decodes the values to minimize reconstruction error. Training and validation are executed repeatedly to verify the increase in accuracy for different threshold values. The proposed model is trained to lower the reconstruction loss by minimizing the difference between the input P and its reconstruction P’. For real-valued inputs, the reconstruction loss is given by the mean squared error (MSE). The dimensionality of Q is usually less than that of P; hence, autoencoders are also called bottleneck neural networks. The self-attention-AE algorithm eliminates this bottleneck in the proposed system through recurrent training and validation. Lastly, the overall self-attention-based system is evaluated by comparing it with the results of the existing system. A step-by-step discussion follows in the subsequent sections.
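The reconstruction objective described above can be illustrated with a toy bottleneck autoencoder. This is a sketch under stated assumptions, not the proposed Self-attention-AE model: the encoder and decoder are single untrained linear maps with random weights, `q` stands for the lower-dimensional code, and the beat length and code length are hypothetical. It only shows how the MSE between an input P and its reconstruction P’ is computed.

```python
import numpy as np


def mse(p, p_hat):
    """Mean squared error between an input and its reconstruction."""
    p = np.asarray(p, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    return float(np.mean((p - p_hat) ** 2))


def encode(p, w_enc):
    """Map inputs to a lower-dimensional code (the bottleneck)."""
    return np.tanh(p @ w_enc)


def decode(q, w_dec):
    """Map the code back to the input dimension."""
    return q @ w_dec


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(42)
beat_len, code_len = 32, 8
p = rng.normal(size=(5, beat_len))                    # five example beats
w_enc = rng.normal(scale=0.1, size=(beat_len, code_len))
w_dec = rng.normal(scale=0.1, size=(code_len, beat_len))

q = encode(p, w_enc)        # code with fewer dimensions than the input
p_hat = decode(q, w_dec)    # reconstruction P'
loss = mse(p, p_hat)        # quantity that training would minimize
```

Training would adjust `w_enc` and `w_dec` (for example with the Adam optimizer mentioned earlier) so that `loss` decreases; a perfect reconstruction gives an MSE of zero.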

ECG data acquisitions

In the proposed research, the MIT-BIH Arrhythmia database was obtained from https://www.physionet.org/. The recordings in this database were drawn from a set of over 4000 long-term ECG recordings41.

Proposed power line interference denoising method

PLI is a type of noise present in ECG data, triggered by the electrical source current flowing in wires, cables, and power lines. The PLI in ECG signals is composed of harmonics, so an appropriate filter was selected to reduce them. Before proceeding with the self-attention-AE model, a modified Kalman filter was applied to resolve the problem of PLI, as discussed in Section 3.3.1. Hence, denoising is achieved during the preprocessing of the ECG signals.

Mathematical model

Generally, the discrete wavelet transform (DWT), the notch filter, and the Kalman filter are used for denoising. However, because the standard Kalman filter response still leaves PLI after attenuation, we modified the Kalman filter to improve signal quality when denoising ECG signals. Hence, the proposed research developed a modified Kalman filter for denoising ECG signals, and its performance is compared with that of a notch filter. The proposed modified Kalman filter module aims to reduce the error in the Kalman filter to achieve a more accurate denoised ECG signal. We start by reducing the state estimation error, as is standard in the Kalman filter. Subsequently, we focus on reducing the sensitivity of the state prediction error, which significantly reduces the error. Treated as a random variable, this sensitivity is minimized by reducing its squared mean and the trace of its covariance.

PLI is readily identifiable as an interfering voltage in the ECG with a frequency of 50 or 60 Hz. The disturbance may be caused by stray effects of the alternating current through loops in the patient's cables. The proposed method uses a wavelet transform together with the MKF processing to cancel PLI through time-frequency analysis of non-stationary signals. The following sections discuss the analytical model for the dynamic system and the Kalman filter. PLI noise can be modeled as a sum of sinusoids with unpredictable phase and amplitude; a single tone is written as:

$$\begin{aligned} S_n = A \cos \left( 2\pi n \frac{f_m}{f_s} + \theta \right) \end{aligned}$$
(1)

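Here \(A\) is the amplitude, \(\theta\) is the phase, \(f_m\) is the PLI frequency, and \(f_s\) is the sampling frequency. Adding an error term \(\xi_n\) to equation 1 and applying a trigonometric identity, the model can be represented as in equation 2.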

$$\begin{aligned} S_{n+1} + S_{n-1} = 2 \cos \left( 2\pi \frac{f_m}{f_s} \right) S_n + \xi_n \end{aligned}$$
(2)
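The recurrence in equation 2 follows from the identity \(\cos(a+b) + \cos(a-b) = 2\cos(a)\cos(b)\) with \(a = 2\pi n f_m/f_s + \theta\) and \(b = 2\pi f_m/f_s\). As a quick numerical check, the sketch below samples a pure tone and measures how far it deviates from the recurrence when \(\xi_n = 0\). The function names and the values (60 Hz interference, 360 Hz sampling, amplitude 0.3) are assumptions chosen for illustration.

```python
import math


def pli_samples(amplitude, f_m, f_s, theta, count):
    """Sample S_n = A*cos(2*pi*n*f_m/f_s + theta) for n = 0..count-1."""
    return [amplitude * math.cos(2 * math.pi * n * f_m / f_s + theta)
            for n in range(count)]


def max_recurrence_residual(samples, f_m, f_s):
    """Largest |S_{n+1} + S_{n-1} - 2*cos(2*pi*f_m/f_s)*S_n| over the sequence."""
    k = 2 * math.cos(2 * math.pi * f_m / f_s)
    return max(abs(samples[n + 1] + samples[n - 1] - k * samples[n])
               for n in range(1, len(samples) - 1))


# Assumed illustrative values: 60 Hz interference sampled at 360 Hz.
S = pli_samples(amplitude=0.3, f_m=60.0, f_s=360.0, theta=0.7, count=200)
RESIDUAL = max_recurrence_residual(S, f_m=60.0, f_s=360.0)
```

For a pure tone the residual is at the level of floating-point rounding; evaluating the recurrence with a mismatched frequency (for example 50 Hz) gives a clearly non-zero residual, which is why the recurrence is useful for tracking a tone of known frequency.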

PLI noise does not change abruptly in phase or amplitude. The recorded signal is therefore a mixture of the pure ECG, other undesired disturbances, and the PLI noise itself, as given in equation 3:

$$\begin{aligned} Y_n = S_n + W_n \end{aligned}$$
(3)

where \(S_n\) is the PLI component (at 50 or 60 Hz) and \(W_n\) is a zero-mean random term that represents all signals and noise other than the PLI. The proposed work considers PLI noise at 50 and 60 Hz only; other biosignals and disturbances are not modeled separately. To make the PLI noise trackable with the KF technique, equations 2 and 3 are rewritten in state-space form, as given in equations 4 and 5:

$$\begin{aligned} & x_{n+1} = C x_n + c\, g_n \end{aligned}$$
(4)
$$\begin{aligned} & Y_n = d^T x_n + v_n \end{aligned}$$
(5)

The Kalman filter operates on a system modeled in state-space form, which includes a state-update equation for tracking the internal states of the system and a second equation that relates the measured signal to those internal states. The Kalman filter is therefore used to track and filter out the PLI by designing a model based on the state-space representation of the system. In equation 4, \(x_{n+1}\) is the state vector at time step \(n+1\), \(C\) is the state transition matrix that describes how the system evolves over time, \(x_n\) is the state vector at the current time, and \(g_n\) is the input noise or disturbance affecting the state transition (weighted by the input vector \(c\)). Equation 5 is known as the observation equation: \(Y_n\) is the observed signal (e.g., the noisy signal) at time n, \(d^T\) is the observation matrix that maps the state vector \(x_n\) to the observed signal, and \(v_n\) is the measurement (observation) noise. The Kalman filter estimates the state vector \(x_n\), which represents the clean ECG signal and the PLI components. The output \(Y_n\) is then adjusted to remove the tracked PLI components, leaving a denoised ECG signal.

Applying a noisy ECG signal to this model yields an estimate of the power line interference, which is then removed to obtain the denoised signal.

ECG pre-processing using modified Kalman filter

A modified Kalman filter is a mathematical procedure that estimates the state of a noisy system. The filter has two main components. The first is prediction, which uses the state transition model to predict the current state from the prior state and, if applicable, a control input. The second is the update, which corrects the predicted value using the measurement. Equation 6 shows the prediction calculation.

$$\begin{aligned} \hat{X}_k^- = F_k \hat{X}_{k-1} + B_k u_{k-1} \end{aligned}$$
(6)

The state transition matrix \(F_k\) represents the relation between the current state and the previous state; similarly, the control input matrix \(B_k\) maps the control input to the state. During the prediction phase, the error covariance matrix is computed to capture the uncertainty in the predicted state estimate, as given in equation 7:

$$\begin{aligned} P_k^- = F_k P_{k-1} F_k^T + Q_k \end{aligned}$$
(7)

The noise covariance matrix \(Q_k\) represents the uncertainty of the state transition model. The measured and predicted states are combined to provide the updated state estimate and error covariance matrix; the state update is shown in equation 8.

$$\begin{aligned} \hat{X}_k = \hat{X}_k^- + K_k (Y_k - H_k \hat{X}_k^-) \end{aligned}$$
(8)

In equation 8, \(Y_k\) is the measurement at time k, and the observation matrix \(H_k\) relates the state to the measurements. The gain \(K_k\) of the modified Kalman filter determines the weight assigned to the measurement relative to the predicted state estimate when forming the updated state. Equation 9 is used to update the covariance matrix.

$$\begin{aligned} \hat{P}_k = (I - K_k H_k)P_k^- \end{aligned}$$
(9)

where I represents the identity matrix. Once the dynamic model is defined, the KF is applied to determine the PLI using equations 10 to 13.

Time Propagation given by:

$$\begin{aligned} P_{n+1}^- = C P_n^+ C^T + q_n c c^T \end{aligned}$$
(10)

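The Kalman gain \(H_n\) is given by the following, where \(K_n\) in the denominator denotes the measurement noise variance:

$$\begin{aligned} H_n = \frac{P_n^- d}{d^T P_n^- d + K_n} \end{aligned}$$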
(11)

Measurement Propagation is given by:

$$\begin{aligned} \hat{X}_n^+ = \hat{X}_n^- +H_n [Y_n-d^T\hat{X}_n^-] \end{aligned}$$
(12)

The error covariance is then updated as:

$$\begin{aligned} P_n^+ = P_n^- - H_n d^T P_n^- \end{aligned}$$
(13)

After dual denoising with the KF is applied to the noisy ECG signal, the tracked PLI is subtracted from the noisy ECG to obtain the filtered ECG signal. The procedure is explained in Fig. 6.
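The propagation, gain, and measurement steps of equations 10 to 13 can be sketched for a single interference tone as follows. This is a minimal illustration of a standard Kalman tracker, not the proposed modified filter. It assumes the state \(x_n = [S_n, S_{n-1}]^T\), the transition matrix \(C = [[k, -1], [1, 0]]\) with \(k = 2\cos(2\pi f_m/f_s)\) taken from equation 2, and \(c = d = [1, 0]^T\). The function name, the noise variances `q` and `r`, and the slow sine used as a stand-in for a clean ECG are all assumptions.

```python
import math


def track_and_remove_pli(noisy, f_m, f_s, q=1e-4, r=1.0, p0=1.0):
    """Track a tone at f_m with a two-state Kalman filter and subtract it.

    q: assumed process-noise variance, r: assumed measurement-noise variance,
    p0: assumed initial error variance on each state.
    """
    k = 2.0 * math.cos(2.0 * math.pi * f_m / f_s)
    x0, x1 = 0.0, 0.0                      # state estimate [S_n, S_{n-1}]
    p00, p01, p10, p11 = p0, 0.0, 0.0, p0  # error covariance entries
    denoised = []
    for y in noisy:
        # Time propagation (eq. 10): x^- = C x^+, P^- = C P^+ C^T + q c c^T
        xp0, xp1 = k * x0 - x1, x0
        a = k * p00 - p10
        b = k * p01 - p11
        m00, m01 = k * a - b + q, a
        m10, m11 = k * p00 - p01, p00
        # Kalman gain (eq. 11): H = P^- d / (d^T P^- d + r)
        s = m00 + r
        h0, h1 = m00 / s, m10 / s
        # Measurement update (eq. 12)
        innovation = y - xp0
        x0, x1 = xp0 + h0 * innovation, xp1 + h1 * innovation
        # Covariance update (eq. 13): P^+ = P^- - H d^T P^-
        p00, p01 = m00 - h0 * m00, m01 - h0 * m01
        p10, p11 = m10 - h1 * m00, m11 - h1 * m01
        # Subtract the tracked interference from the measurement
        denoised.append(y - x0)
    return denoised


# Stand-in data (assumed): a 1 Hz sine plays the role of the clean signal.
F_S, F_M, N = 360.0, 60.0, 1800
CLEAN = [math.sin(2 * math.pi * 1.0 * n / F_S) for n in range(N)]
NOISY = [c + 0.5 * math.cos(2 * math.pi * F_M * n / F_S + 0.3)
         for n, c in enumerate(CLEAN)]
DENOISED = track_and_remove_pli(NOISY, F_M, F_S)
```

The ratio `q / r` controls how narrow the resulting notch is: a smaller ratio distorts the underlying signal less but takes longer to lock onto the interference. A real ECG is broadband rather than a slow sine, so the values used here would need tuning.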

Fig. 6 Block diagram of the denoising of 50 and 60 Hz PLI from ECG using the KF.

The state space model for a dynamic system can be expressed as:

$$\begin{aligned} X_k = F_k X_{k-1} + B_k u_k + W_k \end{aligned}$$
(14)

Further, the initial filtered signal output \(X_k\) is fed to the second-phase filter to obtain a further denoised output \(Y_k\).

$$\begin{aligned} Y_k = H_k X_k + V_k \end{aligned}$$
(15)

\(Z_k\) is generated by combining output signals of \(X_k\) and \(Y_k\).

$$\begin{aligned} Z_k = X_k + Y_k \end{aligned}$$
(16)

Lastly, the final denoised signal \(Z'_k\) is generated by accumulating the denoised signal outputs \(X_k\), \(Y_k\), and \(Z_k\).

$$\begin{aligned} Z'_k = X_k + Y_k + Z_k \end{aligned}$$
(17)

The control input matrix relates the control input to the state, and similarly the observation matrix relates the measurement to the state. The previous state output is fed to the current state as an input to maintain a continuous prediction model. Unlike the standard Kalman filter state model, the modified Kalman filter uses a feedback loop (for recurrent corrections over all previous and current state prediction values) to verify the denoising of the initial signals (containing harmonics) by comparing the QRS-complex and R-R interval peaks (see equations 14 to 17). The modified Kalman filter state correction model is given in equations 18 and 19.

$$\begin{aligned} \hat{X}_k^- = F_k \hat{X}_{k-1} + B_k u_k + Z'_k \end{aligned}$$
(18)
$$\begin{aligned} P_k^- = F_k P_{k-1} F_k^T + Q_k + Z'_k \end{aligned}$$
(19)

where \(\hat{X}_k^-\) is the predicted state at time k given the observations up to step \(k-1\), \(P_k^-\) is the predicted error covariance matrix at time k given the observations up to step \(k-1\), and \(\hat{X}_{k-1}\) is the state estimate of the system at time \(k-1\). The steps involved in updating each state at run time are given by equations 20 to 22:

$$\begin{aligned} & K_k = P_k^- H_k^T \left( H_k P_k^- H_k^T + R_k \right) ^{-1} \end{aligned}$$
(20)
$$\begin{aligned} & \hat{X}_k = \hat{X}_k^- + K_k \left( Y_k - H_k \hat{X}_k^- \right) \end{aligned}$$
(21)
$$\begin{aligned} & P_k = \left( I - K_k H_k \right) P_k^- \end{aligned}$$
(22)

The modified Kalman gain matrix \(K_k\) at time k supplies the weights used to correct the predicted state. \(\hat{X}_k\) is the updated state estimate at time k, and \(P_k\) is the updated error covariance matrix.

The procedure repeatedly estimates the system's state and updates it with each prediction. The prediction stage uses the state transition model to predict the system's current state, and the update stage incorporates the measurement and the observation matrix to correct that prediction.

Current state prediction using MKF

This section describes how the proposed algorithm denoises the ECG using the Kalman filter by combining two variables in two steps.

  • Previous signal records of P, Q, R, S, and T peaks of ECG waveform (here, the current state estimate is identified from the previous state).

  • ECG peak State prediction.

The modified Kalman filter combines the current state estimation by approximating the noise level of signals. Fig. 7 explains the block description for the current state update.

Fig. 7 Block representation for Current State Update using MKF.

The steps for state prediction are as follows:

Step 1: Initialization: initialization is performed only once in this step to produce two values:

  • Initial state of System is \(\hat{x}_{0,0}\)

  • Initial Variance of the System is \(p_{0,0}\)

Step 2: Current State Prediction: this step estimates the system's current state from the following state update inputs:

  • Measured Value of the state \(z_{n}\)

  • A Measured Variance of the state \(r_{n}\)

  • The Predicted System State Estimate \(\hat{x}_{n,n-1}\) calculated prior.

  • The Predicted System State Estimate Variance \(p_{n,n-1}\) calculated prior.

The Modified Kalman filter gain is calculated from these inputs and yields the following results:

  • Estimation of the Current System State \(\hat{x}_{n,n}\)

  • Estimation of variance of the Current State \(p_{n,n}\)

According to the proposed system model, the current system state, once identified, is fed into the next system state. During each iteration, the fed-back state is treated as the predicted state, and the predicted outputs are used to compute the current state estimate with the filter's variance feedback. Fig. 8 shows the current state prediction graph with the Modified Kalman filter (MKF).
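The initialization and state update steps above can be sketched for a scalar state as follows. This is the textbook one-dimensional Kalman iteration, shown only to make the roles of \(\hat{x}_{n,n-1}\), \(p_{n,n-1}\), \(z_n\), and \(r_n\) concrete; it does not include the feedback modification of the proposed MKF. The constant-state prediction model, the process variance `q`, and all function names are assumptions.

```python
def state_update(x_pred, p_pred, z, r):
    """Combine the predicted state with a measurement z whose variance is r."""
    gain = p_pred / (p_pred + r)           # Kalman gain
    x_est = x_pred + gain * (z - x_pred)   # current state estimate x_{n,n}
    p_est = (1.0 - gain) * p_pred          # current estimate variance p_{n,n}
    return x_est, p_est, gain


def predict(x_est, p_est, q=0.0):
    """Assumed constant-state dynamics: carry the estimate forward, add variance q."""
    return x_est, p_est + q


def run_filter(measurements, r, x0, p0, q=0.0):
    """Step 1 (initialization) once, then Step 2 (state update) per measurement."""
    x_pred, p_pred = predict(x0, p0, q)
    history = []
    for z in measurements:
        x_est, p_est, _ = state_update(x_pred, p_pred, z, r)
        history.append((x_est, p_est))
        # The current estimate is fed into the next iteration as the prediction.
        x_pred, p_pred = predict(x_est, p_est, q)
    return history


# Illustrative run with assumed values.
HISTORY = run_filter([1.0, 1.0], r=1.0, x0=0.0, p0=1.0)
```

With each measurement the estimate variance shrinks, so later measurements receive a smaller gain than earlier ones.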

Fig. 8 Current state prediction with Modified Kalman Filter.

Two sets of recordings were taken from the MIT-BIH Arrhythmia database. The first set contains twenty-three recordings selected at random, and the second set contains the remaining twenty-five recordings, selected to include less common arrhythmias. The samples are digitized at 360 samples per second per channel over a 10 mV range. Each recording was annotated independently by two or more cardiologists, and their annotations were reconciled to produce computer-readable reference annotations. Approximately 110,000 annotations are included in the database. The records labeled 100, 103, and 105 are used for denoising and filtering purposes.

Observed results with 60 Hz PLI

The reconstructed ECG signal is further filtered and compared qualitatively and with the help of the output SNR. Fig. 9 illustrates the output after noise reduction.

Fig. 9 Clean ECG.

As discussed, Fig. 9 represents a clean electrocardiogram, showing the ECG signal without interference so that it is easily readable. An ECG signal with interference is a noisy signal, as shown in Fig. 10. Such signal interference affects the precise diagnosis of cardiac arrhythmia.

Further, we generated noises at various levels for testing purposes, which are discussed here.

Fig. 10 Noisy ECG.

In Fig. 10, the noisy signal is generated by combining 60 Hz PLI noise with ECG record 100 at 15 dB SNR. The signal is filtered with a notch filter, and the result is shown in Fig. 11. The output of the proposed denoising process is shown in Fig. 12.

Fig. 11 Notch Filter.

Fig. 12 Result of the proposed denoising method for 60 Hz PLI on ECG record 100 at 15 dB.

Although the notch filter performs well, the R peaks are attenuated for certain samples at low SNR, as shown in Fig. 11.

The proposed MKF filtration technique removes the 60 Hz PLI noise from the ECG while keeping the pattern of the ECG signal intact, as shown in Fig. 12. The proposed method also outperforms the notch filtering technique in preserving the diagnostic information of the ECG. A quantitative examination of the output SNR values is needed to gain deeper insight.
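The quantitative comparison relies on two small computations: scaling the interference so that the noisy input has a chosen SNR, and measuring the SNR of a processed signal against the clean reference. The sketch below illustrates both under the assumption that SNR is defined as the ratio of clean-signal power to error power in decibels. The function names and the 1 Hz sine used as a stand-in for a clean ECG are assumptions, not part of the proposed method.

```python
import math


def add_pli_at_snr(clean, f_pli, f_s, target_snr_db):
    """Add a sinusoidal interference scaled to give the requested input SNR."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_power = signal_power / (10.0 ** (target_snr_db / 10.0))
    # A sinusoid of amplitude A has average power A^2 / 2 over whole periods.
    amplitude = math.sqrt(2.0 * noise_power)
    return [v + amplitude * math.sin(2.0 * math.pi * f_pli * n / f_s)
            for n, v in enumerate(clean)]


def output_snr_db(clean, processed):
    """SNR in dB of a processed signal relative to the clean reference."""
    signal_power = sum(v * v for v in clean)
    error_power = sum((c - p) ** 2 for c, p in zip(clean, processed))
    if error_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / error_power)


# Stand-in data (assumed): one second of a 1 Hz sine sampled at 360 Hz.
F_S = 360.0
CLEAN = [math.sin(2.0 * math.pi * 1.0 * n / F_S) for n in range(360)]
NOISY = add_pli_at_snr(CLEAN, f_pli=60.0, f_s=F_S, target_snr_db=15.0)
INPUT_SNR = output_snr_db(CLEAN, NOISY)
```

Applying `output_snr_db` to the output of each filter against the same clean reference gives directly comparable figures for the notch filter and the proposed method.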

The proposed method pre-processes the ECG signals and classifies their P, Q, R, and S elements. After the denoising and pre-processing, we applied the proposed algorithm to classify the ECG waveform elements.

Proposed algorithm execution steps: self-attention-AE model

The algorithm execution steps for the classification and prediction of cardiac arrhythmia are as follows.

figure a

The Self-attention-AE autoencoder model is compared with existing algorithms in terms of precision, recall, and accuracy. We used Keras with TensorFlow as the backend to execute the proposed 'Self-attention-AE' model, and Anaconda-3 was used to carry through the proposed model. The classification results and confusion matrix graphs were generated using TensorBoard. To execute our experiment, we used an Intel Core i5-7400 CPU with an NVIDIA graphics card, 8 GB memory, and 8 GB RAM. The pre-processed dataset is split into training and testing sets by a random split technique.
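A random split of the kind mentioned above can be sketched as follows. The split ratio is not stated in the text, so the 20% test fraction, the fixed seed, and the function name `random_split` are assumptions for illustration only.

```python
import random


def random_split(items, test_fraction=0.2, seed=0):
    """Shuffle indices with a fixed seed and split items into train and test lists."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(items) * test_fraction))
    test = [items[i] for i in indices[:n_test]]
    train = [items[i] for i in indices[n_test:]]
    return train, test


# Illustrative use on ten placeholder beat identifiers.
BEATS = list(range(10))
TRAIN, TEST = random_split(BEATS, test_fraction=0.2, seed=0)
```

Fixing the seed makes the split reproducible, which helps when results for different thresholds or models have to be compared on the same partition.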

In this study, we introduce a novel self-attention-AE model comprising an encoder and a decoder. The encoder module features four 1-D LSTM convolutional layers, while the decoder module includes a single 1-D deconvolution (Deconv) layer. The network takes a noisy ECG signal and produces a denoised ECG signal through a modified Kalman filter. Feature maps are extracted using the LSTM convolutional layers, and downsampling is achieved using a max-pooling layer of size two, which suppresses noise during the encoding phase. The detailed parameters are shown in Table 1.

Table 1 Detailed Parameters of the proposed model with Adam optimizer

After the learning phase, transposed convolutional layers upsample the compressed ECG representation during the decoding phase. The softmax function is then used as the last activation function to normalize the network output into a probability distribution over the predicted output classes; it compiles the self-attention feature map to deliver the classification of ECG signals. Accordingly, the proposed algorithm evaluates the separate P, Q, R, S, and T waves for a single heartbeat. The results are depicted in Fig. 13.
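The softmax normalization mentioned above maps a vector of raw class scores to probabilities that sum to one. A minimal, numerically stable sketch is given below; the input scores are made-up values, not outputs of the proposed network.

```python
import math


def softmax(scores):
    """Convert raw class scores into a probability distribution."""
    peak = max(scores)                               # subtract the max for stability
    exps = [math.exp(v - peak) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed example scores for three classes.
PROBS = softmax([2.0, 1.0, 0.1])
```

The class with the largest probability is taken as the predicted class; subtracting the maximum score before exponentiation avoids overflow for large scores without changing the result.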

Fig. 13 Classification of ECG signal for single beat sinus.

Identifying the ’Q’ wave element, a known risky heartbeat among random heartbeats, holds life-saving potential, especially in co-morbid patients. This crucial identification aids in predicting the patient’s cardiac arrhythmia condition.

Results and analysis

The proposed system comprises two steps. In this section, we discuss the performance of the modified Kalman filter (MKF) as a preprocessing step and present the self-attention autoencoder for further classification and prediction of cardiac arrhythmia. We tested the MKF at -5 dB, 0 dB, and 5 dB; the comparison results are shown in Tables 2, 3 and 4 (here, 'NR' means values not reported by the author). The results show that denoising the ECG signal using the proposed Modified Kalman Filter with dual signal filtration performs better than the existing models.

Table 2 MKF Performance comparison at -5 dB.
Table 3 MKF Performance comparison at 0 dB.
Table 4 MKF Performance comparison at 5 dB.

The comparative analysis shows that the proposed self-attention-AE model outperforms existing system models.

We carried out training and testing for various cardiovascular disease-prone types. The model has 247,937 trainable parameters, which are updated repeatedly by the self-attention-AE to minimize reconstruction errors. The proposed self-attention-AE is trained on 625004 samples of ECG time series signals and validated on 15602 samples with an epoch size of 5. The LSTM sequential execution with the proposed self-attention-AE is shown in Fig. 14.

Fig. 14 Self-attention-AE Model.

After epoch 30, the training and validation losses of the model are compared, as shown in Fig. 15. As the loss is near zero, we can say that the predictions of the proposed system closely match the data.

Fig. 15 Training and validation model loss for proposed system.

Further, as shown in Fig. 16, the classification error recorded for different data points of the reconstruction error segments is very low, which indicates that the cardiovascular disease prediction is accurate. The proposed system uses repeated training and validation to lower the classification errors, so that the reconstruction error after self-attention-AE execution is reduced further. For standard evaluation parameters (accuracy, precision, recall, and F1-score), which are crucial in evaluating the performance of a machine learning model in predicting cardiovascular diseases, we applied the following formulae:

$$\begin{aligned} & Accuracy = \left( \frac{TP+TN}{(TP+TN+FP+FN)}\right) \end{aligned}$$
(23)
$$\begin{aligned} & Precision = \left( \frac{TP}{(TP+FP)}\right) \end{aligned}$$
(24)
$$\begin{aligned} & Recall = \left( \frac{TP}{(TP+FN)}\right) \end{aligned}$$
(25)

Where 'TP' is True Positive, 'TN' is True Negative, 'FP' is False Positive, and 'FN' is False Negative.
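Equations 23 to 25 can be evaluated directly from the four confusion-matrix counts, as in the sketch below. The F1-score is included using its standard definition as the harmonic mean of precision and recall, since no formula for it is given above. The counts are made-up values for illustration and are not the counts of Table 5; the function name is an assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall (eqs. 23-25) and the standard F1-score."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed example counts (not taken from the reported confusion matrix).
METRICS = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

The guards against zero denominators matter for classes with no predicted or no actual positives, which can occur with rare arrhythmia types.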

Fig. 16 Classification Error for the proposed system.

As shown in Fig. 17, the reconstruction errors for the different classes are very small, and the proposed model achieved an accuracy of 99.71%. The reconstruction error was minimized using the self-attention-AE repeated evaluation strategy.

Fig. 17 Reconstruction Error for the proposed system.

Further, Fig. 18 shows the recorded precision and recall for different threshold values.

Fig. 18 Recorded precision and recall values.

The confusion matrix evaluation results for the normal and break labels, based on the training and validation of the different classes, are shown in Table 5.

Table 5 Confusion Matrix for proposed Self-attention-AE.

Fig. 19 shows the confusion matrix generated by the experimental execution.

Fig. 19 Confusion Matrix Evaluation Result.

The accuracy of 99.71% indicates that the proposed system is very accurate for predicting cardiovascular diseases by classifying cardiac arrhythmia.

Further, the proposed system’s performance is compared with the existing systems. Its precision, recall, and accuracy performance are compared with the existing system results, as shown in Table 6.

Table 6 Comparative Analysis of Proposed Self-attention-AE with Existing Methods.

As per Fig. 19, the confusion matrix depicts the values given in Table 5; the proposed system achieved an accuracy of 99.71% for ECG time series signal data, specifically for the R-R peaks of ECG signals, with almost zero classification errors. The proposed system outperforms the models detailed in Table 6.

Discussion

Although the suggested approach gives useful predictions on high-quality labeled ECG data, its ability for wider use is limited. Transfer-based learning (such as the self-attention mechanism) is effective with large datasets, so performance might decrease when it is used with sparse or unbalanced data.

Such large amounts of data can be hard to gather in medical settings, where demographically varied data may not be easily accessible or may be expensive and time-consuming to collect and annotate. The feature-extraction layers of the transformer-based self-attention mechanism50,51 enhance performance, but they increase the need for processing resources such as GPUs and TPUs to handle their complexity, which makes real-time deployment on wearable health monitors or mobile devices challenging. Thus, for real-time cardiac monitoring, a trade-off between accuracy and computing efficiency is observed52,53. Demonstrating this model's efficacy across a variety of clinical populations remains difficult, despite its ability to function well in constrained experimental settings. Clinical research across a range of demographics is necessary to assess the model's generalization and robustness51. Improving the model's applicability across a variety of demographics requires expanding the variety of ECG datasets. Studies show that the efficiency of AI models for ECG signal analysis is heavily influenced by the variety of the data, as these datasets usually do not encompass a wide range of therapeutic situations and populations. Expanding the dataset's diversity will improve the model's capacity to generalize to different age groups, ethnicities, and cardiovascular conditions, all of which are commonly underrepresented in datasets from early research. This tactic is well acknowledged in the field, with recent research emphasizing data augmentation as an essential technique to improve model flexibility and accuracy for larger ECG applications54.

Effective adoption of AI-driven ECG classification algorithms by healthcare practitioners depends on their integration into clinical procedures. Although self-attention algorithms and autoencoder structures, such as the one that has been suggested, provide accuracy in the classification of arrhythmias, they necessitate an interface that is compatible with both current medical devices and Electronic Health Records (EHR). By adding updated Kalman filters to this AI model for noise reduction, the clarity of the ECG signal can be further improved, improving diagnostic precision. According to studies, integrating AI technology with clinician workflows can increase adoption rates, facilitate decision-making, and ultimately improve patient care without flooding the system with unnecessary data or warnings. Recent trials and validation studies have shown that models must be created with an easy-to-use implementation in mind in order to optimize the impact of AI in clinical ECG categorization. For example, integrating AI-based ECG diagnostics into widely used devices, such as ECG-enabled stethoscopes, has demonstrated potential for improving clinician accessibility in real-time situations. Furthermore, research indicates that incorporating AI into clinical workflows can improve screening effectiveness and speed up diagnosis, making it a potentially revolutionary tool in cardiac care with careful use56,57.

Computational optimization is a required part of real-time applications because it balances efficiency against diagnostic accuracy; it is therefore used to enhance the proposed ECG classification model. It is hard to keep resource usage within limits while using a Transformer-based self-attention mechanism for powerful feature extraction. Several studies were reviewed, and some approaches were found to be useful, such as refining the attention mechanism to reduce the parameter load and integrating low-complexity models that maintain performance levels while lowering computational demands. Research has shown that streamlining the self-attention layers by limiting the parameter load and adopting multi-layer efficiency strategies can reduce processing time without compromising accuracy. These approaches enable the model to operate effectively on resource-constrained devices such as mobile processors, which are required in real-time monitoring settings55.

Conclusion

To address the research gaps identified in the literature, we proposed a cognitive computing approach that uses the Self-attention-AE system with a novel modified Kalman filter. With it, we observed fast decisions and an accuracy of 99.7% for the prediction of ECG signals, along with a precision of 99.91% and a recall of 99.86%. The self-attention-AE neural network gave the best classification results with minimal classification errors. Repeated execution of training and validation lowered the reconstruction errors, making arrhythmia prediction possible even from the QRS complex and R-R peak evolution of a single heartbeat. In healthcare, it is important to analyze anomalies quickly and accurately to decide the line of action for the patient's treatment. The proposed research outperformed the latest existing systems in the comparative analysis. Future development can extend the proposed system toward denoising the ECG signals repeatedly through repeated reconstruction of the ECG signals.

The paper concludes by presenting the suggested self-attention autoencoder algorithm as a very promising technique for using ECG signal analysis to classify cardiac arrhythmias. Clinical diagnostic tools have advanced significantly as a result of its high accuracy and potential for automating ECG interpretation. Deep learning self-attention mechanisms, especially those involving masked autoencoders, have demonstrated efficacy in enhancing model accuracy and feature extraction. These mechanisms are crucial for ECG interpretation, particularly when it comes to identifying complex arrhythmias and minimizing reliance on human expertise in clinical settings. This is consistent with new studies showing that self-attention mechanisms can identify temporal relationships in ECG signals, enhancing categorization and lowering the requirement for manual intervention in cardiac diagnostics. Notwithstanding its successes, additional testing to verify the model’s generalizability across various patient populations and real-world situations will be necessary before it can be used in therapeutic settings. Research keeps highlighting how crucial it is to thoroughly evaluate these algorithms, pointing out that sophisticated deep learning techniques like transformers and autoencoders need sizable, varied datasets in order to guarantee reliable findings across a range of clinical settings .