Introduction

Insulin is a peptide hormone produced by the beta cells of the pancreas. It plays a crucial role in regulating glucose transport and facilitating glucose absorption by cells from the bloodstream for energy production. Any disruption in this process leads to a medical condition called diabetes mellitus (DM), more commonly referred to as diabetes. Globally, this chronic disease affects nearly 537 million people, with an additional 374 million classified as prediabetic and at similar risk of developing complications1. Diabetes can lead to both immediate and long-term health issues, including blindness, kidney failure, amputations, heart disease, seizures and, in severe cases, even death2,3,4. Diabetes is broadly categorized into type 1 and type 2. In type 2 diabetes, cells become resistant to insulin, hindering their ability to absorb glucose. Type 2 diabetes constitutes 95% of the diabetes population and is typically prevalent in older age groups. Type 1 diabetes, on the other hand, arises when the pancreas fails to produce sufficient insulin. It accounts for about 5% of the diabetes population, is mainly diagnosed between the ages of 0 and 20, and requires much more rigorous management to avoid imminent consequences. The overarching goal of diabetes management is to maintain euglycemia (normal glucose levels), or equivalently to increase the time spent within the optimal range (time-in-range, TIR)2. Insulin therapy is the mainstay of diabetes management, requiring a delicate balance between short-term and long-term objectives5,6. While diabetes cannot be cured, proper insulin management allows patients to maintain optimal blood glucose levels and minimize the risk of further complications7.

The most widely used method for glucose monitoring is the fingerstick technique, where a small blood sample is drawn by pricking the finger and analyzed with a glucometer8,9. This was revolutionized by continuous glucose monitoring (CGM) devices, which offer real-time glucose monitoring with automated readings. CGM has demonstrably increased the time patients spend in TIR, a crucial metric in diabetes care10,11. CGM devices have been shown to enhance glycemic control by reducing both hypoglycemia and hyperglycemia excursions through real-time predictions12,13. While previous research for type 1 diabetes has primarily focused on predicting hypoglycemia events10,14,15,16,17,18,19, achieving glycemic goals requires maximizing glucose readings in TIR by avoiding both hypoglycemia and hyperglycemia.

Efforts to develop machine learning (ML) models for glucose prediction using historical CGM readings have shown reasonable efficacy. These models typically rely on extensive training data, often gathered through large-scale clinical research studies or healthcare providers. The data is commonly stored on cloud-based servers, which poses significant privacy risks, including data theft20. Given these escalating concerns around data theft, user security, and privacy, many countries have recently enacted data-protection regulations to protect the interests of their citizens21,22,23. An alternative strategy is to develop individual models with personal data to enhance privacy, but this comes with the drawback of reduced prediction performance due to limited data, since ML and deep learning models require substantial data for optimal performance.

Thus, to overcome the challenges in glycemic excursion prediction and privacy concerns, we propose a collaborative learning framework that addresses these concerns simultaneously. The major contributions of this paper are as follows:

  • A novel HH loss function for improved prediction performance in the glycemic excursion regions.

  • FedGlu, a personalized machine learning model trained in a federated learning framework, simultaneously improving model performance and preserving data privacy.

The remainder of the paper is organized as follows: Section II reviews the current literature and highlights relevant work. Section III describes the dataset in detail and explains the methods used in this paper, including the HH loss function and federated learning, along with the evaluation metrics and experimental setups. Section IV compares the results of the different experiments with the HH loss function, and contrasts central, local, and personalized federated models. Section V discusses the results, limitations, and future work. Lastly, Section VI concludes the paper.

State of the Art

In this section, we provide a brief overview of the existing literature on glucose prediction. While a comprehensive review is beyond the scope of this paper, we emphasize works that predict glucose values with deep learning approaches, apply federated learning to healthcare applications, and extend a global federated model through fine-tuning for personalization.

CGM-based glucose prediction

The first attempt to predict future glucose levels using past values dates back to 1999 by Bremer and Gough24. Since then, researchers have continually advanced the literature to develop powerful and highly accurate models. Machine learning models for glucose prediction fall into two categories: (a) classification tasks and (b) regression tasks. Consistent with the theme of our paper, we focus on regression tasks. Earlier approaches utilized conventional machine learning methods such as linear regression (LR), support vector regression (SVR), random forests (RF), and boosting algorithms25,26,27,28, as well as time-series forecasting methods like autoregressive integrated moving average (ARIMA)29,30, for predicting glucose levels. In recent years, there has been a shift towards deep learning (DL) models, leveraging automated feature learning, robust pattern recognition, and abstraction through multiple layers.

Various DL architectures, including multi-layer perceptrons (MLP)31,32, convolutional neural networks (CNN) and convolutional recurrent neural networks (CRNN)33,34, recurrent neural networks (RNN)35,36, long short-term memory networks (LSTM)37,38,39, dilated RNNs40,41 and bi-directional LSTMs42,43,44, have been proposed for glucose prediction.

Shuvo et al.45 introduced a deep multi-task learning approach using stacked LSTMs to predict personalized glucose concentration. The proposed approach includes a combination of stacked LSTMs to learn generalized features across patients, clustered hidden layers for phenotypical variability in the data and subject-specific hidden layers for optimally fine-tuning models for individual patient improvement. The authors demonstrate superior results compared to state-of-the-art ML and DL approaches on the OhioT1DM dataset.

A transformer based on an attention mechanism was recently proposed to forecast glucose levels and hypoglycemia and hyperglycemia events46. The proposed transformer network includes an encoder network to perform the regression and classification tasks under a unified framework, and a data augmentation step using a generative adversarial network (GAN) to compensate for the rare events of hypoglycemia and hyperglycemia. Results were demonstrated on two datasets, one including type 1 diabetes patients and the other on type 2 diabetes patients.

In pursuit of enhanced prediction performance, researchers have often grappled with the constraint of limited data for individual subjects or patients. A recently proposed approach addresses this challenge through multitask learning for advancing personalized glucose prediction47. This approach was evaluated against sequential transfer learning, revealing two key findings: (a) individual patient data alone may not suffice for training DL models, and (b) a thoughtful strategy is crucial to leverage population data for improved individual models. This study was also based on the OhioT1DM dataset.

While the literature commonly employs standard regression metrics like root-mean-squared error (RMSE) and mean absolute error (MAE), along with Clarke’s Error Grid Analysis (EGA) for clinical context, an often-overlooked aspect is the behavior of these metrics in the glycemic excursion regions. For instance, RMSE can be evaluated separately over the hypoglycemia and hyperglycemia ranges, and Clarke’s EGA can be applied to glucose readings falling in Zones C, D, and E. Mu et al.48 introduced a normalized mean-squared error (NMSE) loss function, demonstrating a substantial reduction in RMSE for the hypoglycemia range. However, comprehensive details regarding its superiority over MSE in detecting hypoglycemia are lacking, and its performance in the hyperglycemia range is not addressed.

Data imbalance

There is a significant imbalance in glucose data distribution across the hypoglycemia, hyperglycemia, and normal ranges. Typically, only 2–10% of glucose readings fall into the hypoglycemia range, while about 30–40% fall in the hyperglycemia range. From a statistical perspective, this presents a classic case of an imbalanced regression problem. While numerous approaches have been developed to tackle imbalanced data in the classification setting, very few works in the literature address imbalanced regression problems such as ours49. Existing approaches for imbalanced regression can be broadly categorized into two types:

Sampling-based approaches

These methods attempt to either undersample the high-frequency values or oversample the low-frequency (rare) values. However, determining the ‘notion of rarity’ in a regression problem is challenging compared to a classification task. Oversampling may lead to overfitting, whereas undersampling may result in sub-optimal performance due to the loss of key information. Chawla et al.50 proposed an approach that generates synthetic samples by combining oversampling and undersampling of the training data for classification tasks.

Cost-sensitive approaches

These approaches introduce a penalizing scheme during training that enables the model to handle outlier values (low-frequency or rare values), enhancing its effectiveness in predicting within those ranges. The recent success of this approach51,52,53 motivates us to explore it further through our customized loss function.

Federated learning for healthcare

Federated learning (FL) has substantial disruptive potential in healthcare, a domain constrained by sensitive data and strict regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the US. The reluctance of healthcare entities to share sensitive data has fueled the adoption of FL in various healthcare applications, such as medical image processing54,55,56, IoT-based smart healthcare applications57, managing electronic health records (EHR)58, disease prediction59,60,61, predicting hospitalizations and mortality62,63,64, and natural language processing of clinical notes65,66,67. Several researchers have comprehensively reviewed federated learning in healthcare68,69,70,71.

In the diabetes literature, FL has recently gained traction. A recent study proposed a deep learning approach in the Diabetes Management Control System (DMCS)72, using 30 virtual subjects from the FDA-approved UVA/Padova type 1 diabetes simulator. Features such as past glucose values, carbohydrate intake, and insulin-on-board were used as inputs to the model. The findings clearly demonstrate the superior performance of the federated model over the local models.

Another study employed an FL-inspired Evolutionary Algorithm (EA) for classifying glucose values into different risk categories73 using data from 12 patients in the OhioT1DM dataset. The input features include previous CGM readings, carbohydrates, and insulin data. The results indicate improved performance over local models using an FL-based EA. However, the study’s limitations lie in its small sample size of only 12 patients and in not addressing real-time glucose-related risk prediction and its clinical significance.

A decentralized privacy-protected federated learning approach was applied to predict diabetes-related complications using patient-related comorbid features extracted from International Classification of Disease (ICD) codes in real-world clinical datasets74. For this, a logistic regression model and 2-layer and 3-layer multilayer perceptron models were proposed and compared against a centralized (population-level) model. The study addressed class imbalance using techniques like under-sampling, oversampling, and balancing. The results indicate that models developed through the federated learning framework can achieve promising performance comparable to centralized models.

Personalized federated learning

Federated learning has many unique challenges75,76,77, with one prominent issue being the variation in data distribution across clients in the network, characteristic of non-i.i.d. and imbalanced data78. This is particularly significant in healthcare applications, where individual patients possess diverse demographics and health histories, necessitating adaptive solutions tailored to each participant79.

One of the pioneering works in federated learning with wearable healthcare data is FedHealth80, which introduces personalization through transfer learning. This is achieved by first training a conventional global model in a federated learning framework and later fine-tuning two fully connected layers (through transfer learning) to learn activities and tasks for specific users. The study utilizes publicly available human activity recognition data (accelerometry and gyroscope) from 30 users for multi-class activity prediction. The authors employ a CNN-based deep learning model, comparing its performance against traditional machine learning methods like RF, SVM, and KNN. The results demonstrate a 4% average improvement in performance for personalized models compared to the global model.

Another noteworthy application of personalized federated learning was used for in-home health monitoring81. The authors introduce FedHome, a cloud-edge based federated learning framework, where a shared global model is initially learned from multiple network participants. Individual personalization is achieved using a generative convolutional autoencoder (GCAE), aiming to generate a class-balanced dataset tailored to individual client’s data. FedHome exhibits a notable improvement of over 10% in accuracy compared to a conventional global federated learning model.

A recent study on remote patient monitoring (RPM) introduced the FedStack architecture, a personalized federated learning approach79. The study was based on the MHEALTH dataset with 10 patients82,83. A total of 21 features are extracted from three sensor data types (accelerometry, gyroscope, and magnetometer) to classify 12 different natural activities. Three different model architectures (ANN, CNN, Bi-LSTM) are used in the study. FedStack achieves personalization by aggregating heterogeneous architectural models at the client level, and the approach consistently outperforms both local and global models.

A recent work in personalized federated learning focused on in-hospital mortality prediction62. The study utilized a publicly available electronic health records (EHR) database comprising over 200,000 patients across 208 hospitals in the US. Features extracted from patients’ EHRs were employed for a binary classification problem, with a multi-layer perceptron (MLP) adopted for modeling. The proposed POLA method involves the initial training of a global federated learning model, referred to as the teacher model. In the subsequent step, local adaptation is accomplished through a Genetic Algorithm (GA) approach. Comparative results highlight the superior performance of the POLA approach against the traditional FedAvg84 and two other state-of-the-art personalized federated learning architectures85,86.

For CGM-based glucose prediction in T1D populations, a recent paper proposes an asynchronous and decentralized federated learning approach in which future glucose trajectories are predicted from historical CGM values87. The proposed work also provides a personalized model from which individual patients can benefit, and it is tested on multiple publicly available datasets of adult T1D patients. Another recent approach focused on privacy-preserving glycemic management for T1D uses a federated reinforcement learning framework, where the goal was to maximize the time-in-range (TIR) and reduce glycemic risk scores88. That study was conducted on data generated for 30 individuals using the UVA/Padova T1D simulator. Federated learning holds great potential for overcoming privacy challenges and delivering personalized predictions that can benefit individuals with T1D.

Methods and materials

This section describes the datasets used in this study, data processing steps and the experimental setup. Further, we describe the prediction model, and the different frameworks used in this study.

Clinical datasets and preprocessing

Ohio T1DM

The Ohio T1DM89 dataset was publicly released in two batches (2016 and 2018) with a total of 12 participants. During the 8-week study period, all 12 participants wore a Medtronic Enlite CGM for collecting glucose readings, a Medtronic 530G/630G insulin pump, and an Empatica/Basis sensor for physiological data collection. For our analysis, we only focus on the CGM data. The patient data was collected in free-living conditions.

TCH study

This is a proprietary dataset collected at Texas Children’s Hospital, Houston, TX, from a total of 113 T1D patients using Dexcom CGM devices. The data for each patient spans a period of 30–90 days. Additional information about this dataset is available in our previous publications25. Similar to the OhioT1DM dataset, the data for all patients was collected under free-living conditions.

Comprehensive details about these two datasets are provided in Table 1. Fig. 1 illustrates glycemic excursions across all patients, specifically hypoglycemia and hyperglycemia profiles. The dotted grey lines on the x-axis and y-axis represent the median hypoglycemia and hyperglycemia percentages, respectively. It is evident that the prevalence of hyperglycemia (mean $\bar{x}=41.7\%$, median $\tilde{x}=22.7\%$) is considerably higher than that of hypoglycemia ($\bar{x}=2.3\%$, $\tilde{x}=1.7\%$).

Data preprocessing

Based on the available literature90, data preprocessing involves two main steps: first, replacing ‘Low’ and ‘High’ glucose readings with 40 mg/dL and 400 mg/dL, respectively; second, imputing missing glucose readings through linear interpolation when fewer than six consecutive readings are missing. This threshold is chosen based on the literature on autocorrelation between glucose readings27,91. After interpolating missing values, consecutive sequences of glucose readings covering the last two hours (24 readings) are taken as individual samples. This time window is selected based on our previous work on glucose prediction12. Any samples with missing values at this stage are excluded from our analysis. All preprocessing steps follow standard practices in the literature for analyzing CGM data92 (a minimal sketch of these steps is given below). For our analysis, we compare three different model types/frameworks (Fig. 3), which differ in the data used for training and in the training process, as described in the following sections.
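The following sketch illustrates the two preprocessing steps described above, assuming a regular 5-minute CGM sampling grid; the function and variable names are ours, not from the original studies.

```python
import numpy as np
import pandas as pd

def preprocess_cgm(raw: pd.Series, max_gap: int = 6) -> pd.Series:
    """Sketch of the preprocessing: map 'Low'/'High' sensor codes to
    40/400 mg/dL, then linearly interpolate gaps of fewer than
    `max_gap` consecutive missing readings (longer gaps stay NaN)."""
    glucose = pd.to_numeric(raw.replace({"Low": 40, "High": 400}),
                            errors="coerce")
    filled = glucose.interpolate(method="linear", limit_area="inside")
    # Identify runs of missing readings and undo the fill for long gaps.
    run_id = glucose.isna().ne(glucose.isna().shift()).cumsum()
    run_len = glucose.isna().groupby(run_id).transform("sum")
    filled[glucose.isna() & (run_len >= max_gap)] = np.nan
    return filled
```

Samples whose windows still contain NaNs after this step are excluded, as noted above.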

Table 1 Datasets used in the study.
Fig. 1
figure 1

Glucose profile for patients in TCH and OhioT1DM dataset.

HH loss

Our custom loss function is derived from the principle of the Taguchi loss function93,94 and is motivated by two primary needs: (a) enhancing penalties for errors in the glycemic excursion regions, and (b) simultaneously balancing these penalties to account for the uneven distribution of samples across the hypoglycemia, normal, and hyperglycemia glucose ranges. For the first goal, we add a polynomially increasing penalty that grows as glucose readings deviate further from the normal range. To achieve the second objective, i.e., to account for the disproportionate sample distribution across the three ranges, we introduce a tuning parameter $\alpha$. This parameter ensures that, while errors in the glycemic excursion regions are reduced, the predictive performance on overall glucose readings is not compromised.

$$HH\:Loss=\begin{cases}SE, & 70\le Y_i\le 180\\ SE+\alpha\cdot penalty, & Y_i<70\\ SE+(1-\alpha)\cdot penalty, & Y_i>180\end{cases}$$
(1)
$$\begin{aligned}Squared\:Error\:(SE)&={\left(Y_i-\hat{Y_i}\right)}^2,\\ penalty&=\left|Y_i-\hat{Y_i}\right|\cdot{\left(Y_i-c\right)}^2,\\ \alpha&\in(0,1),\\ c&=125\:\text{(midpoint of the glycemic excursion boundaries)}\end{aligned}$$

As outlined in Eq. (1), for values falling within the hypoglycemia and hyperglycemia ranges, a penalty proportional to the squared distance between the glucose value and the midpoint of the glycemic excursion boundaries is applied. We also provide a visual intuition behind the HH loss function in Fig. 2. There is no additional penalty for errors in the normoglycemia range, but for different values of $\alpha$ there is a significant additional penalty in the hypoglycemia and hyperglycemia regions, forcing the model to perform better in these critical excursion regions. The value of $\alpha$ is tunable to adjust for varying data distributions. For a more detailed derivation, please refer to the Appendix.

Fig. 2
figure 2

Intuitive explanation of the HH loss function and additional penalty imposed on errors in the glycemic excursion regions.
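As an illustration, Eq. (1) translates directly into a TensorFlow loss; this is a minimal sketch based solely on the equation above, and the function and argument names are ours.

```python
import tensorflow as tf

def hh_loss(alpha: float = 0.5, low: float = 70.0, high: float = 180.0,
            c: float = 125.0):
    """Sketch of the HH loss in Eq. (1): squared error everywhere, plus an
    alpha-weighted penalty in the hypo-/hyperglycemia ranges."""
    def loss(y_true, y_pred):
        se = tf.square(y_true - y_pred)                    # squared error term
        penalty = tf.abs(y_true - y_pred) * tf.square(y_true - c)
        hypo = tf.cast(y_true < low, y_true.dtype)         # indicator: Y_i < 70
        hyper = tf.cast(y_true > high, y_true.dtype)       # indicator: Y_i > 180
        return tf.reduce_mean(se + alpha * hypo * penalty
                              + (1.0 - alpha) * hyper * penalty)
    return loss
```

The returned closure can be passed to `model.compile(loss=hh_loss(alpha))` like any Keras loss.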

Federated learning

Federated learning is a distributed learning framework that enables the training of machine learning models without transmitting sensitive user data to a central server. It relies on collaborative learning between participating users, where each user trains a shared model locally on its own data through multiple rounds of optimization. Only model characteristics such as parameters, weights, or gradients are shared among users in the network. Depending on the FL network topology, this sharing can occur directly among the participating users or through a central server. After receiving and aggregating the shared model characteristics from all users (e.g., via simple or weighted averaging), the aggregated information is relayed back to the users for further optimization on their individual local data. This iterative communication continues through multiple rounds until a stopping criterion (e.g., convergence) is met. All user data remains stored locally, and only specific model characteristics are shared during this collaborative learning process. This approach reduces network and communication costs, as model parameters are much smaller than the actual training data; provides increased accessibility and comparable accuracy to conventional machine learning models; and ensures a degree of privacy. The decentralized nature of federated learning allows patients and clinicians to benefit from more accurate and reliable models trained across a more extensive and diverse data pool.

A general formulation of federated learning can be expressed as follows:

$$\begin{aligned}&\min_{\phi}\:L(X_k;\phi),\\ &L=\text{global loss, aggregated from the local losses }\{L_k\}_{k=1}^{K},\\ &X_k=\text{private glucose data of patient }k\end{aligned}$$
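To make the aggregation step concrete, here is a minimal sketch of the sample-weighted averaging rule used in FedAvg84; the function name and signature are ours.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of per-client model weights, proportional to each
    client's number of training samples (the FedAvg aggregation rule)."""
    total = float(sum(client_sizes))
    aggregated = [np.zeros_like(w) for w in client_weights[0]]
    for weights, n_samples in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            aggregated[i] += (n_samples / total) * w
    return aggregated
```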

Glucose prediction specific problem formulation

In general, the input to a data-driven algorithm for predicting future glucose levels primarily consists of historical glucose readings ($g$) and, if available, other features (physiological data, insulin, carbohydrate intake, etc.). In our case, we predict future glucose readings based solely on historical glucose readings. More specifically, we learn patterns from glucose readings observed in the past two hours to predict glucose readings 30 minutes into the future, a horizon that gives individuals time to intervene and prevent glycemic excursions. Fig. 3 provides a graphical representation of the different learning paradigms that we use in our analysis.
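Assuming 5-minute CGM sampling, the sample construction described above (24 past readings as input, the reading 30 minutes ahead as target) can be sketched as follows; names are ours.

```python
import numpy as np

def make_samples(glucose: np.ndarray, window: int = 24, horizon: int = 6):
    """Slide over a patient's CGM trace: each sample is the previous
    `window` readings (2 h at 5-min intervals); the target is the reading
    `horizon` steps (30 min) ahead. Windows containing NaNs are dropped."""
    X, y = [], []
    for t in range(window, len(glucose) - horizon + 1):
        past, future = glucose[t - window:t], glucose[t + horizon - 1]
        if not (np.isnan(past).any() or np.isnan(future)):
            X.append(past)
            y.append(future)
    return np.array(X), np.array(y)
```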

Fig. 3
figure 3

Visual representation of the different model types/frameworks in this work.

Local model

Data is stored locally with the patient and not shared with other entities, such as a central server or other patients in the network. While this approach ensures full data privacy, the data available for training the model is limited to the individual patient, risking suboptimal performance. As shown in Fig. 3, for local models, all data storage and model training occur locally, and a central server plays no role.

Central model

Unlike the local model, in this setting patient data from all individuals is shared with a central server, where a single joint model is trained on data from all individuals in the network. While the central model benefits from a large pool of training data, it offers minimal privacy to individual patients, as their entire data corpus is shared with the central server. As shown in Fig. 3, all data storage and model training occur at the central server.

Federated (global) model

The federated model (sometimes also called the global model) combines the advantages of the local and central models, bringing together the robust prediction capabilities of the central model and the data privacy of the local models. In this approach, data storage and model training occur locally on the patient’s device. During training, each patient optimizes the model parameters locally on their own data, and the model weights/parameters are shared periodically with a central server for aggregation. These aggregated weights are then returned to individual patients and serve as initializers for further training. The actual data is never shared with other patients in the network or with a global server, maintaining privacy. This is known as model development in a federated learning framework84. Models in this framework achieve better prediction performance through shared learning with other patients while preserving privacy. However, global models may not always be the most suitable for every entity (here, patient) in the network. To address this, we extend the global model via a fine-tuning step, personalizing it for each individual patient with the custom HH loss function.

FedGlu model

FedGlu is an extension of the federated model that personalizes the model for individual patients. A simple workflow is presented in Fig. 4, and the holistic mechanism of the FedGlu algorithm is presented in Fig. 3. The central server initializes the global model with random weights and broadcasts it to each participating patient in the network. Each patient trains this globally shared model on their own data using the MSE loss function and, once trained for a fixed number of epochs, sends the trained weights to the central server for aggregation. The central server thus receives updated model weights from each patient, trained on that patient’s own local data; this constitutes one communication round of the federated learning step. The server aggregates the model weights received from all patients, checks whether convergence is reached, and broadcasts the updated global model back to the patients for further training. This process continues until the global model converges, after which the central server sends the final global model to all participating patients. Each patient then fine-tunes this final global model one more time, now using the HH loss function. This fine-tuning step is asynchronous and achieves the desired personalization, leveraging data across all patients to train a robust model while adjusting for each patient’s unique local data distribution.

Fig. 4
figure 4

Flowchart steps involved in executing FedGlu.
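Putting these pieces together, the two-phase FedGlu workflow can be sketched end-to-end as follows. This is a minimal illustration, not the authors’ exact implementation: it reuses the `hh_loss` and `fedavg_aggregate` sketches above, the MLP mirrors the architecture described in the next subsection, and the number of local epochs per round and fine-tuning epochs are our assumptions.

```python
import tensorflow as tf

def make_model():
    # MLP mirroring the 'Model architecture' subsection below.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(24,)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

def fedglu(client_data, rounds=50, alpha=0.5, fine_tune_epochs=5):
    """Phase 1: federated rounds with the MSE loss (FedAvg aggregation).
    Phase 2: asynchronous per-patient fine-tuning with the HH loss."""
    global_model = make_model()
    for _ in range(rounds):
        updates, sizes = [], []
        for X, y in client_data:          # one communication round
            local = make_model()
            local.set_weights(global_model.get_weights())
            local.compile(optimizer="adam", loss="mse")
            local.fit(X, y, epochs=1, batch_size=500, verbose=0)
            updates.append(local.get_weights())
            sizes.append(len(X))
        global_model.set_weights(fedavg_aggregate(updates, sizes))
    personalized = []
    for X, y in client_data:              # fine-tune on local data only
        model = make_model()
        model.set_weights(global_model.get_weights())
        model.compile(optimizer="adam", loss=hh_loss(alpha))
        model.fit(X, y, epochs=fine_tune_epochs, batch_size=500, verbose=0)
        personalized.append(model)
    return personalized
```

Note that with a server-side SGD learning rate of 1, as used in our experiments below, the server update reduces to exactly this direct weighted averaging of client weights.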

Model architecture

A multilayer perceptron model is used for the analysis in this study. The input layer consists of the glucose readings observed in the last 2 h. This input, with dimensions $24\times 1$, is fed into a dense layer of 512 neurons with a rectified linear unit (ReLU) activation function, followed by two hidden layers of 256 and 64 neurons, both with ReLU activations. The final output layer takes its input from the third dense layer and produces a single-number prediction of the future glucose reading.

Validation approach

In line with our prior research12,14, we implement a 5-fold validation strategy with temporally partitioned splits. This strategy guarantees that the training and testing splits are derived from non-overlapping time windows, mitigating potential biases from temporal correlations that could lead to overly optimistic results. This methodology aligns with the BGLP Challenge, where the initial days of data are used for training and the subsequent days constitute the hold-out test set; in our work, we extend the same scheme across multiple splits to make it more robust.
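One way to realize such temporally partitioned splits is scikit-learn’s `TimeSeriesSplit`, which always places the test window after the training window; this is a suggested reproduction of the scheme, not necessarily our exact implementation.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.random.rand(1000, 24)   # placeholder: chronologically ordered samples
y = np.random.rand(1000)       # placeholder targets

tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    # Training indices always precede test indices in time, so no
    # future information leaks into the training window.
    X_train, y_train = X[train_idx], y[train_idx]
    X_test, y_test = X[test_idx], y[test_idx]
```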

Evaluation metrics

We illustrate the efficiency of our approach using standard metrics in glucose prediction: root mean squared error (RMSE), along with Clarke’s Error Grid (Fig. 5)95 to evaluate the clinical significance of the predictions. RMSE is defined as:

$$\begin{aligned}RMSE&=\sqrt{\frac{\sum_{i=1}^{N}{\left(y_i-\hat{y_i}\right)}^2}{N}},\\ N&=\text{number of data points},\\ y_i&=\text{true value},\\ \hat{y_i}&=\text{predicted value}\end{aligned}$$
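Since results are later reported separately for the hypoglycemia, hyperglycemia, and overall ranges, a small helper along the following lines computes the range-wise RMSE; the range boundaries (70/180 mg/dL) follow the definitions used throughout the paper, while the function itself is our sketch.

```python
import numpy as np

def rangewise_rmse(y_true, y_pred, low=70.0, high=180.0):
    """RMSE over the hypo-, normo-, hyperglycemia and overall ranges,
    with range membership determined by the true glucose value."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    masks = {
        "hypo": y_true < low,
        "normal": (y_true >= low) & (y_true <= high),
        "hyper": y_true > high,
        "overall": np.ones_like(y_true, dtype=bool),
    }
    return {name: float(np.sqrt(np.mean((y_true[m] - y_pred[m]) ** 2)))
            for name, m in masks.items() if m.any()}
```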

Clarke’s Error Grid Analysis (CEGA) is illustrated in Fig. 5, and the definitions of each zone are provided in Table 2.
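For reference, one commonly used parameterization of the CEGA zone boundaries is sketched below, following widely circulated re-implementations of Clarke et al.95; the exact boundary conditions are our assumption from those implementations and should be verified against the original publication.

```python
def cega_zone(ref: float, pred: float) -> str:
    """Assign a (reference, predicted) glucose pair in mg/dL to a Clarke
    Error Grid zone, using one common boundary parameterization."""
    if (ref <= 70 and pred <= 70) or abs(ref - pred) <= 0.2 * ref:
        return "A"  # clinically accurate
    if (ref >= 180 and pred <= 70) or (ref <= 70 and pred >= 180):
        return "E"  # erroneous: opposite treatment
    if (70 <= ref <= 290 and pred >= ref + 110) or \
       (130 <= ref <= 180 and pred <= (7 / 5) * ref - 182):
        return "C"  # overcorrection
    if (ref >= 240 and 70 <= pred <= 180) or \
       (ref <= 175 / 3 and 70 <= pred <= 180) or \
       (175 / 3 <= ref <= 70 and pred >= (6 / 5) * ref):
        return "D"  # failure to detect an excursion
    return "B"      # benign errors
```

The share of points classified as ‘D’ or ‘E’ corresponds to the missed excursions reported in the Results.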

Table 2 Clarke’s Error Grid Analysis (CEGA) zone-wise description.
Fig. 5
figure 5

Reference Clarke’s Error Grid.

Experimental setup

To provide robust estimates of our methodology and support the enlisted contributions with results, we consider the following experiments:

  1. Setup 1 – Advantage of the HH loss function: The performance of the HH loss function is compared against the standard mean squared error (MSE) loss. We demonstrate the advantage in terms of standard regression metrics and the clinical significance of glucose predictions. We compare the HH loss function results across two separate datasets, namely the TCH study data and the OhioT1DM dataset, using a central (population-level) model to obtain the results.

  2. Setup 2 – Usefulness of personalized federated models: We compare performance across the different model types: central model, local model, global federated model, and personalized federated model. The comparison is made independently across both datasets. This setup demonstrates the robustness of ML models trained in a federated learning framework against the central and local models.

Model training/hyperparameter tuning

All models at the central, local, and federated (global and personalized) levels share identical architecture and parameters. The learning rate was fixed at 0.001 with a constant batch size of 500. Training was set to a maximum of 50 epochs, with early stopping at a patience of 10 epochs to prevent redundant training and thereby reduce computational time. The global federated model is trained with TensorFlow Federated. The client optimizer was set to ‘Adam’ with a learning rate of 0.001 and the server optimizer to ‘SGD’ with a learning rate of 1 to mimic the baseline federated model84. The number of communication rounds was set to 50, and a simple weighted average, proportional to the number of samples held by each patient (node) in the network, was used as the aggregation function. The global federated model is saved after the training loss (MSE) converges. Individual patients (nodes) then fine-tune this saved global federated model on their individual local data with an ‘Adam’ optimizer and a learning rate of 0.001; this fine-tuning step, however, uses the custom HH loss function. For comparing results, the local-level and (personalized) federated-level models, each with a specific $\alpha$, are selected based on the training data where the combined RMSE (across hypoglycemia and hyperglycemia) is minimal. At the central level, a single $\alpha$, which showed the maximum combined reduction across all patients, was used.
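As an illustration of this selection procedure, the sketch below sweeps candidate $\alpha$ values with the stated hyperparameters (learning rate 0.001, batch size 500, up to 50 epochs, patience 10) and keeps the $\alpha$ minimizing the combined excursion RMSE; it reuses `make_model`, `hh_loss`, and `rangewise_rmse` from earlier, and the candidate grid is our assumption.

```python
import tensorflow as tf

def select_alpha(X_train, y_train, X_val, y_val,
                 alphas=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Pick the alpha whose model minimizes the combined RMSE over the
    hypoglycemia and hyperglycemia ranges on held-out training-side data."""
    early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                                  patience=10,
                                                  restore_best_weights=True)
    best_alpha, best_score = None, float("inf")
    for alpha in alphas:
        model = make_model()
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                      loss=hh_loss(alpha))
        model.fit(X_train, y_train, validation_data=(X_val, y_val),
                  batch_size=500, epochs=50, callbacks=[early_stop], verbose=0)
        rmse = rangewise_rmse(y_val, model.predict(X_val, verbose=0).ravel())
        score = rmse.get("hypo", 0.0) + rmse.get("hyper", 0.0)
        if score < best_score:
            best_alpha, best_score = alpha, score
    return best_alpha
```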

Results

This section presents the results of the experimental setups described above: the advantage of the HH loss function, and the performance of models trained in a federated learning framework compared to central and local models.

Advantage of HH loss function

In the first analysis, we assess how the HH loss function enhances prediction performance in the glycemic excursion regions while maintaining clinical significance for overall predictions, based on two independent datasets. For this, we consider a centralized (population-level) model development setting. Since a common model is trained across all patients in a dataset, we choose a single ‘α’ value ($\alpha=1$) that showed the maximum combined improvement (reduction in RMSE) across the hypoglycemia and hyperglycemia ranges on the training dataset.

TCH study dataset

Fig. 6 (left) gives a performance comparison between the HH loss and the baseline mean squared error (MSE) loss. The HH loss function exhibits a 52% reduction in root-mean-squared error (RMSE) in the hypoglycemia region compared to MSE, while showing similar performance in the hyperglycemia region. Although there is a slight dip in overall prediction performance (in terms of RMSE values), the clinical impact evaluated through Clarke’s EGA is negligible and, on the contrary, better with the HH loss than with MSE.

In our evaluation using Clarke’s Error Grid (Table 3) to gauge the clinical relevance of the predictions, the HH loss exhibited superior performance, failing to detect an average of only 0.51% of excursions, compared to 2.07% with MSE – a substantial 75% reduction. This refers to Zones D + E of CEGA; predictions falling in this region are analogous to false negatives in a confusion matrix for detecting hypoglycemia and hyperglycemia. Excursion detection is critically important, as it allows patients to take intervention measures based on the prediction. Additionally, overall glucose predictions within the combined Zones A + B improved from 97.94% to 99.39%. These findings underscore that, while the HH loss introduces a bias towards enhancing accuracy in the excursion regions (increasing predictions in Zone C), it does not detrimentally affect the clinical accuracy of overall glucose predictions.

Fig. 6
figure 6

Performance comparison: MSE vs. HH vs. NMSE for (left) TCH study data and (right) OhioT1DM dataset.

Table 3 Summary of Clarke’s Error Grid Analysis (CEGA) performance for different loss functions.

Ohio T1DM dataset

For the OhioT1DM dataset, Fig. 6 (right) shows a similar performance comparison: the HH loss exhibits an RMSE of 15.12 in the hypoglycemia region, 41% lower than with MSE, whereas in the hyperglycemia region the RMSE remains relatively consistent. As with the TCH study data, there is a decline in overall prediction performance in terms of RMSE values; however, this drop is trivial when assessed for clinical significance with CEGA. The number of data points in Zones D + E reduces from 2.05% with MSE to 0.32% with the HH loss (an 84% reduction), indicating an increase in the model’s ability to predict excursions. Concurrently, the percentage of data points in Zones A + B increases from 97.89% to 99.59%, suggesting an improvement in overall predictions without compromise.

Model training in a federated learning framework

The predictive capabilities of models incorporating the HH loss function were extensively evaluated at the local, central, and federated levels. Fig. 7 presents the magnitude of the difference (measured in RMSE) in the various glycemic excursion regions. Table 4 compares the different model types through Clarke’s Error Grid Analysis, while Table 5 outlines improvements in terms of the number of patients for RMSE and CEGA. Compared to local models, federated models simultaneously exhibit improvements of 16.67% (in RMSE) for predicting hypoglycemia and 18.91% (in RMSE) for predicting hyperglycemia – both statistically significant under the paired t-test ($p\ll 0.01$). For the TCH study data, federated models simultaneously improved over local models for 96 (out of 113) patients in the hypoglycemia region and 110 (out of 111) patients in the hyperglycemia region. For the OhioT1DM dataset, federated models improved by 9% (in RMSE) for predicting hypoglycemia and 33.29% (in RMSE) for predicting hyperglycemia; the improvement in hyperglycemia is statistically significant ($p\ll 0.01$), but the improvement in hypoglycemia is not ($p=0.18$). For the OhioT1DM dataset, federated models improve over the local models for 9 (out of 12) patients in the hypoglycemia region and 12 (out of 12) patients in the hyperglycemia region. Across the two datasets, there is an improvement of 12.37% (in RMSE) for predicting hypoglycemia and 29.05% (in RMSE) for predicting hyperglycemia – both statistically significant ($p\ll 0.01$).

When evaluated for clinical significance through Clarke’s Error Grid, it is evident that the HH loss function improves the prediction of glycemic excursions (Zones D + E) at all levels without compromising the clinical significance of overall predictions (Zones A + B) across both datasets. Comparing local and federated models, we see a 37% reduction ($p\ll 0.01$) for the TCH study data and a 31% reduction ($p\ll 0.01$) for the OhioT1DM dataset in points falling in Zones D + E (compared to MSE), signifying an increase in the detection capability for glycemic excursions. For predictions falling in Zone C of CEGA, there is a 50% reduction for the TCH study data and a 20% reduction for the OhioT1DM dataset, signifying fewer false predictions with the federated models. Table 4 also provides evidence that, when compared through CEGA, federated models improve over local models in all regions for almost all patients across the two datasets.

On the other hand, central models distinctly outperform federated models for both hypoglycemia and hyperglycemia owing to their advantage of a significantly larger training pool, as is evident in the evaluation with RMSE and CEGA. However, federated models come much closer to central-model performance than the local models do. Table 5 outlines the improvements of the federated models against the central and local models for the TCH and OhioT1DM datasets.

Fig. 7
figure 7

Comparing improvements of the federated model against local and central models for hypoglycemia and hyperglycemia.

Table 4 Summary of Clarke’s Error Grid Analysis (CEGA) performance for different model types.
Table 5 Federated model improvements over (a) local and (b) central models.

Discussion

Based on the results, the main contributions of our work can be summarized as:

  1. 1.

Introduction of a novel HH loss function aimed at improving glucose predictions in the excursion regions while ensuring clinical significance.

  2. 2.

Model implementation with the HH loss function within a federated learning framework, balancing performance in the excursion regions and privacy.

Insights and observations

Models with the HH loss function predict more accurately, and with greater clinical significance, in the glycemic excursion regions at the local, central, and federated levels. This may come at the cost of sub-optimal RMSE values over the overall glucose range. However, when compared for clinical significance, there is no reduction; on the contrary, overall predictions in Zones A + B improve. A detailed evaluation with CEGA for the different model types, including local, central, and federated models, is presented in Table 6. A further expansion of the CEGA zones for Table 6 is provided in Appendix 1.

In the proposed HH loss function, the parameter ‘α’ plays a vital role in penalizing the errors and achieving balance across the three different glucose regions. For the local and federated models, it can be chosen optimally per patient, as model training is done locally. In the case of the central model, however, a single value must be chosen for the entire cohort, which may benefit the majority of patients but not all of them. We further explore the impact of choosing different ‘α’ values for the central model below.

Also, when comparing model performance across the different model types, we see that the federated model shows varying levels of difference relative to the local and central models. We investigate these improvements further with respect to patients’ glycemic profiles to determine who is likely to benefit the most.

Table 6 Summary of Clarke’s Error Grid Analysis across different (a) model types and (b) loss functions.

Impact of the ‘α’ parameter

The parameter ‘α’ can be tuned to achieve well-balanced, optimal prediction performance for the hypoglycemia and hyperglycemia excursion regions simultaneously. A higher ‘α’ value prioritizes performance in the hypoglycemia region, whereas a lower ‘α’ emphasizes performance in the hyperglycemia region. Fig. 8 illustrates the average improvement (in RMSE) across patients in both glycemic excursion regions simultaneously for various ‘α’ values. The red curve, representing the improvement in hypoglycemia values, shows an exponential increase relative to MSE as ‘α’ increases. In contrast, the blue curve, indicating the improvement in hyperglycemia values, exhibits a downward trend with increasing ‘α’. However, the slope for hypoglycemia values is significantly steeper than for hyperglycemia values because of the high data imbalance, especially for hypoglycemia values (median: 1.64%) as compared to hyperglycemia values (median: 43%). This also highlights the greater difficulty of accurately predicting hypoglycemia values compared to hyperglycemia values. The parameter ‘α’ can be customized for local and federated models based on individual preferences and glycemic profiles to yield optimal results. Table 7 provides performance metrics (for central models) in terms of Clarke’s EGA for different ‘α’ values.

Fig. 8
figure 8

Improvement in glycemic excursion regions with HH loss over MSE for the Central Model.

Table 7 Central model performance with different values of the ‘α’ parameter (TCH study + Ohio T1DM data).
Table 8 Summary of Clarke’s Error Grid Analysis (CEGA) performance for the MSE, NMSE, and HH loss functions.
Table 9 Comparison with state-of-the-art methods on the OhioT1DM dataset.

Federated models: who benefits the most?

We compare the performance improvements achieved with federated models against central and local models (all with the HH loss) across varying glycemic profiles in both the hypoglycemia and hyperglycemia regions. To categorize patients effectively, they are binned into ten groups based on the percentage of hypoglycemia or hyperglycemia values in their profile. For hypoglycemia, these intervals include (0, 0.22]%, (0.22, 0.5]%, and so forth, while for hyperglycemia, the intervals are (0, 7.81]%, (7.81, 19.31]%, and so on.

In the context of hypoglycemia prediction (Fig. 9, left), a notable trend emerges: as the percentage of hypoglycemia values increases, the prediction performance of the federated, central, and local models converges. Patients with higher hypoglycemia percentages exhibit similar prediction performance across the local, central, and federated models. However, for patients with extremely low instances of hypoglycemia, central models far outperform local and federated models, because the training data for central models contains many instances of hypoglycemia to learn from, an advantage that local and federated models lack. The Spearman correlation between the mean improvement (in RMSE) with federated models and increasing hypoglycemia (interval) values is statistically significant against both local ($p=0.01$) and central models ($p=0.02$). The variance shows a similar trend and is statistically significant against both local ($p=0.01$) and central ($p=0.01$) models. It is also observed that prediction performance with federated models is closer to the central models while maintaining a clear advantage over the local models for close to 50% of the patients.

In contrast, for hyperglycemia prediction (Fig. 9, right), no discernible pattern emerges across the different model types (federated, central, and local) relative to the hyperglycemia profile of patients. The Spearman correlations for improvements with federated models over local ($p=0.7$) and central models ($p=0.1$) are not statistically significant. Furthermore, the performance disparity across these three model types is minimal for hyperglycemia prediction relative to hypoglycemia. A major contributing factor is the substantial difference in the number of hypoglycemia glucose values compared to hyperglycemia glucose values: the highest bin for hypoglycemia is (4.63, 12.13]% with a median of 1.64%, whereas the lowest bin for hyperglycemia is (0, 7.81]% with a median of 43%.

Beyond the presence of hypoglycemia and hyperglycemia samples in the data, we evaluated the impact of individual profile factors, including gender, duration of diabetes, and HbA1c values, on model performance. None of these factors showed any visual or statistical difference in performance between the central and federated models. This further corroborates that the glycemic profile of an individual, i.e., the distribution of hypoglycemia and hyperglycemia samples within a patient, has the biggest impact on the performance of the different model types.

Fig. 9
figure 9

Federated Model performance vs. Local and Central Models across different patient profiles in (left) hypoglycemia region and (right) hyperglycemia region.

Comparison with literature

We comprehensively compared the HH loss function with the recent state-of-the-art NMSE (normalized mean squared error) loss, which showed significant performance improvement in the hypoglycemia range. The comparison is made on the standard evaluation metrics of RMSE and Clarke’s EGA. On the TCH study data, the HH loss improves performance over NMSE (Fig. 10, left) by 29% ($p\ll 0.01$) in the hypoglycemia region and by 5% ($p\ll 0.01$) in the hyperglycemia region. Similarly, for the OhioT1DM dataset (Fig. 10, right), the HH loss improves performance (in RMSE) by 21% ($p\ll 0.01$) in the hypoglycemia region and by 6% ($p\ll 0.01$) in the hyperglycemia region. Moreover, when compared using Clarke’s EGA, models with the HH loss function fail to detect only 0.51% of glycemic excursions compared to 1.47% with NMSE (a 65% reduction). We also see an increase in points falling in Zones A + B with the HH loss function compared to either MSE or NMSE (Table 8).

The OhioT1DM dataset, used in the BGLP challenge, is a standard dataset for comparing glucose prediction performance. While the BGLP challenge and other literature aim to show accurate predictions of overall glucose values, our work focuses explicitly on improved performance in the glycemic excursion regions (hypoglycemia and hyperglycemia), making direct comparisons challenging. Nevertheless, we situate our work relative to other notable studies based on CEGA, which provides a standard and fair way to assess clinical significance.

Table 9 compares studies achieving state-of-the-art performance on the OhioT1DM dataset. Remarkably, our simple MLP model, with several hundred times fewer training parameters, outperforms or matches the top-performing models reported. Additionally, federated models, despite lacking access to the entire training data, show promising results compared to other state-of-the-art methods proposed in the literature.

Fig. 10
figure 10

Performance comparison: MSE vs. HH vs. NMSE for (left) TCH study data and (right) Ohio T1DM.

Clinical applications

T1D requires rigorous management, as hypoglycemia can lead to immediate life-threatening consequences while chronic hyperglycemia can result in hybrid diabetes (both T1D and T2D), severely complicating diabetes management. Type 1 diabetes care prioritizes preventing impending hypoglycemia events while limiting chronic hyperglycemia, yet many individuals still miss the recommended target of > 70% time-in-range. CGM provides short-horizon glucose predictions, which are critical for diabetes care. Our framework is designed for future clinical integration in a way that preserves privacy and fits existing data flows. In a multi-site setting, FL models have demonstrated success at hospital scale in the EXAM study across 20 institutions, improving external generalization without pooling protected data114. Multiple studies have suggested the utility of a federated learning framework by showing its technical compatibility with existing IT and clinical operations115. Specifically for diabetes workflows, CGM data are already integrated within EHRs, where providers can access patients’ historical CGM trends116. In such settings, sharing model parameters rather than centralizing protected health information is of paramount importance. Individual patient participation in periodic federated rounds can be managed within current governance structures (model versioning, convergence checks), and outputs can be written to the EHR through standard interfaces. The tunable ‘α’ parameter allows individual-level personalized priorities and glycemic targets to be met in practice. This positioning enables future integration without altering existing devices or centralizing patient data.

Limitations and future work

We introduce a bias towards the glycemic excursion regions to enhance performance in the hypoglycemia and hyperglycemia ranges. As a result, the prediction error for overall glucose does increase; however, we mitigate this limitation by ensuring that clinical significance is not compromised. Additionally, when using the HH loss, the federated models undergo a two-step construction process: (i) initial training with the MSE loss function and (ii) subsequent fine-tuning with the HH loss at the local level. In future work, we aim to implement a single-step personalized federated model combining these two phases. This work uses the baseline federated learning framework to develop the proposed algorithm, FedGlu. There can be many challenges to its adoption in the real world, such as the unavailability of one or more entities in the network, or a corrupt local entity or central server that may compromise the privacy of the framework. However, this work aimed to lay the groundwork for using federated learning in glucose prediction algorithms that can benefit patients. The authors plan to improve the proposed FedGlu by adding privacy guarantees in subsequent work and by tackling the challenge of unavailability among participating entities for improved learning.

Conclusion

In this work, we proposed a novel HH loss function that simultaneously improves predictions in both the hypoglycemia and hyperglycemia regions without compromising overall predictions. This is critical, as the HH loss function improves glycemic excursion detection capabilities by an average of 78% across two datasets (a total of 125 patients) compared to the standard MSE loss. The results were consistent across a proprietary dataset and the publicly available OhioT1DM dataset. We also demonstrated the utility of the HH loss function via FedGlu, a machine learning model trained through a collaborative (federated) learning approach. FedGlu clearly outperforms the local models (35% improved glycemic excursion detection) while approaching central model performance. These results underscore the need to develop machine learning models that combine strong predictive capabilities with privacy preservation for sensitive patient healthcare data.