Abstract
Recent advances in artificial intelligence have greatly increased the accuracy of computer-assisted diagnosis for serious conditions, including brain tumours. However, concerns about data privacy, class imbalance, and the diversity of medical datasets limit the application of centralised deep learning models in healthcare. This article introduces MedShieldFL, a hybrid privacy-preserving federated learning architecture that enables secure, decentralised brain tumour classification across many medical institutions. The approach uses data augmentation techniques to reduce class imbalance and homomorphic encryption to aggregate model updates securely while safeguarding sensitive patient data. The base model is a ResNet-18 classifier chosen for its balance of accuracy and speed. Experimental results show that MedShieldFL achieves classification accuracy between 93% and 96%, improving performance by about 2% over traditional federated learning models while maintaining strong data privacy. The framework keeps the computational overhead introduced by encryption within acceptable limits, so execution times remain practical. MedShieldFL is thus a practical, flexible, privacy-preserving technology for medical image analysis that eases the adoption of secure, interoperable AI in current healthcare systems.
Introduction
The spread of the Internet of Things (IoT) and the rise of Artificial Intelligence (AI) have led to a technology revolution that is changing many fields, including healthcare1. Today’s healthcare systems use AI-powered IIoT infrastructure for cognitive diagnostics, remote patient monitoring, predictive analytics, and real-time data collection2. Medical image processing shows promise as a way to detect brain tumours and other serious illnesses early. Magnetic resonance imaging (MRI) must be used to identify brain tumours quickly and accurately in order to improve patient survival and treatment outcomes3. AI and the Internet of Things (IoT) have transformed how modern care is delivered4. The Industrial Internet of Things (IIoT) makes it possible for smart, networked healthcare systems to collect data in real time and use predictive analytics to address important healthcare issues5. Brain tumours can now be classified automatically from magnetic resonance imaging6. A quick and correct diagnosis is critical for improved survival and treatment outcomes7. Brain tumours were traditionally diagnosed using machine learning (ML) approaches such as Random Forests (RF) and Support Vector Machines (SVM)8. However, these approaches frequently require manual feature extraction and preprocessing1. Some of these procedures are effective, but they are domain-specific and do not transfer across datasets. CNNs and other deep learning approaches enable models to learn from unprocessed MRI images9. The application of hierarchical feature learning in architectures such as ResNet, VGGNet, and U-Net has resulted in enhanced tumour segmentation, detection, and classification, reducing the requirement for manual feature engineering2,10.
Legal, ethical, and privacy requirements such as HIPAA and GDPR keep large annotated medical datasets siloed across institutions, making centralised DL models challenging to apply. This isolation prevents researchers from developing models that generalise across heterogeneous data. Federated learning (FL) enables several institutions to collaborate on model training without sharing raw data, overcoming the problem mentioned in11. When training a local model, each client communicates with the central server only via encrypted weights or gradients. Several frameworks, including FedAvg, FedHealth, and the FeTS challenge12,13, have demonstrated FL’s effectiveness in multi-institution healthcare settings. Despite their potential, FL-based medical frameworks face several problems14. The lack of modern cryptographic approaches and differential privacy leaves most FL-based models exposed to risks such as model inversion attacks and membership inference15. They also frequently assume that clients hold sufficient and balanced data, although this is rarely the case. Several client datasets have been augmented with Generative Adversarial Networks (GANs) for generalisation and resilience16,17. Few approaches improve learning efficiency by combining GAN-based augmentation with homomorphic encryption (HME) for confidentiality18,19. Because of these constraints, a comprehensive FL architecture that provides data efficiency, dependability, and confidentiality in a variety of healthcare settings is urgently required.
IIoT-enabled federated learning for healthcare: a security threat landscape
Figure 1 shows IIoT-enabled FL security problems in multiple dimensions. Hospitals use IoT-enabled infrastructure to learn from sensitive patient MRI data as local data owners20. The primary server receives only encrypted model changes from these organisations.
However, multiple vulnerabilities arise:
-
Client-level attacks: Adversaries may launch data poisoning attacks by inserting mislabeled samples or conduct model poisoning by sending manipulated updates to corrupt global training7.
-
Communication interception: Data exchanged between clients and server can be targeted by man-in-the-middle (MitM), replay, or eavesdropping attacks if not securely encrypted8.
-
Aggregator compromise: Even with HE, compromised decryption keys can lead to gradient leakage or model inversion attacks.
-
Backdoor injection: A malicious server may tamper with the global model during redistribution.
-
Adversarial inputs: Attackers may use adversarial examples to degrade model performance21.
Furthermore, IIoT infrastructure introduces new risks such as denial-of-service (DoS) attacks, sensor hijacking, and data tampering due to resource-constrained and poorly secured edge devices22,23,24. To counter these risks, our proposed MedShieldFL framework integrates multiple defences: homomorphic encryption for secure aggregation, GAN-based augmentation to combat data imbalance, and strict data localisation protocols to avoid sharing raw medical data9,25. These measures collectively enhance both security and learning performance in privacy-sensitive environments.
Research gaps
Despite ongoing research, several critical gaps remain in developing privacy-preserving AI for healthcare.
-
1.
Insufficient integration of many privacy techniques: Most FL frameworks use one privacy method (differential privacy or encryption); therefore, privacy, accuracy, and performance are usually trade-offs. There hasn’t been enough research on a hybrid strategy using HE and GANs3,26.
-
2.
Lack of Data in Federated Medical Settings: FL assumes that each client has enough local data to work with. In practice, though, individual hospitals may not have enough or balanced datasets, making models less accurate and converging more slowly5,10.
-
3.
Not Enough Testing on Realistic and Synthetic Data: Many current solutions are not evaluated on both real-world clinical datasets and synthetically augmented datasets, both of which are important for assessing generalisation and privacy improvements27.
-
4.
Model updates are not well protected: Even if FL is better for privacy than centralised learning, model changes can still leak private information28. There aren’t many safe ways to combine encrypted model updates to eliminate this issue29.
-
5.
Limited Use in Smart Healthcare Situations: There isn’t much research that specifically provides privacy-preserving FL frameworks for smart healthcare IIoT settings, where medical data is created, sent, and analysed in real time across distributed edge-cloud systems4,30.
Novelty of the work
The novelty of this research lies in the holistic integration of federated learning (FL), homomorphic encryption (HME), and generative adversarial networks (GANs) into a single unified framework, MedShieldFL, designed specifically for privacy-preserving and performance-optimised medical image classification in smart healthcare environments. Researchers have studied these technologies individually, but deploying them in a context-aware manner within resource-constrained, IIoT-based settings represents a significant advancement. Our strategy incorporates real-time GAN-based data augmentation within the federated learning (FL) pipeline, enabling clients with under-represented or limited datasets to participate actively in model training without compromising data ownership or privacy. We also examine the impact of HME on model consistency and latency with numerous clients, a previously unaddressed issue. Comparative studies show that MedShieldFL surpasses baseline FL models in privacy, classification accuracy, and stability. Within the limitations of IIoT healthcare infrastructure, this integrated design addresses both data imbalance and privacy.
Key contributions
MedShieldFL, a hybrid federated deep learning (FDL) design, overcomes these problems and enables brain tumour classification while preserving privacy. The main contributions are as follows:
-
Data Augmentation via GAN: Deep Convolutional GANs (DCGANs) generate synthetic MRI images when data is scarce, improving data diversity and class balance.
-
ResNet-18 Backbone: ResNet-18 serves as the main classification model for its balance of accuracy and computational cost.
-
Homomorphic Encryption (HME): HME schemes such as BFV and CKKS secure aggregation by protecting model updates from inference attacks.
-
Realistic IIoT Healthcare Simulations: We evaluate the framework across several federated rounds and client configurations, testing it on real, synthetic, and mixed data.
MedShieldFL offers a precise and scalable system that protects privacy and can be used in real-world digital healthcare settings.
Related work
Deep Learning (DL) models are increasingly being utilised in domains such as Computer Vision (CV), Natural Language Processing (NLP), and Speech Recognition (SR) to address complex problems1. This has raised new concerns over data privacy, particularly for sensitive data9. DL models, particularly CNNs and other DNNs, have demonstrated excellent proficiency at learning hierarchical features from raw data3. However, due to their richness and representational capabilities, sensitive training data, such as individual medical records, biometric information, or financial transactions, may be easier to store or disclose2,10. Deploying DL models on cloud platforms such as Machine Learning-as-a-Service (MLaaS) providers exacerbates these problems because researchers must send training data and inference queries to third-party systems that attackers can exploit31. Membership inference attacks, model inversion attacks, and backdoor attacks are all known threats that attempt to extract private information from the data or the trained models32. Researchers have devised several privacy-preserving methods to protect data at different points in the DL pipeline, from collecting and preprocessing data to training and predicting models. Cryptographic approaches include HE, Secure Multiparty Computation (SMC), and Differential Privacy (DP). Each method has pros and cons regarding model correctness, computing overhead, and communication efficiency33.
Homomorphic encryption(HME)-based approaches
HME computes on encrypted data without decryption, maintaining confidentiality throughout the processing pipeline22,34. In these works, the authors explored HE in DL settings by enabling encrypted inference and encrypted model evaluation. Specifically,11 proposed a cloud-based DL service in which a pre-trained DL model processed encrypted inputs and returned the results in encrypted form. To address the incompatibility of standard activation functions (such as ReLU or sigmoid) with HE, the authors approximated them using quadratic polynomials. In12, the researchers employed bootstrapping techniques to refresh ciphertexts and extend the depth of computation, although this approach increased storage and processing demands. Although promising, these solutions are frequently constrained by high computational complexity and latency, rendering them less suited for real-time or large-scale applications.
Federated learning and differential privacy
FL provides an alternative to centralised deep learning that preserves privacy by allowing model training directly on edge devices or institutional servers. Model updates, such as gradients or weights, are typically shared with a central server instead of raw data13. When combined into a global model, these updates preserve privacy on the server side and prevent data leakage. This decentralised method avoids sharing data directly, which substantially lowers privacy risks35. However, the local model updates themselves can leak information about the underlying data. Researchers have added differential privacy to FL frameworks to address this; in15, the Laplace mechanism perturbed model gradients before they were sent to the central server. This approach provides anonymity but can reduce model accuracy, especially in deeper neural networks where noise accumulates across layers. In DP-based FL systems, the privacy budget versus utility trade-off remains a major issue36.
Comparative analysis of AI and FL techniques in smart healthcare for brain tumour(BT) classification
Table 1 summarises how brain tumours are classified in modern healthcare systems, highlighting how processes have improved and the issues that remain. Traditional machine learning algorithms such as Support Vector Machines and Random Forests rely on manual feature extraction, which requires domain expertise and generalises poorly across datasets37. These approaches were once effective, but they struggle with the complexity and variety of modern medical imaging data. Centralised deep learning models built on convolutional neural networks, such as ResNet and U-Net, proved useful: they provide end-to-end learning from MRI images, boosting accuracy while minimising human involvement38. However, centralised deep learning systems require massive amounts of labelled medical data, while HIPAA and GDPR keep this data siloed across institutions. Such data segmentation makes it difficult to build universal models.
Federated learning enables institutions to train models without sharing raw data, thus addressing privacy concerns. Although frameworks such as FedAvg and FedHealth demonstrate FL’s potential in distributed medical settings, constraints remain26. Plain FL implementations are vulnerable to inference attacks because they lack privacy mechanisms such as Differential Privacy and HME. Many approaches also assume that client datasets are balanced and sufficient, which is not always the case. When data is insufficient, models cannot generalise without synthetic data augmentation methods such as GANs. To address these issues and improve stability, a broader federated learning framework is required that employs privacy-preserving approaches such as homomorphic encryption and generative adversarial network-based augmentation. Such a system enables real-time, secure, and scalable diagnostic tasks in an IIoT-based smart healthcare framework, making brain tumour classification methods more reliable and usable.
Homomorphic encryption in federated settings
Some research uses Homomorphic Encryption in Federated Learning frameworks to strike a balance between privacy and utility. For instance,16 suggested using Paillier Homomorphic Encryption (PHE) to protect gradients in federated training. After the central server sent encrypted updates, clients used a shared key to decrypt the global model. In contrast to differential privacy, this strategy did not require explicit noise injection and maintained the model’s accuracy. However, handling keys, performing complex computations, and coping with encryption latency proved difficult, particularly in large or low-bandwidth environments.
Privacy-preserving frameworks for industrial and healthcare applications
Researchers have tried to adapt FL for privacy-critical domains such as industrial IoT and healthcare. In18, the authors introduced a proxy server to anonymize client identities and mediate communication with the central server. They applied differential privacy to perturb selected model parameters and employed conventional encryption techniques to secure data transmission. Similarly, the authors of20 enhanced the reliability of FL systems using blockchain technology and smart contracts, providing verifiability and auditability in decentralised environments. Both studies used DP-SGD with Gaussian noise to protect exchanged gradients and RSA encryption to secure data in transit. However, these methods still have certain limitations:
-
Data Complexity: Most research used simple benchmarks such as MNIST and CIFAR-10, which do not accurately reflect the complexity and sensitivity of medical data27.
-
Model Simplicity: Their shallow deep learning models are ineffective for classifying brain tumours6.
-
Limited Scalability: Adding proxy servers or blockchain layers to many apps raises communication and processing costs29.
-
Limited Accuracy: Noise or simpler models can frequently exacerbate problems, and in practice, accuracy can fall below 70–80%2.
GAN-based privacy and data augmentation
Another emerging idea is to employ GANs to augment medical datasets while preserving patient privacy. GANs, especially DCGANs, can generate synthetic medical images that look realistic but contain no identifying information7. This method enlarges the training dataset and adds an implicit degree of privacy by blending actual and synthetic data during model training4. GAN-based augmentation is promising, but its effectiveness depends on the quality and variety of the synthetic data and on how well the DL model can learn from datasets containing both real and synthetic samples8. Researchers have developed numerous DL approaches to protect privacy, but these methods often fail to perform optimally in complex, real-world healthcare scenarios. Many encounter performance issues because their datasets and models are overly simplistic21. As a result, they achieve lower accuracy and exhibit limited applicability. A hybrid architecture that integrates the best parts of FL, HE, and GANs is needed to make distributed medical intelligence safe, scalable, and fast. We address these problems by proposing a new, integrated way to classify brain tumours in smart healthcare systems22.
Proposed MedShieldFL framework
This section presents MedShieldFL, a comprehensive privacy-preserving federated learning (FL) architecture designed for brain tumour classification. The proposed solution addresses the twin challenges of limited annotated medical imaging data and data privacy in collaborative machine learning. The framework consists of three components that work together efficiently:
-
DCGAN-based data augmentation to address data shortages and balance classes.
-
A CNN-based multi-grade brain cancer classification model that accurately distinguishes between tumour types.
-
Secure Aggregation and Global Model Generation: to keep data private during the FL process by employing homomorphic encryption (HME).
The next sections cover the architecture, its components, the roles of the entities, and the workflows involved.
MedShieldFL framework overview
The MedShieldFL architecture uses federated learning to train a global deep learning model to classify brain tumours while keeping sensitive patient data local. HIPAA and GDPR mandate the privacy of healthcare data, which is why this strategy is critical. The MedShieldFL architecture (Figure 2) involves three main entities:
-
Primary Server (Coordinator): An authorised group that initiates the system, handles encryption keys, and modifies global models.
-
Hospitals (Local Data Owners): Participants that hold local, IoT-captured MRI data and perform model training on-site.
-
Secure Aggregator Server (Public Aggregator): An untrusted server that performs encrypted model aggregation without accessing raw data or model parameters in plaintext.
The federated learning workflow in MedShieldFL proceeds through the following key phases:
-
1.
Key Generation and Model Initialization: The primary server generates a homomorphic encryption (HME) key pair (public key PubK, private key PrivK) and distributes the initial deep learning model to all participating hospitals.
-
2.
Local Data Augmentation and Training: Each hospital uses Deep Convolutional Generative Adversarial Networks (DCGANs) to augment its MRI dataset, addressing data scarcity and class imbalance. The system trains a CNN-based model (such as ResNet-18) locally and encrypts its parameters using the public key (PubK).
-
3.
Encrypted Parameter Aggregation: The Secure Aggregator receives the encrypted model updates from each hospital and aggregates them using homomorphic encryption without decryption, ensuring data privacy throughout the process.
-
4.
Global Model Update: The primary server uses PrivK to decrypt the aggregated parameters and update the global model. All hospitals receive the new model for the following FL round.
-
5.
Model Convergence: The stages above are repeated for numerous FL rounds until the model converges. Inference with the final global model classifies brain tumours accurately and privately. A minimal sketch of this round loop follows.
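To make the workflow concrete, the following minimal Python sketch traces one MedShieldFL round end to end. The helper functions are illustrative stand-ins rather than the framework's actual API: `encrypt`/`decrypt` abbreviate the CKKS/BFV routines detailed later, and `train_local` stands in for ResNet-18 training.

```python
from typing import List

def train_local(global_params: List[float], epochs: int = 7) -> List[float]:
    """Stand-in for local ResNet-18 training; returns updated parameters."""
    return [w + 0.01 for w in global_params]  # placeholder update

def encrypt(params: List[float], pubk) -> List[float]:
    return list(params)  # placeholder; MedShieldFL uses CKKS/BFV here

def decrypt(ciphertext: List[float], privk) -> List[float]:
    return list(ciphertext)  # placeholder; only the primary server holds PrivK

def homomorphic_sum(ciphertexts: List[List[float]]) -> List[float]:
    # Element-wise sum over ciphertexts; with real HE the aggregator
    # never sees any plaintext during this step.
    return [sum(vals) for vals in zip(*ciphertexts)]

def federated_round(num_hospitals: int, global_params, pubk=None, privk=None):
    updates = [encrypt(train_local(global_params), pubk)
               for _ in range(num_hospitals)]        # phase 2: local training
    c_global = homomorphic_sum(updates)              # phase 3: secure aggregation
    summed = decrypt(c_global, privk)                # phase 4: primary server only
    return [w / num_hospitals for w in summed]       # FedAvg-style mean

theta = federated_round(4, [0.0, 0.5, -0.2])         # one round, four hospitals
```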
Secure aggregation architecture
The secure aggregation architecture addresses the security and privacy concerns by incorporating explicit trust assumptions, strong key management, masking-based aggregation, and replay protection. The design consists of four major components: the Data Preprocessing Layer, Institutional Clients (Hospitals), the Secure Aggregator (Public Aggregator), and the Key Server. This tiered method protects private information and facilitates collaborative model training.
Data preprocessing layer
First, each institution preprocesses the medical images it receives from IoT medical sensors. A deep generative model, such as a Deep Convolutional Generative Adversarial Network, is used to augment the training collection and make the model more general. The raw data is never sent outside the institution; instead, feature representations or model updates derived from this data are used for federated learning.
Institutional clients (Hospitals)
Each hospital serves as a local training node. It trains its model locally on its own private dataset and uses homomorphic encryption (HE) to protect the model updates, such as weight gradients, before sending them. Two types of homomorphic encryption are used: CKKS (for approximate real-valued data) and BFV (for exact integer operations).
Key management
Only the primary server generates and manages the encryption key pair. Decryption is the primary server’s responsibility, while clients encrypt with the public key. To minimise direct ciphertext exposure, clients never send updates directly to the server; all data passes through the secure aggregator.
Masking mechanism
Each client applies a dropout-tolerant masking technique to its encrypted updates so that the server cannot attribute contributions to individual clients, even when only some of the participants are present.
Secure aggregator (public aggregator)
Clients send masked, encrypted aggregates instead of sending data directly to the server, which mitigates key exposure. Masking ensures that even an honest-but-curious server cannot determine who contributed what to the aggregates; the masks prevent update identification even if the aggregator and server collude. Nonces, epoch-based binding, and per-round auditing prevent replay attacks.
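As an illustration of how cancelling pairwise masks hide individual contributions, the sketch below (our own simplification, not the framework's exact protocol) derives a mask for each client pair from a shared seed; the masks sum to zero across clients, so the aggregate is unchanged. Dropout tolerance in practice additionally requires secret-sharing the pair seeds so surviving clients can reconstruct a dropped client's masks, which is omitted here.

```python
import random
from typing import Dict, List

def pairwise_masks(client_ids: List[int], dim: int) -> Dict[int, List[float]]:
    """Client a adds +m_ab and client b adds -m_ab for each pair (a, b),
    so all masks cancel in the sum of masked updates."""
    masks = {c: [0.0] * dim for c in client_ids}
    for a in client_ids:
        for b in client_ids:
            if a < b:
                rng = random.Random(hash((a, b)))  # stands in for a shared pair seed
                m = [rng.uniform(-1, 1) for _ in range(dim)]
                masks[a] = [x + y for x, y in zip(masks[a], m)]
                masks[b] = [x - y for x, y in zip(masks[b], m)]
    return masks

masks = pairwise_masks([0, 1, 2], dim=4)
totals = [sum(masks[c][j] for c in masks) for j in range(4)]
assert all(abs(t) < 1e-9 for t in totals)  # masks cancel in the aggregate
```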
-
HE Scheme Specifications: BFV handles integer arithmetic exactly, while CKKS performs floating-point arithmetic using scaling factors. To ensure reproducibility, we report N, q, the scale, the batching technique, and the rescaling/relinearisation budgets. To reduce the impact of poisoning attempts, future upgrades could include anomaly detection or robust aggregation.
Results and advantages
This secure aggregation method protects privacy, prevents collusion and replay attacks, and combines models with high diagnostic accuracy for intelligent medical decision support systems. Radiologists can use the global model without exposing patient information.
Data augmentation with DCGAN
In this approach, we use the DCGAN model to improve target classifier accuracy and privacy when working with sensitive medical data. A GAN is a DL system that learns a data distribution in order to generate synthetic images. The generator’s job is to produce realistic images, while the discriminator’s job is to assess image quality and distinguish real from synthetic. The generator continually tries to improve its synthetic images to fool the discriminator. Deep Convolutional Generative Adversarial Networks (DCGANs) are a more advanced variant of GANs, with an improved network design that increases flexibility, prevents mode collapse, and improves the generated images. DCGANs replace pooling layers with strided convolutional layers in both the generator and the discriminator networks.
Figure 3 illustrates the basic architecture of a DCGAN. The generator network, \(\phi\), consists of 2D batch normalization (BN) layers, strided 2D transposed convolutional (CONVT) layers, and ReLU activation functions to stabilize training and prevent mode collapse. The generator takes a latent vector \(z \sim p_z\) and upsamples it through CONVT layers with a final \(\tanh\) activation to produce an image matching the training dimensions (e.g., \(1 \times 64 \times 64\)). The discriminator, \(\psi\), classifies inputs as real or generated using strided 2D CONV layers, LeakyReLU activations, dropout, and batch normalization, outputting a probability via a sigmoid function. During training, \(\psi\) maximizes \(\log \psi (x)\) for real samples \(x \sim p_{\text {data}}(x)\), while the generator \(\phi\) minimizes \(\log (1 - \psi (\phi (z)))\) for \(z \sim p_z(z)\). This adversarial objective is formalized as:
$$\min _{\phi } \max _{\psi } \; \mathbb {E}_{x \sim p_{\text {data}}(x)}[\log \psi (x)] + \mathbb {E}_{z \sim p_z(z)}[\log (1 - \psi (\phi (z)))]$$
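A PyTorch sketch of this generator/discriminator pair is given below. It follows the standard DCGAN layout for \(1 \times 64 \times 64\) greyscale images described above; the exact channel widths (`ngf`, `ndf`) and the dropout rate are illustrative choices within the ranges reported later, not values dictated by the framework.

```python
import torch.nn as nn

class Generator(nn.Module):
    """Latent vector z -> 1x64x64 image; CONVT + BN + ReLU, tanh output."""
    def __init__(self, nz: int = 100, ngf: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),                 # 4x4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),                 # 8x8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),                 # 16x16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),                     # 32x32
            nn.ConvTranspose2d(ngf, 1, 4, 2, 1, bias=False),
            nn.Tanh(),                                              # 64x64 in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """1x64x64 image -> real/fake probability; strided CONV + LeakyReLU + dropout."""
    def __init__(self, ndf: int = 64, p_drop: float = 0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, True),                                # 32x32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, True),
            nn.Dropout(p_drop),                                     # 16x16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, True),
            nn.Dropout(p_drop),                                     # 8x8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, True),       # 4x4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),                                           # scalar probability
        )

    def forward(self, x):
        return self.net(x).view(-1)
```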
CNN-based multi-grade classification model
In the proposed framework for brain tumour classification, we utilised a deep CNN architecture founded on Residual Networks (ResNet), specifically the ResNet-18 variant. ResNet models are valued for their resilience and efficacy in image classification tasks, primarily because of their residual connections, which alleviate the vanishing gradient issue frequently faced in deeper networks. The ResNet-18 architecture comprises 18 layers (Figure 4): seventeen convolutional layers and one fully connected (FC) layer. The model starts with a \(7\times 7\) convolutional layer and a \(3\times 3\) max pooling operation, and takes a greyscale MRI image resized to 224\(\times\)224 pixels with one input channel. The successive layers of the network apply a series of 3\(\times\)3 convolutional operations, and the number of feature maps grows through four stages: 64, 128, 256, and 512 channels. These convolutional stages have residual (skip) connections around one or more layers. This approach lets gradients flow directly through the network during backpropagation, which makes training much more stable and efficient. The final convolutional output is reduced spatially by global average pooling, which feeds a compact feature vector into an FC layer optimised for brain tumour classification into low-grade, mid-grade, and high-grade. A softmax activation translates raw output scores into a probability distribution over the three tumour grades, and the model chooses the most probable class. This CNN-based approach learns complicated spatial and textural patterns in MRI images to differentiate tumour grades accurately. ResNet-18 is ideal for real-time and federated learning in distributed healthcare due to its modest depth and computational efficiency.
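The adaptation described here, single-channel input and a three-class head, can be expressed in a few lines with torchvision; the helper name `build_classifier` is ours, and training specifics are deferred to the experiments section.

```python
import torch.nn as nn
from torchvision.models import resnet18

def build_classifier(num_grades: int = 3) -> nn.Module:
    """ResNet-18 adapted to greyscale 224x224 MRI input and three tumour grades."""
    model = resnet18(weights=None)
    # Stem takes 1 input channel (greyscale MRI) instead of the default 3 (RGB).
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    # Final FC maps the 512-d pooled feature vector to the 3 grade logits;
    # softmax is applied by the cross-entropy loss at training time.
    model.fc = nn.Linear(model.fc.in_features, num_grades)
    return model
```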
Secure homomorphic encrypted collaborative federated learning
This section discusses the framework of Secure Collaborative FL with HE, outlining the techniques involved and describing the steps for securely generating global knowledge through collaborative training of local models across multiple participants.
Federated learning(FL)
FL allows several clients to build global DL models without exchanging data. Define the set of N participating clients (e.g., hospitals) as:
$$\mathcal {H} = \{H_1, H_2, \ldots , H_N\}$$
where each client \(H_i\) has access to a private dataset \(D^{(i)}_L\). Each client trains the model locally for ep epochs. The local training process at the client \(H_i\) is defined as:
Let \(\theta _G\) denote the global model parameters and \(\theta ^{(i)}_L\) the updated local parameters from client i. After local training, each client sends \(\theta ^{(i)}_L\) to the central server, which updates the global model \(\theta _G\) as follows:
$$\theta ^{(r+1)}_G = \frac{1}{N} \sum _{i=1}^{N} \theta ^{(i)}_L$$
Equation (3) represents the standard FedAvg (Federated Averaging) approach, where the global model \(\theta ^{(r+1)}_G\) at round \((r+1)\) is computed as the arithmetic mean of the local model parameters \(\theta ^{(i)}_L\) from N participating clients. This method assumes all local models are equally weighted and is efficient for homogeneous data distributions and unsecured settings. The generalized update function at round \(r+1\) is given by:
$$\theta ^{(r+1)}_G = A\!\left(\theta ^{(1)}_L, \theta ^{(2)}_L, \ldots , \theta ^{(N)}_L\right)$$
Equation (4), on the other hand, uses a broader and safer aggregation function \(A(\cdot )\) that can include privacy-protecting methods such as homomorphic encryption (HME). In our proposed MedShieldFL framework, we use HE schemes (CKKS or BFV) for this aggregation, enabling secure computations over encrypted parameters and protecting model updates from inference attacks. This abstraction permits robust adoption in privacy-sensitive areas like healthcare, where regulatory compliance and data protection are paramount. Together, these equations contrast unsecured and secure federated aggregation and show why Equation (4) is preferable in secure IIoT medical settings. The final global model parameters \(\theta ^{(R)}_G\) obtained after R rounds of refinement should be close to the centrally trained model parameters obtained from training on all datasets combined, or at least achieve similar performance metrics:
Here, ep represents the total number of training epochs in a centralized scenario.
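The two aggregation rules above translate directly into code. In this sketch (NumPy, flattened parameter vectors), `fedavg` implements the plain arithmetic mean of Equation (3), and `aggregate` exposes the pluggable \(A(\cdot)\) of Equation (4), which MedShieldFL instantiates with HE-based aggregation.

```python
from typing import Callable, List
import numpy as np

def fedavg(local_params: List[np.ndarray]) -> np.ndarray:
    """Equation (3): unweighted arithmetic mean of the N local parameter vectors."""
    return np.mean(np.stack(local_params), axis=0)

def aggregate(local_params: List[np.ndarray],
              A: Callable[[List[np.ndarray]], np.ndarray] = fedavg) -> np.ndarray:
    """Equation (4): a generalized aggregation function A(.); MedShieldFL
    substitutes an HE-based A that sums ciphertexts instead of plaintexts."""
    return A(local_params)

theta_next = aggregate([np.array([0.1, 0.4]), np.array([0.3, 0.2])])  # -> [0.2, 0.3]
```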
HME
HME is a powerful privacy-preserving technique that allows computations on encrypted data. Like Secure Multiparty Computation (SMC), it keeps private information safe from various threats while still allowing it to be processed, without decryption or reliance on trusted third-party servers. HE works well in cloud-edge-enabled settings such as FL because it reduces privacy concerns and enables safe collaboration without giving up control of the data. The security of asymmetric encryption schemes such as BFV and CKKS rests on the hardness of the Ring Learning with Errors (RLWE) problem. In these schemes, public keys are used to encrypt and private keys to decrypt.
Cheon-Kim-Kim-Song (CKKS)
CKKS is a levelled HE scheme that approximates arithmetic over complex numbers, allowing addition and multiplication on encrypted data with approximate results. CKKS is well suited to analysing or running DL models on encrypted data, and to aggregating encrypted model parameters into a globally encrypted model.
-
Supported Operations: For ciphertexts \(ct_1, ct_2 \in R_q\), CKKS supports the following operations:
$$ct_{add} = ct_1 \oplus ct_2, \quad ct_{mul} = ct_1 \otimes ct_2$$where \(R_q = \mathbb {Z}[X]/(X^{\xi } + 1)\) is the polynomial ring modulo q.
-
Encryption: The encryption of plaintext \(pt \in \mathbb {C}\) is defined as:
$$ct = \textrm{Enc}(\textrm{Pubk}, pt) = (a \cdot \nu , p \cdot \nu + \Delta pt + e) \bmod q$$where \(\nu , e\) are random polynomials and \(\Delta\) is the scaling factor. In CKKS, p denotes the plaintext modulus and s denotes the secret key length; together they govern the encryption process’s noise budget and security level.
-
Decryption: Decrypting ciphertext \(ct = (ct_0, ct_1)\):
$$pt = \frac{1}{\Delta } \cdot (s \cdot ct_1 + ct_0) \bmod q$$
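The TenSEAL library used in our experiments exposes these CKKS operations directly. The snippet below encrypts two toy parameter vectors and adds them homomorphically; the `coeff_mod_bit_sizes` and scale shown are illustrative choices (the experiments use a global scale of \(2^{60}\)), not the exact experimental configuration.

```python
import tenseal as ts

# CKKS context; polynomial modulus degree matches the experiments section,
# while the coefficient modulus chain and scale are illustrative.
ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=16384,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40   # scale used to encode real numbers
ctx.generate_galois_keys()

enc_a = ts.ckks_vector(ctx, [0.12, -0.07, 0.33])   # toy parameters, client A
enc_b = ts.ckks_vector(ctx, [0.10, -0.02, 0.29])   # toy parameters, client B

enc_sum = enc_a + enc_b                            # ciphertext-ciphertext addition
print(enc_sum.decrypt())                           # ~[0.22, -0.09, 0.62] (approximate)
```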
Brakerski-Fan-Vercauteren (BFV)
The BFV scheme is a lattice-based homomorphic encryption method that supports exact integer arithmetic, making it suitable for privacy-preserving machine learning with modular operations.
-
Plaintext and Ciphertext Space: \(\mathbb {Z}_t[x]/(x^N + 1)\) for plaintexts and \(R_q = \mathbb {Z}_q[x]/(x^N + 1)\) for ciphertexts, where t and q are the plaintext and ciphertext moduli, respectively.
-
Encryption: For \(m(x) \in \mathbb {Z}_t[x]/(x^N + 1)\) and public key \((pk_0, pk_1)\):
$$ct = (ct_0, ct_1) = (pk_0 \cdot u + e_1 + \Delta m, \; pk_1 \cdot u + e_2),$$with small random polynomials \(u, e_1, e_2\) and scaling factor \(\Delta = \lfloor q/t \rfloor\).
-
Decryption: Using secret key s:
$$m = \bigg \lfloor \frac{t}{q} \cdot (ct_0 + ct_1 \cdot s \bmod q) \bigg \rceil \bmod t$$ -
Operations: Supports addition and multiplication over encrypted data:
$$ct_{\text {add}} = (ct_1 + ct_2) \bmod q, \quad ct_{\text {mul}} = (ct_1 \cdot ct_2) \bmod q$$Relinearization and rescaling control ciphertext growth.
In the BFV method, t (plaintext modulus), q (ciphertext modulus), and N (polynomial degree) set the noise budget, security, and efficiency. This makes sure that the computation is correct and private.
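For completeness, a matching BFV example in TenSEAL is shown below; results are exact integers, so quantised model parameters survive aggregation without approximation error. The `plain_modulus` value here is an illustrative batching-compatible prime, not the exact experimental setting.

```python
import tenseal as ts

ctx = ts.context(ts.SCHEME_TYPE.BFV, poly_modulus_degree=4096,
                 plain_modulus=1032193)            # illustrative batching prime

enc_a = ts.bfv_vector(ctx, [3, 1, 4])              # quantised integer params, client A
enc_b = ts.bfv_vector(ctx, [2, 7, 1])              # client B

print((enc_a + enc_b).decrypt())                   # [5, 8, 5] -- exact addition
print((enc_a * enc_b).decrypt())                   # [6, 7, 4] -- element-wise product
```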
Algorithm workflow
The proposed method’s workflow is shown in Algorithm 1. It has three basic steps: (1) Local Model Training, (2) Secure Encrypted Results Aggregation, and (3) Global Model Generation. We protect privacy and security with modern cryptography; for instance, we compare the BFV and CKKS schemes and tune the model’s encryption settings.
Data owners and servers follow an agreed-upon protocol, but they might try to figure out more than what is clearly authorised. To make things easier, we think of these servers as edge servers. The central server, called \(E_p\), is responsible for setting up the global model parameters \(\theta _G\) and making the cryptographic key pair.
We set the global model parameters \(\theta _G\) by randomly drawing values from a uniform distribution \(U(-\zeta , \zeta )\), where \(\zeta > 0\) is the initialisation bound:
$$\theta ^{(0)}_G \sim U(-\zeta , \zeta )$$
Where \(\theta ^{(0)}_G\) denotes the initial global model parameters required for training the DL model (ResNet-18).
In addition to parameter initialisation, the server generates a public key \(\textrm{Pubk}\) for encryption and a private key \(\textrm{Privk}\) for decryption using the key generation function \(\textrm{KeyGen}(\lambda )\), where \(\lambda\) is the encryption strength security parameter.
The \(\textrm{Privk}\) must be stored securely on the server, while the public key \(\textrm{Pubk}\) and initial global parameters \(\theta ^{(0)}_G\) are distributed securely to all participants \(n \in \{1, \ldots , N\}\):
where \(C(S_p \rightarrow H_i)\) denotes secure transmission from server \(S_p\) to participant \(H_i\).
Stage 1: Local model training
Each active hospital \(H_m\), \(m \in \{1, \ldots , M\}\), trains a local ResNet-18 model on its private dataset \(D^{(m)}_L\) for ep epochs (e.g., \(ep = 7\)). The set of hospitals selected for the current training round is denoted by:
$$\mathcal {H}_{\text {act}} = \{H_1, \ldots , H_M\}, \quad M \le N$$
where each client \(H_i\) owns a private dataset \(D^{(i)}_L\).
The local training objective at client \(H_i\) minimises a local loss function \(\textrm{Loss}_i(\cdot )\):
$$\textrm{Loss}_i(\theta ) = \frac{1}{|H_i|} \sum _{(x, y) \in D^{(i)}_L} \ell (f_\theta (x), y)$$
Here, \(|H_i|\) denotes the number of samples in \(D^{(i)}_L\), and \(\ell (f_\theta (x), y)\) represents the loss between the model prediction and the actual label.
The result is a plaintext local model \(M^{(m)}_L\) with u layers. Each layer \(L_r(i)\), \(i \in \{1, \ldots , u\}\), contains s trainable parameters indexed by \(j \in \{1, \ldots , s\}\):
For simplicity, the local model parameters are denoted \(\theta ^{(m)}_L\).
Under the selected homomorphic encryption (HME) scheme (BFV or CKKS), each hospital encrypts its local model parameters \(\theta ^{(m)}_L\) using the public key \(\textrm{PubK}\) to ensure privacy:
$$\textrm{Enc}\, \theta ^{(m)}_L = \textrm{Enc}(\textrm{PubK}, \theta ^{(m)}_L)$$
where the encryption is applied layer-wise after flattening parameters into a 1D array.
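A sketch of this layer-wise flatten-and-encrypt step using TenSEAL follows; `ctx` is a CKKS context holding the public key, as created earlier, and the function name is ours rather than the framework's.

```python
import tenseal as ts
import torch

def encrypt_model(model: torch.nn.Module, ctx: "ts.Context") -> dict:
    """Stage 1: flatten each layer's parameters to a 1D list and encrypt it
    under the public key held in ctx (CKKS), one ciphertext vector per layer."""
    encrypted = {}
    for name, param in model.state_dict().items():
        flat = param.detach().cpu().flatten().tolist()
        encrypted[name] = ts.ckks_vector(ctx, flat)
    return encrypted
```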
Stage 2: Secure encrypted aggregation
The central server \(E_c\) receives encrypted local parameters \(\textrm{Enc}\, \theta ^{(1)}_L, \ldots , \textrm{Enc}\, \theta ^{(M)}_L\) and securely aggregates them using the additive homomorphic property:
$$C_G = \textrm{EvalAdd}(C_1, C_2, \ldots , C_M) = \bigoplus _{m=1}^{M} C_m$$
where \(C_m = \textrm{Enc}\, \theta ^{(m)}_L\) denotes the ciphertext from client \(H_m\).
The homomorphic addition \(\textrm{EvalAdd}(\cdot )\) operates directly on ciphertexts. The aggregation satisfies:
$$\textrm{Enc}(\theta ^{(a)}_L) \oplus \textrm{Enc}(\theta ^{(b)}_L) = \textrm{Enc}(\theta ^{(a)}_L + \theta ^{(b)}_L)$$
so that decryption yields the sum of plaintexts:
$$\textrm{Dec}(\textrm{PrivK}, C_G) = \sum _{m=1}^{M} \theta ^{(m)}_L$$
The central server hides hospital identities to ensure anonymity. The aggregated ciphertext \(C_G\) is forwarded to \(E_p\) for further processing.
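The aggregator's EvalAdd step then reduces the hospitals' encrypted layer dictionaries without any decryption, as sketched below (our simplification of the operation, reusing the `encrypt_model` output format from Stage 1).

```python
def eval_add(encrypted_models: list) -> dict:
    """Stage 2: layer-wise homomorphic sum over the clients' ciphertexts.
    The '+' on TenSEAL vectors returns a new ciphertext; nothing is decrypted."""
    aggregate = dict(encrypted_models[0])
    for enc in encrypted_models[1:]:
        for name, ct in enc.items():
            aggregate[name] = aggregate[name] + ct
    return aggregate
```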
Stage 3: Global model generation
The primary server \(E_p\) decrypts the aggregated ciphertext using the private key:
$$\Theta = \textrm{Dec}(\textrm{PrivK}, C_G)$$
The division to compute the average model parameters is then performed on the decrypted data:
$$\theta _G = \frac{\Theta }{M} = \frac{1}{M} \sum _{m=1}^{M} \theta ^{(m)}_L$$
where M is the number of active hospitals.
The global model \(M_G\) is updated with \(\theta _G\), and the updated parameters are distributed back to participants for the next round:
where \(\textrm{Update}(\cdot )\) integrates global parameters into the local model.
This process iterates for R communication rounds (e.g., \(R = 17\)), progressively refining the global model. After R rounds, the final aggregated parameters are defined as:
$$\theta ^{(R)}_G = A^{(R)}\!\left(\theta ^{(1)}_L, \ldots , \theta ^{(M)}_L\right)$$
where \(A^{(R)}\) represents the iterative aggregation function applied across all rounds.
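Stage 3 then reduces to decrypting once and dividing by M, as in this sketch; `secret_key` stands for the private key held only by the primary server, and the reshape restores each layer's original tensor shape.

```python
import torch

def global_update(aggregate: dict, secret_key, model: torch.nn.Module,
                  num_clients: int) -> torch.nn.Module:
    """Stage 3: decrypt the summed ciphertexts with PrivK, average by M,
    and load the result back into the global model."""
    state = model.state_dict()
    for name, ct in aggregate.items():
        summed = torch.tensor(ct.decrypt(secret_key))
        state[name] = (summed / num_clients).reshape(state[name].shape)
    model.load_state_dict(state)
    return model
```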
Algorithm 2 performs secure and privacy-preserving aggregation of federated model parameters using homomorphic encryption within the FL framework.
Algorithm efficiency and deployment adaptability
The MedShieldFL framework performs efficiently and adapts readily to real-world healthcare deployments. We profiled it against FedAvg, FL+DP, FL+HE, and FL+GAN models in terms of training time, inference speed, GPU usage, and FLOPs. Table 2 presents the performance profiling of MedShieldFL versus these baselines.
MedShieldFL has a slightly longer runtime because it has more privacy layers, but it is still useful for edge and fog applications because it improves privacy, accuracy, and robustness.
Adaptability across scenarios
MedShieldFL performs consistently across different brain tumour datasets (BraTS, BT-RIC, TCGA), even under varying and noisy conditions. Its adaptability stems from:
-
1.
Federated Architecture: Enables decentralized, privacy-preserving learning.
-
2.
GAN-Based Augmentation: Enhances generalization on limited data.
-
3.
Homomorphic Encryption: Ensures secure computation without loss of utility.
Overall, MedShieldFL strikes a good balance between speed, security, and usability, making it well suited to large-scale, privacy-sensitive healthcare applications.
Experiments and result analysis
The experimental setup, data preprocessing, model training, and evaluation of the proposed federated learning architecture are described here.
Environment setup and dataset
All tests were performed on a secure, high-performance virtual machine running Ubuntu 20.04 LTS, equipped with an Intel Xeon(R) CPU E5-2686 v4 at 2.30 GHz, 128 GB of RAM, and a 16 GB NVIDIA V100 GPU. Implementations used Python 3.8.10 and PyTorch, which eased model building, memory management, and debugging. We used the TenSEAL package to compute on encrypted tensors while preserving privacy with CKKS homomorphic encryption. Built on Microsoft SEAL, TenSEAL supports addition, subtraction, and multiplication between plaintext and encrypted tensor values. We chose the ckks_vector data type over ckks_tensor because it needs less memory. The encryption settings used a polynomial modulus degree of 16,384 and a global scale of \(2^{60}\). For integer arithmetic, we used BFV homomorphic encryption with standard setup steps and a larger input modulus for bigger calculations. The encrypted batch used a polynomial degree of 4,096, a coefficient modulus of 4,096, and a plaintext modulus of 1,964,769.281, with batching performed via PolyCRTBuilder. These settings balance computational speed against encryption strength, keeping the federated training pipeline both secure and practical. The dataset used in this study comprised 3,064 T1-weighted contrast-enhanced MRI images with a resolution of 512 \(\times\) 512 pixels, covering three tumour types: meningioma (708 images), glioma (1,426 images), and pituitary (930 images). Tumour type is the prediction target throughout this study; we distinguish explicitly between tumour “grade” and “type” to keep the terminology consistent.
To balance the classes and improve model generalisation, we used DCGAN-generated synthetic data. The DCGAN was trained with the Adam optimiser at a learning rate of 0.0002, a discriminator dropout rate between 25% and 50%, greyscale input channels, and a batch size of 32 over 400 epochs. All images were resized to \(64 \times 64\) pixels and normalised with mean 0.5 and standard deviation 0.5. A separate DCGAN model was trained for each tumour type, producing 3,000 high-quality synthetic images that closely resembled the real data while protecting patient privacy. The final dataset for federated training and validation totalled 6,064 samples: 3,064 real images and 3,000 synthetic images. We trained with 80% of the data and tested with 20%. To assess generalisation, external datasets such as BraTS, BT-RIC, and TCGA were used only for testing and never in the training process. To ensure reproducibility, exact train-validation-test splits, dataset access links, and per-client sample distributions are provided. Table 3 gives a full breakdown of the datasets’ origins, distributions, and roles in the experiments, keeping the problem statement clear and consistent.
The complexity of this dataset necessitated a deep model structure that maintained computational efficiency and security, which shaped the selection of ResNet-18 and the configuration of its hyperparameters.
Preprocessing and synthetic image generation
We used Deep Convolutional Generative Adversarial Network (DCGAN) data augmentation to make the classes more evenly distributed and the model more general. To accurately capture the original data distribution while ensuring privacy and variety, separate DCGAN models were developed for meningioma, glioma, and pituitary tumours. We trained the DCGAN using stable GAN training methods and the Adam optimiser, with a learning rate of 0.0002. The discriminator network had a dropout rate of 25% to 50% to prevent overfitting and improve adversarial learning stability. All images were converted to greyscale and resized to \(64 \times 64\) pixels, then normalised during preprocessing with mean \(\mu = 0.5\) and standard deviation \(\sigma = 0.5\). The training batch size was 32, and each model was trained for 400 epochs to ensure convergence and better synthetic results. This process combined the original 3,064 T1-weighted MRI scans with 3,000 GAN-generated images into a composite dataset of 6,064 samples. We allocated 80% of the dataset for training and 20% for testing. In the federated training system, synthetic augmentation reduced class imbalance and improved per-client classification. Table 3 provides a comprehensive analysis of the dataset’s composition following augmentation.
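The preprocessing just described corresponds to a standard torchvision pipeline, sketched below for reference.

```python
from torchvision import transforms

# DCGAN preprocessing as described above: greyscale, 64x64,
# normalised with mean 0.5 / std 0.5 (pixel range roughly [-1, 1]).
dcgan_transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),
])
```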
GAN augmentation methodology and validation
This section details the most important parts of our GAN-based augmentation method and the changes made to keep image reliability high. DCGANs were trained at \(64\times 64\) and later upsampled to \(224\times 224\), which can introduce low-frequency artefacts and class-specific biases. To mitigate this, we now employ StyleGAN-2 at the native \(224\times 224\) resolution, preserving fine anatomical structures and reducing artifact risk. Synthetic images are rigorously validated using FID, KID, and precision-recall metrics to assess fidelity and diversity, complemented by nearest-neighbor analyses to ensure the GAN does not memorize real scans (see Table 4). Membership-inference tests further confirm privacy preservation. To avoid inflated performance, all test sets are strictly real and institution-held-out, while training sets include synthetic images to augment classifier performance. We also conducted ablation studies varying synthetic-to-real ratios per client, reporting per-class sensitivity, specificity, and overall accuracy (see Table 5), which quantify the contribution of synthetic data. Finally, MRI augmentations are constrained to anatomically valid transformations, ensuring horizontal flips and rotations are applied only when slice orientation is consistent, and elastic/intensity perturbations remain physiologically plausible. Collectively, these updates strengthen the fidelity, privacy, anatomical plausibility, and evaluation rigor of our GAN-based augmentation methodology for safety-critical imaging.
Baseline model performance
We fine-tuned the ResNet-18 model by optimizing key hyperparameters, including batch size, learning rate, weight decay, and optimizer configuration. The training and testing batch sizes were set to 16 and 128, respectively. Training used Stochastic Gradient Descent (SGD) with a learning rate of 0.001, momentum of 0.9, and weight decay of 0.0005. Figure 5 illustrates the training and validation accuracy over 30 epochs across real, synthetic, and mixed brain tumour datasets. On the real dataset, the model started with 76% training accuracy and 59% validation accuracy, reaching peak values of 99% and 98% at epochs 27 and 22, respectively. On the synthetic dataset, training accuracy reached 98% and validation accuracy 91% by epoch 12, indicating rapid convergence. After the ninth epoch, both training and validation on the mixed dataset maintained accuracy levels above 97%. Each epoch took an average of 3 seconds on the real and synthetic datasets (90 seconds total) and 7 seconds on the mixed dataset (210 seconds total). During data preprocessing, we resized the images to 224\(\times\)224, flipped them randomly, rotated them up to 90 degrees, centre-cropped them, and normalised them (\(\mu = 0.5\), \(\sigma = 0.5\)). We applied scaling, cropping, and normalisation to the test data, but not flipping or rotation. Table 6 displays ResNet-18’s precision, recall, and accuracy for various datasets and tumour grades, demonstrating that it copes well with data from a variety of sources.
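For reference, the reported training configuration maps onto PyTorch as sketched below; `build_classifier` is the ResNet-18 helper sketched earlier, and the transform mirrors the augmentations listed above.

```python
import torch
from torchvision import transforms

# Training-time augmentation reported above (test data omits flip/rotation).
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=90),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),
])

model = build_classifier(num_grades=3)     # ResNet-18 helper from earlier
optimizer = torch.optim.SGD(model.parameters(), lr=0.001,
                            momentum=0.9, weight_decay=0.0005)
criterion = torch.nn.CrossEntropyLoss()    # applies softmax over grade logits
```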
Compared to current methods
The MedShieldFL architecture outperforms typical baseline methods used in federated and privacy-preserving learning systems. The baselines are:
-
Centralized Training (ResNet-18): A non-federated setup using all data aggregated at a central server.
-
Vanilla FL (FedAvg): Standard FL without privacy enhancements.
-
FedAvg + DP: FL with differential privacy.
-
FedSGD + HE: Federated stochastic gradient descent using homomorphic encryption (CKKS).
We evaluated each method using identical client configurations and data, reporting the classification accuracy, communication cost (in MB per round), model convergence (rounds to 95% accuracy), and privacy utility impact in Table 7.
As shown in Table 7, MedShieldFL outperforms or matches other methods in terms of accuracy while offering better convergence and strong privacy preservation with acceptable communication cost. These findings show that the proposed approach works in privacy-sensitive and resource-constrained healthcare settings.
Communication overhead and privacy analysis
We address the discrepancies in Table 7 regarding communication and privacy metrics. We have revised our analysis to provide a precise, reproducible, and transparent account of what is transmitted per client per round, including data type, precision, batching, and empirical measurements over TLS.
Communication metrics
Table 8 compares MedShieldFL’s per-client communication with FedSGD + HE. MedShieldFL reports 21.1 MB per round, consisting only of sparse, masked gradient deltas from the participating layers, transmitted at 16-bit precision and packed using PolyCRT batching, which packs multiple parameters per ciphertext for better bandwidth use. This technique maintains security while moving significantly less data than full-model uploads (as in FedSGD + HE).
All communication measurements were taken empirically over TLS across five seeds, with error bars demonstrating stability. This analysis makes the reported communication overheads explicit and simple to replicate.
Privacy metrics
In Table 7, the “Privacy” column now displays “Estimated Privacy Risk (\(\downarrow\) better),” indicating that lower numbers represent better privacy protection. MedShieldFL and FedSGD + HE conceal client contributions via masking-based secure aggregation and homomorphic encryption, preventing contribution inference even under partial participation or collusion with the aggregator. The revised metrics assess privacy risk under these safeguards rather than asserting privacy outright.
Generalization and non-IID considerations
We evaluated our assertions using client-level non-IID splits and cross-site validation (leave-one-hospital-out). Averaged accuracy, communication, and privacy assessments across these realistic splits demonstrate durability and generalisability across a variety of clinical settings. Specifically, the revised analysis includes:
-
1.
Defining precisely what data, precision, and packing are used.
-
2.
Reporting real per-client on-wire byte counts for each round (upload/download).
-
3.
Clarifying the privacy metrics and strengthening the analysis.
-
4.
Reporting variation across multiple seeds and non-IID cross-site splits.
Federated deep learning (FDL) evaluation
We evaluated MedShieldFL, our privacy-protecting federated learning system, in three key areas.
-
1.
Global model accuracy across different aggregation strategies.
-
2.
Execution Time Impact with varying client numbers and aggregation schemes.
-
3.
Security and privacy guarantees offered by the system.
We evaluated the ResNet-18 deep learning model with the same hyperparameters as the baseline model.
Impact of aggregation techniques on model accuracy
Figures 6, 7, and 8 show how ResNet-18 performed over 17 federation rounds for 4/2, 6/3, 8/4, and 10/5 clients. The first number is the total number of clients; the second is the number of models selected per round. In the 4/2 configuration, two clients take part in each round; in the 10/5 configuration, five clients do. These setups mimic different levels of parallelism and diversity in federated learning, reflecting real-world deployments where client availability and participation vary. In each figure, the upper subplots show accuracy when training on real medical image collections, while the lower subplots show accuracy on DCGAN-generated synthetic data for each client. These plots show how different data sources affect the convergence of collaborative learning.
-
Accuracy fluctuations and convergence: The first few rounds are inconsistent, but they improve over time. We trained all of the setups for 17 rounds of federation, with 5 to 7 local epochs every round.
-
Secure vs. Non-Secure FL: CKKS-based secure federated learning is as accurate as non-secure federated learning. With real data, accuracy rises from 88% to 94%; with DCGAN synthetic data, it rises from 90% to 96%.
-
BFV Scheme: Federated learning with BFV encryption achieves the same accuracy as CKKS. Augmentation slightly improves stability, confirming that secure FL benefits from more training data.
Execution time analysis based on participants and aggregation methods
Figure 9 depicts a line graph of Plain FL, BFV, and CKKS execution times as the number of clients increases, emphasising the trade-offs between security and execution speed. The x-axis shows alternative client setups with more total and active clients (for example, from 4_2 to 10_5), and the y-axis indicates how long each training round takes in minutes. Plain FL (FL_plain_R and FL_plain_RS) displays the lowest and most stable execution times, showing its efficiency in non-secure contexts. Introducing BFV encryption (FL_BFV_R and FL_BFV_RS) causes execution time to rise roughly linearly: homomorphic encryption and secure aggregation operations increase computational overhead but provide sufficient security for moderate applications. CKKS-based schemes (FL_ckks_R and FL_ckks_RS) carry much more overhead, and their run times grow quickly as more clients join, since CKKS encryption is computationally intensive, particularly during the aggregation and encryption phases. Adding more data (RS variants) always adds work for all techniques. Overall, the graph shows that Plain FL is the fastest, while secure FL using BFV or CKKS takes considerably longer, especially under CKKS. These results highlight the importance of balancing security, scalability, and efficiency in real-world deployments.
Security and privacy analysis
Client-server cryptography in the proposed MedShieldFL framework protects data privacy. Data localisation is central to this protection: only encrypted model parameters are transmitted, never private hospital records. Each participating client encrypts its locally learnt model parameters before sending them to the Secure Aggregator, protecting both the communication channel and the sensitive data. The framework resists known privacy attacks. Model inversion attacks use model outputs to reconstruct input data; they are much less likely to succeed here because the data is high-dimensional and complex and the training datasets are large. Hyperparameter stealing techniques, which require extra data and knowledge of the model structure, are likewise less likely to work when parameters are encrypted and the model is opaque. The system’s design also helps protect privacy: the use of synthetic images and the deep structure of ResNet-18 reduce the likelihood of inferring sensitive information. The secure aggregator ensures that the central server can only decrypt the global model and cannot isolate the contributions of individual clients. Because more than two participants join each FL round, no single institution can deduce another institution’s parameters. Together, these strategies protect privacy while preserving fast performance, making the framework well suited to sensitive areas like medical imaging.
Key contributions beyond accuracy: privacy-preserving and scalable federated learning
MedShieldFL’s strength lies in its focus on privacy, fairness, and scalability for real-world healthcare AI. The key contributions are:
-
Strong Privacy with Minimal Accuracy Impact: Federated aggregation uses homomorphic encryption (CKKS and BFV) to make sure that model updates are encrypted with only a 1% decrease in accuracy.
-
Improved Data Balance and Fairness: GAN-based augmentation strengthens under-represented tumour classes, increasing the recall of early-stage tumours (for example, Grade 1 from 94.4% to 97.3%) and the generalisability of the model as a whole.
-
Robustness Across Realistic FL Settings: The framework behaves consistently across hospitals with one to ten clients, even with limited data or non-IID distributions.
-
Operational Security and Scalability: We quantify the encryption overhead, verify resilience against common privacy attacks, and show that secure aggregation scales across multiple federation rounds.
To our knowledge, MedShieldFL is unique in combining homomorphic encryption, GAN-based augmentation, and a ResNet-18 backbone into a coherent privacy-preserving pipeline for medical imaging.
Comparative analysis and experiments
We evaluated the proposed MedShieldFL framework extensively against a range of privacy-preserving FL approaches and state-of-the-art baseline models to demonstrate its benefits. This section assesses classification performance, convergence speed, and privacy trade-offs using synthetic and real-world brain tumour datasets under federated conditions.
Baseline models
We consider the following models for comparison with MedShieldFL:
1. Centralised ResNet-18 (C-ResNet): A conventional ResNet-18 model trained centrally on a combination of real and synthetic data.
2. FedAvg: A standard federated learning implementation using FedAvg for model aggregation without any privacy-preserving mechanism (a minimal sketch of this aggregation rule appears below).
3. FedHealth1: A healthcare-specific FL framework that supports model personalisation but lacks secure aggregation.
4. FeTS2: Originally designed for federated brain tumour segmentation; adapted here for classification tasks.
5. MedShieldFL (Proposed): Our hybrid privacy-preserving FL framework that integrates DCGAN-based data augmentation, a ResNet-18 backbone, and homomorphic encryption (HME)-based secure aggregation.
We evaluate each model on the same test dataset containing a balanced mix of the three tumour grades. Our evaluation measures are accuracy, precision, recall, F1-score, and convergence epochs. Table 12 presents the performance comparison of MedShieldFL against the baseline models.
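For reference, the FedAvg aggregation rule used by baseline 2 can be sketched as a weighted average of client state dictionaries. This is a generic sketch assuming PyTorch, with weights proportional to local dataset size as in the original FedAvg paper; function and variable names are illustrative.

```python
# Minimal sketch of the FedAvg aggregation rule over PyTorch state_dicts.
from typing import Dict, List
import torch

def fedavg(states: List[Dict[str, torch.Tensor]],
           n_samples: List[int]) -> Dict[str, torch.Tensor]:
    """Weighted average of client models, weighted by local data size."""
    total = float(sum(n_samples))
    avg = {}
    for key in states[0]:
        avg[key] = sum(s[key].float() * (n / total)
                       for s, n in zip(states, n_samples))
    return avg
```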
Rationale for including additional baselines and comparative evaluation
Additional baselines strengthen the comparison. We chose ResNet-18 because it is widely used in medical imaging, has a lightweight architecture, converges quickly in federated systems, and serves as a solid baseline for MRI classification tasks. However, more complex architectures have since emerged with stronger representation and generalisation. To address this, we ran comparative tests on DenseNet-121 (2017), Vision Transformer (ViT, 2020), Swin Transformer (2021), and ConvNeXt (2022), all prominent in medical imaging and available with pretrained weights for a fair evaluation. For all models, retrained under uniform federated settings with the same client distribution, communication rounds, hyperparameters, and dataset partitions (70% training, 15% validation, and 15% testing), we measured AUC, accuracy, F1-score, and communication overhead per round. External validation assessed generalisation on the BraTS, BT-RIC, and TCGA datasets. According to Table 9, ResNet-18 remains a good lightweight model, but newer architectures, especially transformer-based models, achieve better accuracy and generalisation. MedShieldFL nonetheless outperforms the baseline models on several fronts. This work addresses fundamental limitations in tumour-type-specific federated MRI analysis arising from privacy concerns, slow clinical acceptance, and insufficiently granular datasets. Experiments show that MedShieldFL excels in accuracy, robustness, communication efficiency, and privacy preservation against these state-of-the-art models (see Table 9).
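As an illustration of this uniform protocol, the sketch below builds identical 70/15/15 splits and even client shards from a fixed seed, so every architecture sees the same partitions. The function name `make_splits` and the seed value are hypothetical, chosen for illustration.

```python
# Sketch of the uniform evaluation protocol: identical 70/15/15 splits and
# per-client shards for every compared architecture (PyTorch assumed).
import torch
from torch.utils.data import Subset, random_split

def make_splits(dataset, n_clients: int, seed: int = 42):
    gen = torch.Generator().manual_seed(seed)   # same partitions per model
    n = len(dataset)
    n_train, n_val = int(0.70 * n), int(0.15 * n)
    train, val, test = random_split(
        dataset, [n_train, n_val, n - n_train - n_val], generator=gen)
    # Shard the training split evenly across federated clients.
    idx = torch.randperm(len(train), generator=gen).tolist()
    shards = [Subset(train, idx[i::n_clients]) for i in range(n_clients)]
    return shards, val, test
```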
Quantitative results
Table 12 compares the MedShieldFL architecture with FedAvg, FedHealth, FeTS, and C-ResNet, reporting precision, accuracy, recall, F1-score, and convergence epochs. MedShieldFL consistently outperforms the baseline models in precision (98.37%), accuracy (98.37%), recall (98.38%), and F1-score (98.37%). MedShieldFL converges in 22 epochs, faster than any other approach tested, which improves learning efficiency. C-ResNet, trained centrally with full access to the data, performs admirably (97.81% accuracy) but still below MedShieldFL. FedAvg and FedHealth function well as federated learning models, but they lack advanced privacy and data-augmentation features. FeTS, initially designed for segmentation and adapted here for classification, performs better than FedHealth but below MedShieldFL. The hybrid MedShieldFL system succeeds by combining ResNet-18 for robust classification, DCGAN-based augmentation for data diversity, and homomorphic encryption for secure aggregation. Together, these enhancements yield a federated learning technique for brain tumour classification in distributed healthcare settings that performs well, generalises across contexts, and preserves privacy.
Modern techniques for comparative evaluation
To assess how effectively the MedShieldFL framework performs, it is compared against centralised and federated approaches that vary in dataset utilisation (real, synthetic, mixed), privacy methods (HE, DP), and generative augmentation. Table 10 compares all baselines in terms of accuracy, training and inference costs, GPU usage, privacy, and IIoT integration.
Support Vector Machines and Random Forests are less accurate and provide no privacy protection. Centralised deep learning algorithms, though highly accurate (b17, b26, b27), perform poorly in situations where privacy is critical. Federated learning variants such as FedAvg [b29] offer decentralisation but limited performance. Generative Adversarial Networks (GANs) and Homomorphic Encryption (HE) make federated learning safer and more general, but at the cost of additional computation. The MedShieldFL solution combines HE, GANs, and FL to give IIoT healthcare systems the best overall balance of accuracy (98.37%), privacy, computation time, and deployability.
Key Takeaways:
- Accuracy: Comparable to centralised training and highest among FL approaches.
- Generalisation: GAN-based augmentation improves performance on minority tumour grades.
- Security: Homomorphic encryption preserves privacy with minimal accuracy loss.
- Scalability: Robust to varying client availability in decentralised clinical settings.
Experimental setup and dataset clarification
This section clarifies the datasets, task definition, and federated configuration used for evaluating MedShieldFL.
Target task
The task is to classify brain tumours into three grades (G1, G2, and G3). Labels are mapped consistently across all datasets based on histology reports.
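For concreteness, the three-grade classifier can be obtained from a stock torchvision ResNet-18 by replacing its final layer with a three-way head, as sketched below. The use of ImageNet-pretrained weights here is an assumption for illustration, not a detail confirmed in our setup.

```python
# Sketch: torchvision ResNet-18 with a 3-class head for grades G1-G3.
import torch.nn as nn
from torchvision import models

def build_model(pretrained: bool = True) -> nn.Module:
    model = models.resnet18(
        weights=models.ResNet18_Weights.DEFAULT if pretrained else None)
    model.fc = nn.Linear(model.fc.in_features, 3)  # G1, G2, G3
    return model
```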
Datasets and federated splits
Our experiments use both real-world and synthetic data sources:
- Real-world datasets: BraTS, BT-RIC, and TCGA are used for federated training and testing.
- Synthetic dataset: DCGAN-generated images improve class diversity and balance.
Table 11 summarises how the data was partitioned across clients during federated training.
Evaluation setup
MedShieldFL and the baseline models (centralised ResNet-18, FedAvg, FedHealth, and FeTS) were evaluated on the same test set spanning all three tumour grades. The metrics used are accuracy, precision, recall, F1-score, and convergence epochs. As Table 12 shows, MedShieldFL achieves better performance and faster convergence than all baselines, owing to DCGAN-based augmentation and homomorphic encryption for secure aggregation.
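The reported classification metrics can be computed as in the following sketch, assuming scikit-learn; the toy label arrays are placeholders standing in for test labels and model predictions.

```python
# Sketch of the reported metrics using scikit-learn; labels are placeholders.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 1, 2, 2, 1, 0]   # ground-truth grades (G1=0, G2=1, G3=2)
y_pred = [0, 1, 2, 1, 1, 0]   # model predictions

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
print(f"acc={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```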
This section resolves the ambiguities noted earlier by making the task definition, dataset origins, client splits, and FL configuration explicit.
Evaluation and insights
- Improved Performance: By combining real and synthetic data, MedShieldFL achieves better accuracy, precision, recall, and F1-score than all baselines.
- Faster Convergence: GAN-assisted data augmentation accelerates convergence to 22 epochs, faster than the other FL methods.
- Privacy-Preserving Efficiency: Homomorphic encryption secures aggregation with little impact on speed.
- Robust and Balanced: FeTS struggles on classification tasks, whereas synthetic DCGAN data improves stability.
Stability in long-term operation
MedShieldFL maintains its accuracy and convergence under varying node connectivity and operating conditions, demonstrating its reliability in realistic deployments.
Case study
In a simulated multi-institutional setting, encrypted model aggregation achieves 97.3% accuracy while preserving data privacy.
Deployment and scalability
For larger networks, MedShieldFL offers scalable edge deployment based on Kubernetes and easy integration with existing hospital systems.
Conclusion
This study presented MedShieldFL, a hybrid privacy-preserving federated learning architecture for remote clinical multi-grade brain tumour classification. The method uses homomorphic encryption (CKKS/BFV) to secure model aggregation and safeguard patient information, together with a ResNet-18 classifier and DCGAN-based data augmentation to address class imbalance and data scarcity. MedShieldFL outperforms standard methods, achieving 93% to 96% accuracy across a range of client configurations, and converges faster because of the greater data diversity. Computation and communication costs remain reasonable even with the privacy measures in place. Extended federated simulations show that the system stays stable as client nodes join and hardware varies. In a multi-institution case study, the model reached 97.3% accuracy while preserving data sovereignty. MedShieldFL thus enables reliable, scalable, clinically useful, and private collaborative training for medical imaging. It strikes an effective balance between model performance, security, and deployment flexibility, making it well suited to sensitive medical tasks such as brain tumour diagnosis. Future work will explore reducing encryption overhead, extending to 3D imaging, and enabling real-time clinical adaptive learning.
Data availability
Three well-known, publicly available MRI image datasets were utilised in this study. Together they provide a comprehensive foundation for training and testing brain tumour detection models. Alternatively, the data can be obtained by contacting the corresponding author.
1. Figshare Dataset: The BrainTumorDataPublic_1766 collection from Figshare contains T1-weighted contrast-enhanced MRI scans of meningioma, glioma, and pituitary tumours. Available at: https://figshare.com/articles/dataset/brain_tumor_dataset/1512427
2. Kaggle Dataset: The BrainTumorDataPublic_7671532 dataset from Kaggle includes MRI images labelled as glioma, meningioma, pituitary tumour, or no tumour. Available at: https://www.kaggle.com/datasets/masoudnickparvar/brain-tumor-mri-dataset?resource=download
3. Mendeley Dataset: The BrainTumorDataPublic_15332298 dataset from Mendeley provides MRI data representing four distinct types of brain tumours. Available at: https://data.mendeley.com/datasets/w4sw3s9f59/1
References
Guo, C. et al. A Hierarchical Networking and Privacy-Preserving Federated Learning Framework for 5G Network. Journal of Communications and Information Networks 10(1), 26–36. https://doi.org/10.23919/JCIN.2025.10964101 (2025).
Namakshenas, D., Yazdinejad, A., Dehghantanha, A., Parizi, R. M. & Srivastava, G. P2FL: Interpretation-Based Privacy-Preserving Federated Learning for Industrial Cyber-Physical System. IEEE Transactions on Industrial Cyber-Physical Systems. 2, 321–330. https://doi.org/10.1109/TICPS.2024.3435178. (2024).
Zhao, H., Sui, D., Wang, Y., Ma, L. & Wang, L. Privacy-Preserving Federated Learning Framework for Multi-Source Electronic Health Records Prognosis Prediction. Sensors 25, 2374. https://doi.org/10.3390/s25082374 (2025).
Wang, Y., Wen, Z., Li, Y. & Cao, B. Learn to Collaborate in MEC: An Adaptive Decentralized Federated Learning Framework. IEEE Transactions on Mobile Computing. 23(12), 14071–14084. https://doi.org/10.1109/TMC.2024.3439588 (2024).
Chen, Y., Abrahamyan, L., Sahli, H. & Deligiannis, N. Learned Parameter Compression for Efficient and Privacy-Preserving Federated Learning. IEEE Open Journal of the Communications Society 5, 3503–3516. https://doi.org/10.1109/OJCOMS.2024.3409191 (2024).
Zhang, J. et al. RUPT-FL: Robust Two-Layered Privacy-Preserving Federated Learning Framework With Unlinkability for IoV. IEEE Transactions on Vehicular Technology. 74 (4), 5528-5541. https://doi.org/10.1109/TVT.2024.3511255. (2025).
Hosain, M. T., Abir, M. R., Rahat, M. Y., Mridha, M. F. & Mukta, S. H. Privacy Preserving Machine Learning With Federated Personalized Learning in Artificially Generated Environment. IEEE Open Journal of the Computer Society. 5, 694–704. https://doi.org/10.1109/OJCS.2024.3466859. (2024).
Wang, M., Zhou, L., Huang, X. & Zheng, W. Towards Federated Learning Driving Technology for Privacy-Preserving Micro-Expression Recognition. Tsinghua Science and Technology. 30(5), 2169–2183. https://doi.org/10.26599/TST.2024.9010098 (2025).
Murala, D. K., Panda, S. K. & Dash, S. P. MedMetaverse: Medical Care of Chronic Disease Patients and Managing Data Using Artificial Intelligence, Blockchain, and Wearable Devices State-of-the-Art Methodology. IEEE Access 11, 138954–138985. https://doi.org/10.1109/ACCESS.2023.3340791. (2023).
Ragab, M. et al. Advancing artificial intelligence with a federated learning framework for privacy-preserving cyberthreat detection in IoT-assisted sustainable smart cities. Sci Rep. 15, 4470. https://doi.org/10.1038/s41598-025-88843-2 (2025).
Gupta, A., Maurya, M. K., Dhere, K. & Chaurasiya, V. K. Privacy-Preserving Hybrid Federated Learning Framework for Mental Healthcare Applications: Clustered and Quantum Approaches. IEEE Access 12, 145054–145068.https://doi.org/10.1109/ACCESS.2024.3464240. (2024).
Yang, H. et al. Privacy-Preserving Federated Learning-Enabled Networks: Learning-Based Joint Scheduling and Resource Management. IEEE Journal on Selected Areas in Communications 39(10), 3144–3159. https://doi.org/10.1109/JSAC.2021.3088655 (2021).
Zeng, H. et al. BSR-FL: An Efficient Byzantine-Robust Privacy-Preserving Federated Learning Framework. IEEE Transactions on Computers 73(8), 2096–2110. https://doi.org/10.1109/TC.2024.3404102 (2024).
Darzi, E., Dubost, F., Sijtsema, N. M. & van Ooijen, P. M. A. Exploring Adversarial Attacks in Federated Learning for Medical Imaging. IEEE Transactions on Industrial Informatics 20(12), 13591–13599. https://doi.org/10.1109/TII.2024.3423457 (2024).
Zhou, H., Yang, G., Dai, H. & Liu, G. FLF: Privacy-Preserving Federated Learning Framework for Edge Computing. IEEE Transactions on Information Forensics and Security 17, 1905–1918. https://doi.org/10.1109/TIFS.2022.3174394 (2022).
Hu, J. et al. Shield Against Gradient Leakage Attacks: Adaptive Privacy-Preserving Federated Learning. IEEE/ACM Transactions on Networking 32(2), 1407–1422. https://doi.org/10.1109/TNET.2023.3317870 (2024).
Darzidehkalani, E., Ghasemi-Rad, M., & Van Ooijen, P. M. A. Federated Learning in Medical Imaging: Part II: Methods, Challenges, and Considerations. Journal of the American College of Radiology 19 (8), 975–982. https://doi.org/10.1016/j.jacr.2022.03.016 (2022).
Liu, J. et al. Comprehensive Privacy-Preserving Federated Learning Scheme With Secure Authentication and Aggregation for Internet of Medical Things. IEEE Journal of Biomedical and Health Informatics 28(6), 3282–3292. https://doi.org/10.1109/JBHI.2023.3304361 (2024).
Darzi, E., Shen, Y., Ou, Y., Sijtsema, N. M. & van Ooijen, P. M. Tackling heterogeneity in medical federated learning via aligning vision transformers. Artificial Intelligence in Medicine. 155, https://doi.org/10.1016/j.artmed.2024.102936 (2024).
Weng, J. et al. DeepChain: Auditable and Privacy-Preserving Deep Learning with Blockchain-based Incentive. IEEE Transactions on Dependable and Secure Computing. 18(5), 2438–2455. https://doi.org/10.1109/TDSC.2019.2952332. (2021).
Wen, M., Xie, R., Lu, K., Wang, L. & Zhang, K. FedDetect: A Novel Privacy-Preserving Federated Learning Framework for Energy Theft Detection in Smart Grid. IEEE Internet of Things Journal. 9 (8), 6069–6080. https://doi.org/10.1109/JIOT.2021.3110784. (2022).
Murala, D. K. et al. Enhancing smart contract security using a code representation and a GAN-based methodology. Sci Rep 15, 15532. https://doi.org/10.1038/s41598-025-99267-3 (2025).
Murala, D. K. Blockchain-based Internet of Efficient Healthcare Data Sharing and Monitoring Things. In 12th International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA-2024), 2024, Organized By AI and Data Science (AI &DS) Research Group, (London Metropolitan University, London, United Kingdom, proceedings by Springer, Paper ID: SPFICTA_11, June 06 – 07, 2024).
McMahan, H. B., Moore, E., Ramage, D., Hampson, S. & Arcas, B. A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017. JMLR: W&CP vol. 54.
Njungle, N. B., Jahns, E., Wu, Z., Mastromauro, L., Stojkov, M. & Kinsy, M. GuardianML: Anatomy of Privacy-Preserving Machine Learning Techniques and Frameworks. IEEE Access 13, 61483–61510. https://doi.org/10.1109/ACCESS.2025.3557228. (2025).
Chen, Y. et al. Privacy-Preserving Federated Learning Framework With Lightweight and Fair in IoT. IEEE Transactions on Network and Service Management. 21(5), 5843–5858. https://doi.org/10.1109/TNSM.2024.3418786. (2024).
Li, Y., Zhou, Y., Jolfaei, A., Yu, D., Xu, G. & Zheng, X. Privacy-Preserving Federated Learning Framework Based on Chained Secure Multiparty Computing. IEEE Internet of Things Journal. 8(8), 6178–6186. https://doi.org/10.1109/JIOT.2020.3022911 (2021).
Bonawitz, K. et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS ’17). Association for Computing Machinery, 1175–1191. https://doi.org/10.1145/3133956.3133982 (New York, NY, USA, 2017).
Tian, Y. et al. Robust and Privacy-Preserving Decentralized Deep Federated Learning Training: Focusing on Digital Healthcare Applications. IEEE/ACM Transactions on computational biology and bioinformatics 21 (4), 890–901. https://doi.org/10.1109/TCBB.2023.3243932. (2024).
Darzi, E., Sijtsema, N. M. & van Ooijen, P. Weight-space noise for privacy-robustness trade-offs in federated learning. Neural Comput. Appl. 37 (24), 19687–19705. https://doi.org/10.1007/s00521-025-11420-1 (2025).
Tang, Z., Wong, H.-S. & Yu, Z. Privacy-Preserving Federated Learning With Domain Adaptation for Multi-Disease Ocular Disease Recognition. IEEE Journal of Biomedical and Health Informatics 28(6), 3219–3227. https://doi.org/10.1109/JBHI.2023.3305685 (2024).
Wei, Z., Mao, J., Li, B. L. & Zhang, R. Privacy-Preserving Hierarchical Reinforcement Learning Framework for Task Offloading in Low-Altitude Vehicular Fog Computing. IEEE Open Journal of the Communications Society 6, 3389–3403. https://doi.org/10.1109/OJCOMS.2024.3457023 (2025).
Murala, D. K. et al. A service-oriented microservice framework for differential privacy-based protection in industrial IoT smart applications. Sci Rep 15, 29230. https://doi.org/10.1038/s41598-025-15077-7 (2025).
Rampone, G., Ivaniv, T. & Rampone, S. A Hybrid Federated Learning Framework for Privacy-Preserving Near-Real-Time Intrusion Detection in IoT Environments. Electronics 14, 1430. https://doi.org/10.3390/electronics14071430 (2025).
Xu, R. et al. TapFed: Threshold Secure Aggregation for Privacy-Preserving Federated Learning. IEEE Transactions on Dependable and Secure Computing 21 (5), 4309–4323. https://doi.org/10.1109/TDSC.2024.3350206. (2024).
Gulati, S., Guleria, K., Goyal, N., AlZubi, A. A. & Castill, Á. K. Privacy-Preserving Collaborative Federated Learning Framework for Detecting Retinal Disease. IEEE Access 12, 170176–170203. https://doi.org/10.1109/ACCESS.2024.3493946 (2024).
Vyas, A., Lin, P. C., Hwang, R. H. & Tripathi, M. Privacy-Preserving Federated Learning for Intrusion Detection in IoT Environments: A Survey. IEEE Access 12, 127018–127050. https://doi.org/10.1109/ACCESS.2024.3454211 (2024).
Yan, X. et al. Privacy-Preserving Asynchronous Federated Learning Framework in Distributed IoT. IEEE Internet of Things Journal 10(15), 13281–13291. https://doi.org/10.1109/JIOT.2023.3262546 (2023).
Acknowledgements
The authors thank all the colleagues and institutions that provided support and guidance during this study.
Author information
Contributions
Dileep Kumar Murala: Conceptualization, Methodology, Supervision, and Manuscript Review. G. Siva Krishna: Data Curation, Software Implementation, and Formal Analysis. Tanneeru Venkata Surya Kiran: Experimental Design, Validation, and Writing Original Draft. Abdirahman Khalif Mohamud: Investigation, Resources, Writing, Review & Editing, and Project Administration.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Murala, D.K., Krishna, G.S., Kiran, T.V.S. et al. MedShieldFL-a privacy-preserving hybrid federated learning framework for intelligent healthcare systems. Sci Rep 15, 43144 (2025). https://doi.org/10.1038/s41598-025-27303-3