Abstract
Modern transportation relies heavily on vehicular cyber-physical systems (VCPS), which enhance intelligent transportation systems (ITS) with capabilities such as real-time traffic control and self-driving vehicles. These systems allow vehicles, roadside units (RSUs), and base stations to connect seamlessly, but because the data they handle is sensitive, they also raise serious security issues. Intrusions can endanger public safety, cause monetary losses, and erode confidence in these vital services. To address these issues, this paper presents a novel federated learning design intended to improve data security in VCPS. Federated learning keeps raw data private by enabling decentralised model training within individual vehicles or RSUs. To further protect privacy when local models are aggregated into a global one, the framework uses the Laplace mechanism to add noise to model updates. RSUs, vehicles, and a centralised server collaborate in the secure framework to prevent information leaks during communication and model training. When evaluated on the CICIDS2017 dataset, the suggested method outperforms conventional cryptographic techniques, preserving high levels of confidentiality and security without sacrificing computing speed or model accuracy. As VCPS evolve, such security measures will be essential to maintaining the integrity and dependability of transportation systems, ultimately improving transportation safety and efficiency.
Introduction
Vehicular cyber-physical systems (CPS) combine physical and computational resources that communicate with one another and with the outside world in order to enhance entertainment, reliability, security, and effectiveness in transportation. Vehicles in a vehicular CPS can analyse traffic data on site and exchange accurate data with other vehicles via vehicular networks. Recent technological breakthroughs in areas such as mobile sensing, 5G, and wireless connectivity have made intelligent video content protection systems possible. Vehicular CPSs usually produce many different types of data, both structured and unstructured. Limited computation and caching resources hinder vehicles and RSUs from using the available data effectively. Multi-access edge computing (MEC) is a viable model for tackling these problems [1,2].
By performing computation in vehicles or RSUs within a vehicular CPS, MEC supports increasingly computation-intensive and time-sensitive applications such as autonomous transportation, resource allocation [3,4], and computation offloading [5]. Artificial intelligence (AI) is a powerful tool capable of making precise forecasts and handling challenging problems, and AI algorithms are frequently used in vehicular CPS. In particular, AI algorithms for information sharing and content caching have helped to lower costs and improve the utility of vehicular CPS. For example, in [6], deep reinforcement learning was used to jointly optimise cache and compute resources, improving system utility. The deep-reinforcement-learning-based technique in [5] maximised the system utility of device-to-device content caching.
However, the majority of existing efforts concentrate on information sharing and resource allocation. Vehicular CPS data may contain highly confidential information, such as the operating condition of vehicles. Vehicles and RSUs that handle and cache this data as edge servers are susceptible to security breaches. The unauthorised disclosure of information to unapproved parties is referred to as data leakage [7]. If RSUs or vehicles are compromised or turn hostile, a significant data leak could cause accidents and substantial financial loss. Content caching and data exchange across vehicles may further increase the risk of data leakage. In vehicular CPS, many questions remain open regarding data leak detection, security threat mitigation, and security modelling.
Fig. 1 shows the communication model created to detect data leaks in VCPS. In the suggested approach, RSUs cache resources from vehicles, other RSUs, and the base station. Vehicle-to-BS (V2B), vehicle-to-road (V2R), and vehicle-to-vehicle (V2V) communication provide information for content caching and data exchange. Two integrated elements of the suggested model, intelligent data transformation and cooperative data leakage identification across RSUs and vehicles, address the aforementioned issue [8,9]. The primary function of adaptive data transformation is to translate raw data into transformed formats in order to reduce the possibility of data leakage during subsequent steps such as content caching and data exchange. Because many vehicles often supply the desired data in application scenarios such as vehicular data sharing, collaborative mapping across multiple vehicles is necessary to guarantee the usability of the data after processing. Here, we use federated learning to translate raw data from several parties into learnt data models [10,11], thereby learning the data features across multiple parties. To further enhance confidentiality, the learnt model carries the valid information needed by other activities, such as resource allocation, without disclosing the raw data [12].
The components are described in detail below:
1. Vehicle-to-vehicle transmission: Vehicle-to-vehicle (V2V) communication allows vehicles to interact with one another [13]. Every vehicle gathers and analyses its own local data, builds a local model, and exchanges only model updates with other vehicles and roadside units.
2. V2R and R2R communication via roadside units: RSUs serve as facilitators, exchanging and caching data obtained from vehicles. RSUs use roadside-to-roadside (R2R) connectivity to exchange data with one another and to acquire data from other vehicles. After aggregating the local model updates obtained from vehicles, RSUs deliver the combined data to the central base station.
3. R2B communication with the base station: Roadside-to-base-station (R2B) communication allows the base station to acquire the information collected from several RSUs. The base station combines these updates to produce a global model, which is then returned to the vehicles and RSUs for further improvement.
Motivation
By integrating physical and computational capabilities, vehicular CPS are leading the way in revolutionising transportation, improving overall efficiency, safety, trustworthiness, and entertainment value. Through vehicular networks and real-time traffic analysis, these systems greatly enhance congestion control and safety by allowing vehicles to exchange precise information with one another. Recent technological breakthroughs in mobile sensing, 5G, and wireless communication have further boosted vehicular CPS, opening the door to complex applications such as intelligent video content security systems [14].
However, the massive volumes of structured and unstructured data that vehicular CPS generate pose many obstacles. Vehicles and RSUs with restricted computing and caching capacities find it difficult to make effective use of the available data [15]. To overcome these obstacles, MEC shows promise as a model for computing in vehicles or RSUs. MEC supports time-sensitive and computationally demanding applications such as resource allocation, computation offloading, and autonomous transportation. By handling complicated problems and producing accurate predictions, AI significantly improves vehicular CPS, especially in content caching and information exchange.
Problem definition
Although vehicular CPS have made great strides and offer many advantages, important security issues must be resolved. These systems frequently handle highly sensitive data, including vehicle operating parameters. As edge servers, RSUs and vehicles can suffer security breaches that result in data leakage. Disclosure of sensitive material to unauthorised parties may have serious repercussions, such as accidents and large financial losses [16].
Content caching and data exchange between vehicles increase the danger of data leakage. A breach or hostile takeover of an RSU or vehicle could cause a large data leak [17]. Most current studies concentrate on resource allocation and information exchange [18,19], leaving significant gaps in our understanding of, and ability to mitigate, security risks, data leakage, and security modelling within vehicular CPS. This study therefore seeks to bridge these gaps by examining efficient techniques for data leakage detection, security threat prevention, and thorough security modelling in vehicular CPS [20]. In doing so, we hope to improve these systems' overall resilience and security, ensuring their safe and dependable operation.
Contributions
The main contributions are summarized as follows:
1. With a focus on data privacy and security, we present a comprehensive federated learning architecture designed for vehicular cyber-physical systems. The architecture keeps private information local while allowing each participant to benefit from collective model updates.
2. Local models are trained inside each vehicle using locally acquired data. This step is essential for keeping data private because raw data never leaves the vehicle. Updates to the local models are securely distributed and combined to improve the overall model.
3. We propose a technique for combining locally updated models from different vehicles. This aggregation improves the global model by capturing what has been learned by all participating vehicles, while protecting the privacy of each vehicle's data.
4. We present the FedBuff (Federated Learning with Buffered Aggregation) architecture, which buffers local model updates before forwarding them to the central server. This buffered aggregation improves the effectiveness and adaptability of model training.
5. The suggested framework is evaluated on the CICIDS2017 dataset. Based on performance parameters such as hash code generation time and encryption/decryption times, our methodology is shown to be more efficient and secure than other approaches such as MD5, SHA-2, RSA, and DES.
Literature review for federated learning techniques for preserving security of data in vehicular cyber-physical systems
VCPS integrate physical and computational resources in order to improve entertainment [21], effectiveness, and security in transportation. They communicate over vehicular networks and process traffic data locally. Developments in 5G, wireless networking, and remote sensors have accelerated the adoption of sophisticated VCPS. These systems produce enormous volumes of both structured and unstructured data, which presents problems for vehicles and roadside units (RSUs) because of their constrained computing and caching capabilities. Related approaches include differential privacy applied during training, such as output and objective perturbation (e.g., [22,23]); bit-choosing algorithms for aggregated data that safeguard user confidentiality (e.g., [24]); and allocation of encryption resources according to implementation time and security weight (e.g., [25]). Table 1 shows the comparative analysis:
Preliminaries
Federated learning
In FL problems, data from numerous remote devices is used to train a single global statistical model. Figure 2 illustrates a conventional FL design. The objective of FL is to train the model on the data collected by each device within the limits of local caching, and to update the model parameters regularly so that they can be exchanged with the cloud parameter server. In other words, the goal is to minimise the weighted average training loss over all clients, which is represented by the objective function in Eq. 1:
$$\begin{aligned} \min _{d} \sum _{m=1}^{k} q_m G_m(d) \end{aligned}$$(1)
in which k is the total number of devices participating in training the model; \(q_m\) is the relative weight of impact assigned to the mth device; and \(G_m(d)\) is the local objective function of the mth device, defined in Eq. 2:
$$\begin{aligned} G_m(d) = \frac{1}{b_m} \sum _{a=1}^{b_m} g_a(d, r_a, w_a) \end{aligned}$$(2)
in which \(b_m\) denotes the volume of data on the mth device and \(g_a(d, r_a, w_a)\) denotes the loss of the model with parameter d on the instance \((r_a, w_a)\) of the mth device's local dataset.
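As an illustrative sketch only (not the authors' MATLAB implementation), the weighted objective described above can be evaluated as follows. The squared-error loss, the one-dimensional parameter d, the toy data, and the choice of weights proportional to each device's sample count are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # one instance (r_a, w_a); symbols follow the text


def squared_loss(d: float, r: float, w: float) -> float:
    """Hypothetical per-instance loss g_a: squared error of a 1-D linear model."""
    return (d * r - w) ** 2


def local_objective(d: float, data: Sequence[Sample],
                    loss: Callable[[float, float, float], float] = squared_loss) -> float:
    """G_m(d): average loss over the b_m instances held by one device."""
    return sum(loss(d, r, w) for r, w in data) / len(data)


def global_objective(d: float, devices: Sequence[Sequence[Sample]],
                     weights: Sequence[float]) -> float:
    """Weighted sum of the local objectives over the k participating devices."""
    return sum(q * local_objective(d, data) for q, data in zip(weights, devices))


# Toy data for k = 2 devices (assumed values, for illustration only).
devices: List[List[Sample]] = [[(1.0, 2.0), (2.0, 4.0)], [(1.0, 3.0)]]
total = sum(len(dev) for dev in devices)
# Assumption: q_m is proportional to the device's share of the data.
weights = [len(dev) / total for dev in devices]

print(global_objective(2.0, devices, weights))
```

With weights that sum to one, the global objective is a weighted average of the per-device average losses, so a device holding more data has proportionally more influence on the trained parameters.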
Methodology
The proposed methodology comprises several steps for executing federated learning in vehicular CPS while ensuring data privacy. Fig. 3 shows the workflow of the proposed methodology.
Fig. 4 illustrates a federated learning framework for vehicular networks in which every vehicle gathers its own local data and trains machine learning models on it without disclosing the raw information, thus protecting privacy. A central coordinating entity, such as a server or RSU, is depicted in the middle of the design and receives the locally trained model updates (represented by green neural icons). This server combines the updates to create a global model, which is then sent back to each participating vehicle. In intelligent transportation systems, this iterative approach supports safe, scalable, real-time learning while maintaining data privacy and facilitating cooperative model improvement throughout the network. The steps are as follows:
1. Local model updates: In a vehicular cyber-physical system, every participant trains a machine learning model on its local data; this process is known as the local model update. The updated model parameters are then sent to a central server, or directly to other participants, who combine them to improve an overall model. This procedure protects each participant's local data privacy while enabling the global model to benefit from their combined knowledge. Every vehicle \(v_i\) updates its local model \(m_i(t)\) using its local data \(d_i\). The update takes a step in the direction opposite to the gradient, as shown in Eq. 3 and Fig. 4:
$$\begin{aligned} m_i(t) = m_i(t-1) - l \nabla G_i(w_t) \end{aligned}$$(3)
where \(m_i(t)\) is the model of vehicle i at time t; \(m_i(t-1)\) is the model of vehicle i at time t-1; l is the learning rate, a hyperparameter that controls the step size during gradient descent; and \(\nabla G_i(w_t)\) is the gradient of the local loss function \(G_i\) with respect to the model parameters \(w_t\).
2. Laplace mechanism for preserving the data: The Laplace mechanism is a popular method in differential privacy that adds controlled noise to data or query results in order to safeguard the confidentiality of individuals. By drawing noise from the Laplace distribution, the mechanism ensures that the inclusion or removal of a single data point has no major impact on the outcome of the analysis. The Laplace mechanism is applied to the local model to ensure privacy, as shown in Eq. 4:
$$\begin{aligned} \hat{m_i}(t) = m_i(t) + LN(0, \sigma ^2) \end{aligned}$$(4)
where \(\hat{m_i}(t)\) is the perturbed local model of vehicle i at time t; \(m_i(t)\) is the original local model of vehicle i at time t; and \(LN(0, \sigma ^2)\) is the Laplace noise added to the model, drawn from a Laplace distribution with mean 0 and variance \(\sigma ^2\). By hiding the precise values of the model parameters, the added noise helps to preserve differential privacy and prevents adversaries from deducing confidential details about the model's training data.
3. Aggregation of model: Each vehicle's perturbed model is shared with a selected group of participants, and every vehicle updates its model by combining the models it receives. The aggregated model of vehicle j after receiving updates from a subset SU of participants is shown in Eq. 5:
$$\begin{aligned} m_j(t) = m_j(t-1) + \frac{1}{|SU|} \sum _{k \in SU}\hat{m_k}(t) \end{aligned}$$(5)
where \(m_j(t)\) is the updated model of vehicle j at time t; \(m_j(t-1)\) is the model of vehicle j at time t-1; SU is the subset of participants from which vehicle j has received perturbed models; |SU| is the number of participants in SU; and \(\hat{m_k}(t)\) is the perturbed model received from vehicle k at time t. By aggregating the perturbed models obtained from the subset SU, Eq. 5 updates the model of vehicle j. This cooperative strategy protects data privacy while enhancing the global model.
4. Aggregation by global model: The global model, which represents the combined learning of all participating vehicles, is created by combining all of the local models, as shown in Eq. 6:
$$\begin{aligned} M(t) = \sum _i m_i(t) \end{aligned}$$(6)
where M(t) is the global model at time t and \(m_i(t)\) is the local model of vehicle i at time t. This equation combines the local models from every participating vehicle into a global model, which represents the collective information from all vehicles and thus ensures a thorough and cooperative learning process.
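The four steps above (Eqs. 3 to 6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: models are plain lists of floats, the function names are hypothetical, and the Laplace scale is set to b = σ/√2 so that the noise variance 2b² equals σ², matching the variance stated for Eq. 4.

```python
import math
import random
from typing import List, Sequence

Vector = List[float]


def local_update(prev_model: Sequence[float], gradient: Sequence[float], lr: float) -> Vector:
    """Eq. 3: one gradient-descent step on a vehicle's local model."""
    return [m - lr * g for m, g in zip(prev_model, gradient)]


def laplace_noise(sigma: float, rng: random.Random) -> float:
    """Zero-mean Laplace sample with variance sigma**2 (scale b = sigma / sqrt(2))."""
    if sigma == 0:
        return 0.0
    b = sigma / math.sqrt(2)
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    return rng.expovariate(1 / b) - rng.expovariate(1 / b)


def perturb(model: Sequence[float], sigma: float, rng: random.Random) -> Vector:
    """Eq. 4: add independent Laplace noise to every model parameter."""
    return [m + laplace_noise(sigma, rng) for m in model]


def aggregate_from_peers(prev_model: Sequence[float],
                         received: Sequence[Sequence[float]]) -> Vector:
    """Eq. 5: add the mean of the perturbed models received from subset SU."""
    n = len(received)
    return [m + sum(peer[i] for peer in received) / n
            for i, m in enumerate(prev_model)]


def global_model(local_models: Sequence[Sequence[float]]) -> Vector:
    """Eq. 6: element-wise sum of the local models, as written in the text."""
    return [sum(values) for values in zip(*local_models)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
model = local_update([1.0, 2.0], gradient=[0.5, -0.5], lr=0.1)
noisy = perturb(model, sigma=0.1, rng=rng)
print(model, noisy)
```

Eq. 6 is implemented as a plain sum because that is how the text states it; many federated averaging schemes divide by the number of participants instead, which would be a one-line change in `global_model`.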
Secure architectural design for vehicle cyber-physical systems to identify data leaks
By utilizing federated learning (FL), the suggested architecture aims to improve security and privacy in vehicular CPS scenarios. The architecture is designed to tackle the escalating issues of safety, confidentiality, and effective real-time data processing in vehicular networks, which have become vital parts of intelligent transportation systems.
System model
The system consists of a central federated server, roadside units (RSUs), and several vehicles, as shown in Fig. 5. The main elements and their functions are as follows:
- Vehicles: Every vehicle in the vehicular network has sensors installed to gather information and identify errors. Using the data they have gathered, the vehicles train locally and produce local model updates.
- Roadside units (RSUs): Acting as intermediaries, these devices help vehicles and the federated server communicate. They transmit the combined local updates from the vehicles to the federated server.
- Federated server: Using the combined data from RSUs, the server updates the global model. It also manages the overall coordination of the federated learning procedure.
A strong authentication method protects information transmission and connections within the vehicular network. To associate vehicles with the federated server, Elliptic Curve Cryptography (ECC)-based cryptographic keys are mapped to publicly accessible data (such as vehicle number, type, and colour). The IPP-ROT technique is used to create the ciphertext. Hashing for vehicle identification is performed with the DF-HAVAL algorithm in order to avoid data collisions and transmission delays. Two keys, one public and one private, are generated using the ECC technique, and two random numbers (X, Y) are chosen from the ECC curve equation (Eq. 7).
Next, a random number \(\lambda\) is chosen in the range [1, m-1], and the public key is computed as \(V=\beta .L\), where \(\beta\) is the private key and L is a point on the curve. The public key V and private key \(\beta\) are thus obtained, and the ciphertext is generated from these keys using the IPP-ROT technique. Logins are security precautions used to prevent unwanted access to private information. A user is not permitted to connect to the server if the login attempt is unsuccessful or if the username and password do not match [27]. In this step, the user logs in to the federated server using the username, password, and ciphertext. If the login succeeds, the user's identity is validated in the subsequent stage. Verification is a crucial process used primarily to prevent unauthorised people from obtaining vehicle-related data, thereby maintaining the integrity of the suggested system. Once the user has been validated, some parameters are retrieved in the next stage.
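To make the key-generation step concrete, the sketch below computes a public key V = β·L by double-and-add scalar multiplication. It is an illustration under assumptions: the curve y² = x³ + 2x + 2 over GF(17) with base point (5, 1) of order 19 is a small textbook example chosen here and is not secure, the paper's actual curve parameters are not given in the text, and the IPP-ROT encryption step is not reproduced.

```python
import secrets
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None stands for the point at infinity

# Toy curve parameters (assumption, illustration only; far too small for real use).
P, A, B = 17, 2, 2   # y^2 = x^3 + A*x + B over GF(P)
L = (5, 1)           # base point, the "L" in V = beta . L
M = 19               # order of L, so the private key lies in [1, M - 1]


def add(p1: Point, p2: Point) -> Point:
    """Add two points on the curve using the standard group law."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1 (also covers doubling with y = 0)
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, point: Point) -> Point:
    """Compute k * point with the double-and-add method."""
    result: Point = None
    addend = point
    while k > 0:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def generate_keypair() -> Tuple[int, Point]:
    """Pick a private key beta in [1, M - 1] and derive the public key V."""
    beta = secrets.randbelow(M - 1) + 1
    return beta, scalar_mult(beta, L)


beta, V = generate_keypair()
print("private key:", beta, "public key:", V)
```

The security of a real deployment rests on using a standardised curve with a large prime-order group; the group arithmetic is the same as in this toy case.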
An authentication procedure begins as soon as the vehicle senses data. Hash codes are created to authenticate each vehicle with the RSUs. Because vehicles are moving and may pass RSUs simultaneously, this is done to prevent data collisions and transmission delays [28]. Following data sensing, hash codes are therefore produced from the vehicle's geolocation and the sender's node IP address created at registration. Next, the produced hash code is authenticated by the RSUs in order to determine which vehicle falls under which RSU. Authenticating a remote vehicle involves confirming its identity before allowing it to use a service. By finding the appropriate RSU for communication, this improves the efficiency of data delivery. The proposed work uses the DF-HAVAL algorithm to generate hash codes. Suppose the geolocation \(G_a\) and the sender's node address \(NS_p\) are input to the hashing technique in the form of a message with a total length of 256 characters. In the initial stage, the inputs \(G_a\) and \(NS_p\) are divided into blocks \(u_b\) of 2048 bits, where b = 1, 2, 3, ..., n and n, the total number of blocks into which the input data is partitioned, also reflects the length of the input data.
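The block-partitioning and folding idea can be sketched as below. This is a loose illustration and not DF-HAVAL itself: HAVAL is not among the algorithms guaranteed by Python's `hashlib`, so SHA-256 is swapped in as a stand-in, and the zero padding, the `|` separator, the 8-digit fold width, and all function names are assumptions introduced for the example.

```python
import hashlib
from typing import List

BLOCK_BITS = 2048
BLOCK_BYTES = BLOCK_BITS // 8  # 256 bytes, i.e. 256 single-byte characters


def split_blocks(message: bytes) -> List[bytes]:
    """Zero-pad the message and split it into 2048-bit blocks u_1..u_n."""
    n_blocks = max(1, -(-len(message) // BLOCK_BYTES))  # ceiling division
    padded = message.ljust(n_blocks * BLOCK_BYTES, b"\x00")
    return [padded[i:i + BLOCK_BYTES] for i in range(0, len(padded), BLOCK_BYTES)]


def digit_fold(value: int, group: int = 8) -> int:
    """Fold a long integer: split its decimal digits into groups and add them."""
    digits = str(value)
    parts = [int(digits[i:i + group]) for i in range(0, len(digits), group)]
    return sum(parts) % 10 ** group


def vehicle_hash(geolocation: str, node_address: str) -> str:
    """Hash code from geolocation G_a and node address NS_p (stand-in digest)."""
    message = f"{geolocation}|{node_address}".encode("utf-8")
    digest = hashlib.sha256()  # stand-in for HAVAL (assumption)
    for block in split_blocks(message):
        digest.update(block)
    folded = digit_fold(int.from_bytes(digest.digest(), "big"))
    return f"{folded:08d}"


# Hypothetical inputs, for illustration only.
print(vehicle_hash("12.9716,77.5946", "192.0.2.10"))
```

Folding shortens the digest, which reduces the amount of data an RSU has to compare, at the cost of a smaller output space and therefore a higher chance of collisions than the full digest.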
The FedBuff (Federated Learning with Buffered Aggregation) architecture protects private information during federated learning optimisation through several mechanisms. First, the server stores every local model it receives in a buffer B, and when the buffer is full, the server updates the global model. Because the buffer on its own raises privacy concerns, a secure aggregation protocol is used that lets users transmit their updates in a privacy-preserving way. This protocol enables the server to aggregate the local updates stored in the buffer without having access to any other information about individual users' updates.
The federated learning architecture is built on the FedBuff model, which forms the basis of the suggested architecture. To maintain confidentiality, every vehicle uses its own dataset to build a local model without sharing the raw data. The RSUs then receive the local model updates, buffer them, and periodically send them to the federated server. As additional updates are received, the federated server updates the global model asynchronously using the buffered aggregation method, enabling scalability and effective model training. The updated global model is then sent back to the vehicles for additional local training.
Fig. 6 illustrates the buffered aggregation workflow in the FedBuff architecture, where multiple vehicles perform local training and send model updates to a buffer (typically at a roadside unit or edge server). These updates are temporarily stored until a predefined buffer threshold is reached. Once the buffer is full, the buffered updates are aggregated and sent to a central server, which then updates the global model. The newly updated global model is broadcast back to all participating vehicles, enabling continuous learning while preserving data privacy and reducing communication overhead. This approach enhances scalability and efficiency in federated learning for vehicular networks.
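A minimal sketch of the buffered aggregation workflow follows. It assumes that each update is a parameter delta (local model minus the global model the vehicle started from), that the buffered deltas are averaged, and that a server learning rate scales the step; the class and attribute names are hypothetical, and secure aggregation is omitted.

```python
from typing import List, Sequence


class BufferedAggregator:
    """Collects model updates and applies them once the buffer is full."""

    def __init__(self, global_model: Sequence[float], buffer_size: int,
                 server_lr: float = 1.0) -> None:
        self.global_model: List[float] = list(global_model)
        self.buffer_size = buffer_size
        self.server_lr = server_lr
        self.buffer: List[List[float]] = []
        self.version = 0  # number of global updates applied so far

    def submit(self, update: Sequence[float]) -> bool:
        """Store one update; return True if it triggered a global update."""
        self.buffer.append(list(update))
        if len(self.buffer) < self.buffer_size:
            return False
        # Buffer threshold reached: average the buffered deltas.
        mean = [sum(values) / len(self.buffer) for values in zip(*self.buffer)]
        self.global_model = [g + self.server_lr * d
                             for g, d in zip(self.global_model, mean)]
        self.buffer.clear()
        self.version += 1
        return True


aggregator = BufferedAggregator(global_model=[0.0, 0.0], buffer_size=2)
print(aggregator.submit([1.0, 1.0]))   # buffer not yet full
print(aggregator.submit([3.0, -1.0]))  # buffer full, global model updated
print(aggregator.global_model)
```

Because the server acts only when the buffer fills, slow vehicles do not block a round, and the buffer size trades update frequency against the number of contributions mixed into each global step.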
Results and discussions
Dataset
The CICIDS2017 dataset, produced by the Canadian Institute for Cybersecurity, is a popular dataset for network intrusion detection. It includes a range of attack scenarios that reflect real cybersecurity threats, including DDoS, Brute Force, Heartbleed, Botnet, Web Attacks, and Infiltration from within the network. The dataset, which was gathered over the course of five days, includes both legitimate and malicious network traffic, offering a thorough and well-rounded foundation for developing and testing intrusion detection models. It contains the network traffic features needed for comprehensive research and precise model development, such as IP addresses, port numbers, protocols, packet sizes, and timestamps. To enable supervised learning, every data point is labelled either as normal or as belonging to a specific type of attack. Because of its broad coverage and diversity of attacks, researchers widely use CICIDS2017 to evaluate intrusion detection systems, particularly those based on machine learning and deep learning. Because it is publicly accessible, academics and practitioners who want to create and validate novel cybersecurity solutions frequently choose it.
Experimental hardware setup
To ensure transparency and reproducibility of our results, we provide the details of the hardware environment used for model training, cryptographic evaluation, and simulation. All experiments were conducted using the MATLAB R2022a environment on a high-performance workstation configured as shown in Table 2:
GPU acceleration was selectively used for deep learning models (MLP, CNN, RNN) to optimize training time. Cryptographic operations and hash generation were performed on the CPU, as they were relatively lightweight.
Performance evaluation
We evaluate the proposed federated learning model for mitigating data leakage on the CICIDS2017 dataset. Thorough tests assess the effectiveness of the federated learning approach for intrusion detection and hash code generation, and verify the suggested model. The suggested model was implemented in the MATLAB environment. In vehicular networks, including vehicle-to-vehicle and vehicle-to-infrastructure communication, WiFi connections use common protocols such as HTTP, HTTPS, FTP, SSH, and email protocols to transmit data from source to destination. The CICIDS2017 dataset contains labelled network flows with source and destination IP addresses and ports, timestamps, and protocols, along with the entire packet payload in PCAP format. Of this data, 80% is used for training and 20% for testing.
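The 80/20 partition can be reproduced with a simple shuffled split, sketched below. The text does not say whether the split was random or stratified by attack class, so the seeded random shuffle and the function name are assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(records: Sequence[T], train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the records with a fixed seed and cut them at train_fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in for labelled flow records (assumption: any indexable sequence works).
flows = list(range(100))
train, test = train_test_split(flows)
print(len(train), len(test))
```

For a dataset with rare attack classes, a stratified split that preserves the class proportions in both partitions would be a reasonable alternative.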
Table 3 and Fig. 7 show the time required to generate a hash code using the suggested method, DF-HAVAL, and compare it with other hashing techniques: Message-Digest 5 (MD5), Secure Hash Algorithm 2 (SHA-2), and HAVAL. The analysis indicates that the suggested method takes less time than the others because digit folding improves HAVAL. Digit folding helps the system produce a condensed digest from a message of any length, and this compression also helps to lower the hash code generation time. The suggested method has a hash time of 700 ms, which is lower than that of the other approaches. The analysis therefore shows that the suggested approach performs better than the alternatives.
Next, we assessed the encryption and decryption performance of the suggested technique by comparing it with existing cryptosystems, including RSA, DES, and ECC, as shown in Tables 4 and 5. Figs. 8 and 9 compare the encryption and decryption times of the suggested and existing methods. Because public-key operations in standard ECC are slower, vehicle information is encrypted and decrypted using the suggested technique, which is more efficient and less time-consuming than standard ECC. Similarly, RSA and other cryptographic techniques take a long time to sign and decrypt data, can be challenging to deploy securely, and slow the system down. With the suggested method, the encryption and decryption times for the data of fifty vehicles are 4045 ms and 4066 ms, respectively.
Table 6 and Fig. 10 show the empirical evaluation of the suggested intrusion detection system against the performance metrics on the dataset. The suggested federated learning approach is evaluated against machine learning algorithms such as MLP, RNN, and CNN using various quality metrics. The suggested method predicts attacks with an accuracy of 98.35%, higher than the other models. It likewise achieves higher precision and recall, at 97.52% and 98.47%, respectively. Based on these metrics, the suggested model performs better for attack detection.
Comparative performance analysis of federated learning techniques across key evaluation metrics—accuracy, precision, and recall. The proposed method (FedBuff + ECC) demonstrates significantly improved performance compared to traditional approaches such as output perturbation, objective perturbation, bit-choosing algorithms, and encryption-based resource allocation.
Fig. 11 and Table 7 compare the accuracy, precision, and recall of federated learning approaches for privacy-preserving intrusion detection. The proposed FedBuff + ECC framework significantly outperforms traditional techniques such as output perturbation [22], objective perturbation [23], the bit-choosing algorithm [24], and encryption-based resource allocation [25]. While the conventional approaches demonstrate moderate effectiveness, with accuracy ranging from 88% to 92%, the proposed method achieves a notably higher accuracy of 98.35%, along with precision and recall values exceeding 97%. This indicates that the proposed system not only detects intrusions more accurately but also reduces false positives and false negatives, making it a more robust and reliable solution for securing vehicular cyber-physical systems.
The False Alarm Rate (FAR) comparison graph shown in Fig. 12 highlights the effectiveness of different federated learning techniques in minimizing false positives during intrusion detection. Lower FAR values indicate better performance, as fewer normal events are incorrectly flagged as attacks. Among the methods compared—output perturbation (Ref [22]), objective perturbation (Ref [23]), bit-choosing (Ref [24]), and encryption-based resource allocation (Ref [25])—the FAR ranges from 0.10 to 0.15, reflecting moderate susceptibility to false alerts. In contrast, the proposed FedBuff + ECC approach achieves a significantly lower FAR of 0.02, demonstrating its superior ability to accurately distinguish between normal and malicious behavior. This low FAR enhances the reliability of the system, making it particularly well-suited for real-time applications in vehicular cyber-physical systems where false alarms can lead to unnecessary disruptions or safety concerns.
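For reference, the four metrics discussed above follow directly from the confusion-matrix counts, as in the sketch below. The counts in the example are hypothetical and are not taken from the paper's tables; FAR is computed as the false positive rate, FP / (FP + TN).

```python
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and false alarm rate for a binary detector."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "far": _ratio(fp, fp + tn),  # share of normal traffic flagged as attack
    }


# Hypothetical counts, for illustration only.
print(detection_metrics(tp=90, fp=10, tn=880, fn=20))
```

Precision and FAR both penalise false positives but against different baselines: precision relates them to all flagged events, whereas FAR relates them to all normal events, which is why FAR can be very low on traffic that is mostly benign.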
Challenges faced by the proposed model
Many IoT devices are frequently linked to the Internet, which surely raises the security threats considerably in IoT. Nowadays, the majority of the different security breach techniques are carried out online, and they often have an excellent rate of success. Numerous risks to data security and privacy leakage affect FL systems within this context. There are several challenges as depicted in Fig. 13 posed by the federated learning architecture that improves the security of data in vehicular cyber physical systems:
- Challenge: Huge amount of data. Vehicular CPS produce massive amounts of structured and unstructured data. The limited computation and processing capacities of vehicles and RSUs make it difficult to make the best use of the available data. Solution: Federated learning spreads data processing across many vehicles and RSUs29. Every vehicle handles its own data locally, which lessens the load on central servers and eliminates the need for large-scale data transmission. By processing data nearer to its source, edge computing and federated learning can improve response times and lighten the burden on centralised servers.
- Challenge: Security concerns. Vehicular CPS handle vehicle operational parameters and other extremely sensitive data. Security failures in RSUs and vehicles can lead to data loss, unauthorised access, and serious consequences such as collisions and monetary losses30. Solution: Safe data transfer between vehicles, RSUs, and base stations requires strong encryption techniques such as ECC and the DF-HAVAL algorithm. Continuous vulnerability assessments and security audits help to locate and mitigate security threats.
- Challenge: Risks of information leakage. Content caching and data transfer between vehicles make information leakage more likely31. The leakage can be significant in the event of RSU or vehicle intrusions. Solution: By adding noise to model updates, the Laplace mechanism in federated learning protects privacy and lowers the possibility of data leaks32. The risk of data exposure is further reduced by making sure that only aggregated model updates, rather than raw data, are communicated.
- Challenge: Maintaining transparency. It is critical to protect the confidentiality of data during local model updates and global model aggregation33. Solution: To make sure that individual data points cannot be linked back to their original sources, differential privacy measures should be applied during local model updates. Data safety and confidentiality during model aggregation are improved through secure multi-party computation techniques.
- Challenge: Adequate model training. Critical and technologically complex applications, such as resource allocation and autonomous driving, require an efficient model training procedure34. Solution: Buffered asynchronous aggregation allows the global model to be trained and updated efficiently, so that it is continually improved without long delays35. The convergence speed and accuracy of the federated learning process are further improved by the use of adaptive learning rates.
With these solutions, the federated learning framework improves the safety, privacy, and effectiveness of vehicular CPS. The strategy combines reliable communication protocols, efficient resource management, and local processing of information to build an adaptable and resilient system that can meet the needs of contemporary transportation systems.
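The Laplace-based protection of model updates described in the list above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the update is clipped to an L1 bound so that its sensitivity is known, and Laplace noise with scale 2 × bound / ε is added to each coordinate. The names `clip_l1`, `laplace_noise`, and `privatize_update`, as well as the values of the clipping bound and the privacy budget ε, are hypothetical; the paper does not specify them.

```python
import random


def clip_l1(update, bound):
    """Scale the update so that its L1 norm does not exceed `bound`."""
    norm = sum(abs(x) for x in update)
    if norm <= bound or norm == 0.0:
        return list(update)
    factor = bound / norm
    return [x * factor for x in update]


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_update(update, clip_bound, epsilon, rng=None):
    """Clip a local model update and add Laplace noise before upload.

    Assumption: any two clipped updates differ by at most 2 * clip_bound
    in L1 norm, so the noise scale is 2 * clip_bound / epsilon.
    """
    if clip_bound <= 0 or epsilon <= 0:
        raise ValueError("clip_bound and epsilon must be positive")
    rng = rng or random.Random()
    clipped = clip_l1(update, clip_bound)
    scale = 2.0 * clip_bound / epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]


# Hypothetical local update from one vehicle
local_update = [0.30, -0.10, 0.05]
noisy_update = privatize_update(local_update, clip_bound=0.5, epsilon=1.0)
print(noisy_update)
```

A smaller ε gives stronger privacy but noisier updates, so in practice the budget would have to be tuned against the accuracy of the aggregated global model.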
Use case scenario
The paper outlines several applications of the FL framework that are intended to enhance data security in VCPS. These scenarios show how FL can improve data privacy and security while preserving effective real-time data analysis for VCPS. The main elements of the suggested methodology are the Laplace mechanism for data confidentiality, local model updates, model aggregation, and global model development. The use cases are explained below and shown in Fig. 14:
- The use case scenario centres on the integration of vehicular cyber-physical systems into transportation in order to enhance efficiency, security, and entertainment36. The suggested method uses federated learning to detect instances of data leakage within a VCPS setting. For content caching and data exchange, the system provides V2V, V2R, and V2B connections37.
- In this scenario, vehicles communicate with base stations, RSUs, and one another. RSUs receive data from the vehicles and store it, after which they distribute it to other RSUs and the base station. To reduce the possibility of data leakage during content caching and exchange, the main objective is to transform raw data into modified representations38.
- This is accomplished with federated learning39, which converts unprocessed data from several sources into learnt data models while maintaining anonymity. Each vehicle trains a local model on its own data and updates it without disclosing the raw data. The updates are sent to a central server, where they are securely aggregated to update the global model. To guarantee the efficiency and scalability of the model, this procedure employs a buffered asynchronous aggregation mechanism.
- The advantages of this strategy include improved traffic control, security, and efficiency in transportation, and the use of advances in 5G, wireless connectivity, and mobile identity for enhanced vehicular CPS. Furthermore, secure aggregation methods and federated learning provide data protection and confidentiality.
These use cases and the overall approach demonstrate the potential of federated learning to improve the safety and effectiveness of vehicular cyber-physical systems, resulting in safer and more dependable transportation networks.
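The buffered asynchronous aggregation used in the third use case can be sketched as follows. This is a simplified illustration in the spirit of FedBuff, assuming that the server collects client updates in a buffer as they arrive and applies their average to the global model once the buffer is full. The class name `BufferedAggregator`, the buffer size, and the server learning rate are hypothetical, and refinements such as staleness weighting and secure aggregation are deliberately omitted.

```python
class BufferedAggregator:
    """Server-side buffer that updates the global model every K client updates."""

    def __init__(self, global_weights, buffer_size, server_lr=1.0):
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        self.weights = list(global_weights)
        self.buffer_size = buffer_size
        self.server_lr = server_lr
        self.version = 0          # number of global updates applied so far
        self._buffer = []         # pending client deltas

    def submit(self, delta):
        """Store one client delta; return True if the global model was updated."""
        if len(delta) != len(self.weights):
            raise ValueError("delta has the wrong dimension")
        self._buffer.append(list(delta))
        if len(self._buffer) < self.buffer_size:
            return False          # keep waiting, clients are not blocked
        k = len(self._buffer)
        # Coordinate-wise mean of the buffered deltas
        mean_delta = [sum(column) / k for column in zip(*self._buffer)]
        self.weights = [w + self.server_lr * d
                        for w, d in zip(self.weights, mean_delta)]
        self._buffer.clear()
        self.version += 1
        return True


# Hypothetical usage: two vehicles report at different times
server = BufferedAggregator(global_weights=[0.0, 0.0], buffer_size=2)
server.submit([1.0, 2.0])   # buffered, global model unchanged
server.submit([3.0, 4.0])   # buffer full, global model updated
print(server.weights, server.version)
```

Because the server never waits for a fixed cohort of vehicles, slow or disconnected clients do not stall training, which is the property that makes this style of aggregation attractive for highly mobile vehicular networks.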
Conclusion and future scope
Conclusion
In this study, we present a secure VCPS that uses a federated learning system with elliptic curve cryptography (ECC). The two main stages of the suggested model are attack monitoring and detection. In the monitoring stage, federated learning is used to recognise and categorise attacks. This stage combines sophisticated machine learning models with federated learning to precisely identify various security risks within the vehicular network. In the detection phase, the ECC and IPP-ROT algorithms handle secure authentication and guarantee strong encryption for data transfers. ECC ensures secure transmission of data, while IPP-ROT manages identity-based authentication and optimised time-bound credential revocation. The suggested method was evaluated on the publicly available CICIDS2017 dataset. To confirm its efficacy, it was put through a series of experimental studies, performance analyses, and comparative analyses using a variety of metrics.
Future scope
The study opens a number of directions for further investigation. A few key areas for future work are:
- To further improve the privacy of data sharing in vehicular CPS, future research can investigate the integration of more sophisticated cryptographic algorithms. With the emergence of quantum computing, quantum-resistant techniques need to be developed and integrated to guard against potential future threats.
- To manage a greater number of vehicles and RSUs effectively, research can concentrate on improving the aggregation techniques employed in federated learning. Robustness can be further enhanced by adaptive resource allocation schemes that adjust to shifting traffic densities and network conditions.
- Real-time decision-making can be improved and delay decreased by combining edge AI capabilities with federated learning. Hybrid architectures that combine cloud and edge computing and balance the load between them could further enhance system efficiency.
- Subsequent investigations may focus on improving differential privacy methods to offer more robust guarantees while preserving model accuracy. Additional privacy protection can be achieved by using homomorphic encryption in the federated learning process, which enables analyses on encrypted data.
- 5G and next-generation wireless communication technologies (such as 6G) can be used to improve the speed, dependability, and security of information exchange in vehicular CPS. Flexible communication protocols that react quickly to changing network conditions and vehicle speeds are a further direction.
Data availability
The data were taken from a publicly available source: https://www.kaggle.com/datasets/chethuhn/network-intrusion-dataset
References
Kukreja, V., Kumar, D., Bansal, A. & Solanki, V. Recognizing wheat aphid disease using a novel parallel real-time technique based on mask scoring rcnn. In 2022 2nd international conference on advance computing and innovative technologies in engineering (ICACITE), 1372–1377 (IEEE, 2022).
Kumar, D. & Kukreja, V. N-cnn based transfer learning method for classification of powdery mildew wheat disease. In 2021 international conference on Emerging Smart Computing and Informatics (ESCI), 707–710 (IEEE, 2021).
Consul, P., Budhiraja, I., Chaudhary, R. & Kumar, N. Security reassessing in uav-assisted cyber-physical systems based on federated learning. In MILCOM 2022-2022 IEEE Military Communications Conference (MILCOM), 61–65 (IEEE, 2022).
Li, J., Yan, T. & Ren, P. Vfl-r: a novel framework for multi-party in vertical federated learning. Appl. Intell. 53, 12399–12415 (2023).
Gaba, S., Budhiraja, I., Kumar, V., Garg, S. & Hassan, M. M. An innovative multi-agent approach for robust cyber-physical systems using vertical federated learning. Ad Hoc Netw. 163, 103578 (2024).
Ferrag, M. A., Friha, O., Maglaras, L., Janicke, H. & Shu, L. Federated deep learning for cyber security in the internet of things: Concepts, applications, and experimental analysis. IEEE Access 9, 138509–138542 (2021).
Babbar, H. & Rani, S. Frhids: Federated learning recommender hybrid intrusion detection system model in software defined networking for consumer devices. IEEE Trans. Consum. Electron. 70, 2492–2499 (2023).
Liu, H., Li, B., Xie, P. & Zhao, C. Privacy-encoded federated learning against gradient-based data reconstruction attacks. IEEE Trans. Inf. Forensics Secur. 18, 5860–5875 (2023).
Cai, L., Hu, Q., Jiang, T. & Niyato, D. Blockchain-enabled secure federated learning for digital twin networks. IEEE Wirel. Commun. 24, 86–101 (2024).
Miao, Y., Liu, Z., Li, H., Choo, K.-K.R. & Deng, R. H. Privacy-preserving byzantine-robust federated learning via blockchain systems. IEEE Trans. Inf. Forensics Secur. 17, 2848–2861 (2022).
Muazu, T. et al. A federated learning system with data fusion for healthcare using multi-party computation and additive secret sharing. Comput. Commun. 216, 168–182 (2024).
Hu, Q., Li, B., Zhou, J. & Jiang, T. Fine-grained anonymous access for satellite communication using zero knowledge proof. IEEE Trans. Veh. Technol. 74, 9921–9926 (2025).
Li, D. et al. Ubiquitous intelligent federated learning privacy-preserving scheme under edge computing. Future Gener. Comput. Syst. 144, 205–218 (2023).
Rani, S., Babbar, H., Shah, S. H. A. & Singh, A. Improvement of energy conservation using blockchain-enabled cognitive wireless networks for smart cities. Sci. Rep. 12, 13013 (2022).
Li, H. et al. Review on security of federated learning and its application in healthcare. Future Gener. Comput. Syst. 144, 271–290 (2023).
Lu, Y., Huang, X., Zhang, K., Maharjan, S. & Zhang, Y. Blockchain empowered asynchronous federated learning for secure data sharing in internet of vehicles. IEEE Trans. Veh. Technol. 69, 4298–4311 (2020).
Lian, Z. et al. Deep-fel: Decentralized, efficient and privacy-enhanced federated edge learning for healthcare cyber physical systems. IEEE Trans. Netw. Sci. Eng. 9, 3558–3569 (2022).
Huang, X., Han, L., Li, D., Xie, K. & Zhang, Y. A reliable and fair federated learning mechanism for mobile edge computing. Comput. Netw. 226, 109678 (2023).
Jiang, Y., Zhang, W. & Chen, Y. Data quality detection mechanism against label flipping attacks in federated learning. IEEE Trans. Inf. Forensics Secur. 18, 1625–1637 (2023).
Elbir, A. M., Soner, B., Çöleri, S., Gündüz, D. & Bennis, M. Federated learning in vehicular networks. In 2022 IEEE International Mediterranean Conference on Communications and Networking (MeditCom), 72–77 (IEEE, 2022).
Hou, J., Su, M., Fu, A. & Yu, Y. Verifiable privacy-preserving scheme based on vertical federated random forest. IEEE Internet of Things J. 9, 22158–22172 (2021).
Du, M., Wang, K., Chen, Y., Wang, X. & Sun, Y. Big data privacy preserving in multi-access edge computing for heterogeneous internet of things. IEEE Commun. Mag. 56, 62–67 (2018).
Du, M., Wang, K., Xia, Z. & Zhang, Y. Differential privacy preserving of training model in wireless big data with edge computing. IEEE Trans. Big Data 6, 283–295 (2018).
Yu, J., Wang, K., Zeng, D., Zhu, C. & Guo, S. Privacy-preserving data aggregation computing in cyber-physical social systems. ACM Trans. Cyber-Phys. Syst. 3, 1–23 (2018).
Li, H., Wang, K., Liu, X., Sun, Y. & Guo, S. A selective privacy-preserving approach for multimedia data. IEEE Multimed. 24, 14–25 (2017).
Vijayakumar, P., Azees, M., Kannan, A. & Deborah, L. J. Dual authentication and key management techniques for secure data transmission in vehicular ad hoc networks. IEEE Trans. Intell. Transp. Syst. 17, 1015–1028 (2015).
Majidi, S. H. & Asharioun, H. Privacy preserving federated learning solution for security of industrial cyber physical systems. AI-Enabled Threat Detection and Security Analysis for Industrial IoT, 195–211 (2021).
Zhang, C., Liu, X., Zheng, X., Li, R. & Liu, H. Fenghuolun: A federated learning based edge computing platform for cyber-physical systems. In 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), 1–4 (IEEE, 2020).
Bogdanov, D., Laur, S. & Willemson, J. Sharemind: A framework for fast privacy-preserving computations. In Computer Security-ESORICS 2008: 13th European Symposium on Research in Computer Security, Málaga, Spain, October 6–8, 2008. Proceedings 13, 192–206 (Springer, 2008).
Cao, Y., Zhang, J., Zhao, Y., Su, P. & Huang, H. Srfl: A secure & robust federated learning framework for iot with trusted execution environments. Expert Syst. Appl. 239, 122410 (2024).
Du, J., Qin, N., Huang, D., Zhang, Y. & Jia, X. An efficient federated learning framework for machinery fault diagnosis with improved model aggregation and local model training. IEEE Trans. Neural Netw. Learn. Syst. 35, 10086–10097 (2023).
Fan, J., Wang, X., Guo, Y., Hu, X. & Hu, B. Federated learning driven secure internet of medical things. IEEE Wirel. Commun. 29, 68–75 (2022).
Fan, M., Ji, K., Zhang, Z., Yu, H. & Sun, G. Lightweight privacy and security computing for blockchained federated learning in iot. IEEE Internet of Things J. 10, 16048–16060 (2023).
Fang, C., Guo, Y., Ma, J., Xie, H. & Wang, Y. A privacy-preserving and verifiable federated learning method based on blockchain. Comput. Commun. 186, 1–11 (2022).
Guo, J. et al. Adfl: A poisoning attack defense framework for horizontal federated learning. IEEE Trans. Ind. Inform. 18, 6526–6536 (2022).
Li, H., Ge, L. & Tian, L. Survey: federated learning data security and privacy-preserving in edge-internet of things. Artif. Intell. Rev. 57, 130 (2024).
Yao, A., Pal, S., Dong, C., Li, X. & Liu, X. A framework for user biometric privacy protection in uav delivery systems with edge computing. In 2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), 631–636 (IEEE, 2024).
Yao, A. et al. A privacy-preserving location data collection framework for intelligent systems in edge computing. Ad Hoc Netw. 161, 103532 (2024).
Bhagoji, A. N., Chakraborty, S., Mittal, P. & Calo, S. Analyzing federated learning through an adversarial lens. In International conference on machine learning, 634–643 (PMLR, 2019).
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
S.R. and H.B. conceived the experiment(s), S.R. and H.B. conducted the experiment(s), and S.R. analyzed the results. M.S. validated and supervised the work. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
Not applicable.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Babbar, H., Rani, S. & Shabaz, M. Federated learning with enhanced cryptographic security for vehicular cyber-physical systems. Sci Rep 15, 28593 (2025). https://doi.org/10.1038/s41598-025-14341-0