Introduction

Vehicular cyber-physical systems (vehicular CPS) combine physical and computational resources that communicate with one another and with the outside world in order to enhance entertainment, reliability, security, and effectiveness in transportation. Vehicles in a vehicular CPS can analyse traffic data on site and exchange accurate data with other vehicles via vehicular networks. Recent technological breakthroughs in areas such as mobile sensing, 5G, and wireless connectivity have made intelligent video content protection systems possible. Vehicular CPSs usually produce many different types of data, both structured and unstructured. Limited computation and caching resources hinder vehicles and roadside units (RSUs) from using the available data effectively. Multi-access edge computing (MEC) is a viable model for tackling these problems [1,2].

By performing computation at the edge, in the vehicles or RSUs of a vehicular CPS, MEC enables increasingly computation-intensive and time-sensitive applications such as autonomous transportation, resource allocation [3,4], and computation offloading [5]. Artificial intelligence (AI) is a potent tool capable of making precise forecasts and handling challenging problems, and AI algorithms are frequently used in vehicular CPS for this purpose. Specifically, AI algorithms for information sharing and content caching have helped to lower costs and improve the utility of vehicular CPS. For example, in [6], deep reinforcement learning was used to optimise cache and compute resources jointly, improving system utility. The deep-reinforcement-learning-based technique used in [5] maximised the system utility of device-to-device content caching.

Yet the majority of existing efforts concentrate on information sharing and resource allocation. Vehicular CPS data may contain highly confidential information, such as the operating condition of vehicles. Vehicles and RSUs that handle and cache these data as edge servers are susceptible to security breaches. The unauthorised disclosure of information to unapproved parties is referred to as the threat of data leakage [7]. If RSUs or vehicles are compromised or turn hostile, a significant data leak may occur, causing accidents and significant financial loss. Content caching and data exchange between vehicles can further increase the risk of data leakage. In vehicular CPS, many questions regarding data leak detection, security threat mitigation, and security modelling remain open.

Fig. 1. Model for identification of data leakage in vehicular CPS.

Fig. 1 shows the communication model that we construct for detecting data leakage in vehicular CPS (VCPS). In the proposed approach, RSUs cache resources from vehicles, other RSUs, and the base station (BS). Vehicle-to-BS (V2B), vehicle-to-roadside (V2R), and vehicle-to-vehicle (V2V) communication provide the information for content caching and data exchange. Two integrated elements of the proposed model, intelligent data transformation and cooperative data leakage identification across both RSUs and vehicles, address the aforementioned issue [8,9]. The primary function of the adaptive data transformation is to translate raw data into transformed formats in order to reduce the possibility of data leakage during subsequent steps such as content caching and data exchange. Since in application scenarios such as vehicular data sharing the desired data are often provided by many vehicles, collaborative mapping across multiple vehicles is necessary to guarantee the usability of the data after processing. Here, we use federated learning to translate raw data from several parties into learnt data models [10,11], thereby learning the data features across multiple parties. This further enhances confidentiality: the learnt model carries the information needed for other activities, such as resource allocation, without disclosing the parties' raw data [12].

The components are described in detail below:

  1. Vehicle-to-vehicle transmission: Vehicle-to-vehicle (V2V) communication is the means by which vehicles interact with one another [13]. Every vehicle gathers and analyses its own local data, builds a local model, and exchanges only model updates with other vehicles and roadside units.

  2. V2R and R2R communication via roadside units: RSUs serve as facilitators, exchanging and caching data obtained from vehicles. RSUs acquire data from vehicles over V2R links and use roadside-to-roadside (R2R) connectivity to exchange data with one another. After aggregating the local model updates obtained from vehicles, RSUs deliver the combined data to the central base station.

  3. R2B communication with the base station: Roadside-to-base-station (R2B) communication allows the base station to acquire the information collected by several RSUs. The base station combines these updates to produce a global model, which is subsequently returned to the vehicles and RSUs for further improvement.
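The three communication tiers above can be pictured as a two-level aggregation. The following is a minimal sketch under stated assumptions, not the authors' implementation: model updates are plain lists of floats, each RSU takes an unweighted mean of the updates from its vehicles, and the base station weights each RSU by its vehicle count. The function names and the numeric updates are hypothetical.

```python
from statistics import fmean


def average(models):
    """Element-wise mean of equally sized parameter vectors."""
    return [fmean(values) for values in zip(*models)]


def rsu_aggregate(vehicle_updates):
    """V2R step: an RSU averages the updates of the vehicles it serves.

    Returns the mean vector and the number of vehicles behind it, so the
    base station can weight the report (an assumption of this sketch).
    """
    return average(vehicle_updates), len(vehicle_updates)


def base_station_aggregate(rsu_reports):
    """R2B step: combine RSU means, weighting each by its vehicle count."""
    total = sum(count for _, count in rsu_reports)
    dim = len(rsu_reports[0][0])
    return [
        sum(mean[i] * count for mean, count in rsu_reports) / total
        for i in range(dim)
    ]


# Hypothetical updates: two vehicles under RSU A, one vehicle under RSU B.
rsu_a = rsu_aggregate([[1.0, 2.0], [3.0, 4.0]])
rsu_b = rsu_aggregate([[5.0, 6.0]])
global_model = base_station_aggregate([rsu_a, rsu_b])
print(global_model)
```

With count-based weighting, the base station's result equals the mean over all vehicles, regardless of how the vehicles are distributed across RSUs.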

Motivation

By integrating physical and computational capabilities, vehicular CPS are leading the way in revolutionising transportation by improving overall efficiency, safety, trustworthiness, and entertainment value. Through vehicular networks and real-time traffic analysis, these systems greatly enhance congestion control and safety by allowing vehicles to exchange precise information with one another. Recent breakthroughs in mobile sensing, 5G, and wireless communication have further boosted vehicular CPS, opening the door to complex applications such as intelligent video content security systems [14].

However, the massive volumes of structured and unstructured data that vehicular CPS generate create many obstacles. Vehicles and RSUs with restricted computing and caching capacities find it difficult to use the available data effectively [15]. To overcome these obstacles, MEC shows promise as a model for performing computation at the edge, in vehicles or RSUs. MEC supports time-sensitive and computationally demanding applications such as resource allocation, computation offloading, and autonomous transportation. By managing complicated problems and producing accurate predictions, AI significantly improves vehicular CPS, especially in the areas of content caching and information exchange.

Problem definition

Even though vehicular CPS have made great strides and offer many advantages, important security issues must still be resolved. These systems frequently handle highly sensitive data, including vehicle operating parameters. As edge servers, RSUs and vehicles can suffer security breaches that result in data leakage. The unauthorised disclosure of sensitive material may have serious repercussions, such as accidents and large financial losses [16].

Content caching and data exchange between vehicles increase the danger of data leakage. If an RSU or vehicle is breached or taken over by an adversary, a large amount of data may be leaked [17]. The majority of current studies concentrate on resource allocation and information exchange [18,19], leaving significant gaps in our understanding of, and ability to mitigate, security risks, data leakage, and security modelling within vehicular CPS. This study therefore seeks to bridge these gaps by examining efficient techniques for data leakage detection and security threat prevention, together with thorough security modelling in vehicular CPS [20]. In doing so, we hope to improve the overall resilience and safety of these systems and to ensure their safe and dependable operation.

Contributions

The main contributions are summarized as follows:

  1. With a focus on data privacy and security, we present an extensive federated learning architecture designed for vehicular cyber-physical systems. This architecture keeps private information local while allowing each participant to benefit from collective model updates.

  2. Local models are trained inside each vehicle using locally acquired data. This step is essential for keeping data private, because raw data never leave the vehicle. Updates to the local models are securely distributed and combined to improve the overall model.

  3. We propose a technique for combining locally updated models from different vehicles. This aggregation improves the global model by capturing what all participating vehicles have learnt, while protecting the privacy of each vehicle's data.

  4. We present the FedBuff (Federated Learning with Buffered Aggregation) architecture, which buffers local model updates before forwarding them to the central server. This buffered aggregation improves the effectiveness and adaptability of model training.

  5. The proposed framework is evaluated on the CICIDS2017 dataset. Based on performance parameters such as hash code generation time and encryption/decryption times, our methodology is shown to be more efficient and secure than other approaches such as MD5, SHA-2, RSA, and DES.

Literature review for federated learning techniques for preserving security of data in vehicular cyber-physical systems

VCPS integrate physical and computational resources in order to improve entertainment [21], effectiveness, and security in transportation. Vehicles communicate over vehicular networks and process traffic data locally. Developments in 5G, wireless networking, and remote sensing have accelerated the adoption of sophisticated VCPS. These systems produce enormous volumes of both structured and unstructured data, which poses problems for vehicles and roadside units (RSUs) because of their constrained computing and caching capabilities. Existing privacy-preserving work includes differential privacy approaches applied during training, such as output and objective perturbation (e.g., [22,23]); bit-choosing algorithms for aggregated data that safeguard user confidentiality (e.g., [24]); and the allocation of encryption resources according to implementation time and security weight (e.g., [25]). Table 1 presents the comparative analysis.

Table 1 Existing literature on privacy issues in wireless networks and cyber-physical systems.

Preliminaries

Federated learning

In FL problems, data from numerous remote devices are used to train a single global statistical model. Figure 2 illustrates a conventional FL design. The objective of FL is to train the model on the data collected by each device within the limits of local caching, and to update the model parameters regularly so that they can be exchanged with the cloud parameter server. In other words, the purpose is to minimise the weighted average training loss over all clients, which is expressed by the objective function in Eq. 1:

Fig. 2. Federated learning design.

$$\begin{aligned} \min _{d} G(d), \quad G(d) = \sum _{m=1}^k q_m G_m(d) \end{aligned}$$
(1)

in which k is the total number of devices participating in training the model; \(q_m\) is the relative weight assigned to the mth device; and \(G_m(d)\) is the local objective function of the mth device, defined in Eq. 2:

$$\begin{aligned} G_m(d) = \frac{1}{b_m} \sum _{a=1}^{b_m} g_a (d, r_a, w_a) \end{aligned}$$
(2)

in which \(b_m\) denotes the number of data samples on the mth device, and \(g_a(d, r_a, w_a)\) denotes the loss of the model with parameters d on the instance \((r_a, w_a)\) of the mth device's local dataset.
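To make Eqs. 1 and 2 concrete, the sketch below evaluates the global objective for a toy problem. Several choices here are assumptions made purely for illustration and do not come from the paper: the parameter d is a scalar, the per-sample loss is the squared error \((d\,r_a - w_a)^2\), the weights are \(q_m = b_m / \sum_m b_m\), and the data values are invented.

```python
def local_objective(d, samples):
    """Eq. 2: mean loss of parameter d over one device's samples (r_a, w_a).

    The squared-error loss is an assumption of this sketch.
    """
    return sum((d * r - w) ** 2 for r, w in samples) / len(samples)


def global_objective(d, devices):
    """Eq. 1: weighted sum of local objectives with q_m = b_m / sum of b_m."""
    total = sum(len(samples) for samples in devices)
    return sum(
        len(samples) / total * local_objective(d, samples)
        for samples in devices
    )


# Hypothetical local datasets of two devices (b_1 = 2, b_2 = 1).
devices = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 9.0)],
]

# Compare a few candidate parameters; FL seeks the d that minimises G(d).
for candidate in (2.0, 37 / 14, 3.0):
    print(candidate, global_objective(candidate, devices))
```

With these weights, G(d) is simply the mean loss over all samples of all devices, so for this toy problem the minimiser has the closed form \(d^* = \sum r_a w_a / \sum r_a^2 = 37/14\).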

Methodology

The proposed methodology comprises several steps for executing federated learning in vehicular CPS while preserving data privacy. Fig. 3 shows the workflow of the proposed methodology, whose steps are detailed below.

Fig. 3. Secure federated learning workflow in vehicular networks.

Fig. 4. Proposed federated learning.

Fig. 4 illustrates a federated learning framework for vehicular networks in which every vehicle gathers data and trains machine learning models locally without disclosing the raw information, thus protecting privacy. A central coordinating entity, such as a server or RSU, is depicted in the middle of the design and receives the locally trained model updates (represented by green neural icons). This server combines the updates to create a global model, which is subsequently sent back to each participating vehicle. In intelligent transportation systems, this iterative approach supports safe, scalable, and real-time learning while maintaining data privacy and facilitating cooperative model improvement throughout the network.

  1. Local model updates: In a vehicular cyber-physical system, every participant trains a machine learning model using its local data; this process is known as the local model update. The updated model parameters are then sent to a central server, or directly to other participants, who combine the changes to improve an overall model. This procedure protects each participant's local data privacy while enabling the global model to benefit from their combined knowledge. Every vehicle \(v_i\) updates its local model \(m_i (t)\) using its local data \(d_i\). The update takes a step in the direction opposite to the gradient, as shown in Eq. 3 and Fig. 4:

    $$\begin{aligned} m_i(t) = m_i(t-1) - l \nabla G_i(w_t) \end{aligned}$$
    (3)

    where \(m_i (t)\) is the model of vehicle i at time t; \(m_i(t-1)\) is the model of vehicle i at time t-1; l is the learning rate, a hyperparameter that controls the step size during gradient descent; and \(\nabla G_i(w_t)\) is the gradient of the local loss function \(G_i\) with respect to the model parameters \(w_t\).

  2. Laplace mechanism for preserving the data: The Laplace mechanism is a popular method in the field of differential privacy that adds controlled noise to data or query results in order to safeguard the confidentiality of individuals. By drawing noise from the Laplace distribution, the mechanism ensures that the inclusion or removal of a single data point has no major impact on the outcome of the analysis. The Laplace mechanism is applied to the local model to ensure privacy, as shown in Eq. 4:

    $$\begin{aligned} \hat{m_i}(t) = m_i(t) + LN(0, \sigma ^2) \end{aligned}$$
    (4)

    where \(\hat{m_i}(t)\) is the perturbed local model of vehicle i at time t; \(m_i(t)\) is the original local model of vehicle i at time t; and \(LN(0, \sigma ^2)\) is the Laplace noise added to the model, drawn from a Laplace distribution with mean 0 and variance \(\sigma ^2\). By hiding the precise values of the model parameters, the added noise helps to preserve differential privacy. This prevents adversaries from deducing confidential details about the model's training data.

  3. Aggregation of model: Each vehicle's perturbed model is shared with a selected group of participants, and every vehicle updates its model by combining the models it receives. The aggregated model of vehicle j after receiving the updates from a subset SU of participants is given in Eq. 5:

    $$\begin{aligned} m_j(t) = m_j(t-1) + \frac{1}{|SU|} \sum _{k \in SU}\hat{m_k}(t) \end{aligned}$$
    (5)

    where \(m_j(t)\) is the updated model of vehicle j at time t; \(m_j(t-1)\) is the model of vehicle j at time t-1; SU is the subset of participants from which vehicle j has received perturbed models; |SU| is the number of participants in SU; and \(\hat{m_k}(t)\) is the perturbed model received from vehicle k at time t. The formula updates the model of vehicle j by aggregating the perturbed models obtained from the subset SU. This cooperative strategy protects data privacy while enhancing the global model.

  4. Aggregation by global model: The global model, which represents the combined learning of all participating vehicles, is created by combining all of the local models. The aggregation of the global model is given in Eq. 6:

    $$\begin{aligned} M(t) = \sum _i m_i(t) \end{aligned}$$
    (6)

    where M(t) is the global model at time t and \(m_i(t)\) is the local model of vehicle i at time t. This equation combines the local models of all participating vehicles to create the global model. The global model ensures a thorough and cooperative learning process by representing the collective information from all vehicles.
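The four steps above (Eqs. 3 to 6) can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions rather than the authors' implementation: models are plain lists of floats, the gradient is supplied by the caller, and the function names are hypothetical. Laplace noise with variance \(\sigma^2\) corresponds to scale \(b = \sigma/\sqrt{2}\) and is sampled here as the difference of two independent exponential variables with mean b.

```python
import random


def local_update(model, gradient, lr):
    """Eq. 3: one gradient-descent step on a vehicle's local model."""
    return [m - lr * g for m, g in zip(model, gradient)]


def laplace_noise(sigma, rng):
    """Zero-mean Laplace sample with variance sigma**2.

    Scale b = sigma / sqrt(2); the difference of two i.i.d. exponential
    variables with mean b follows Laplace(0, b).
    """
    b = sigma / 2 ** 0.5
    return rng.expovariate(1 / b) - rng.expovariate(1 / b)


def perturb(model, sigma, rng):
    """Eq. 4: add independent Laplace noise to every model parameter."""
    return [m + laplace_noise(sigma, rng) for m in model]


def peer_aggregate(own_previous, received):
    """Eq. 5: previous model plus the mean of perturbed models from SU."""
    n = len(received)
    return [
        prev + sum(column) / n
        for prev, column in zip(own_previous, zip(*received))
    ]


def global_aggregate(local_models):
    """Eq. 6: element-wise sum of the local models, as written."""
    return [sum(column) for column in zip(*local_models)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
stepped = local_update([1.0, 2.0], gradient=[0.5, -1.0], lr=0.1)
noisy = perturb(stepped, sigma=0.05, rng=rng)
print(stepped, noisy)
```

Note that Eq. 6 is implemented as a plain sum because that is how it is stated; dividing by the number of vehicles would give the mean instead.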

Secure architectural design for vehicle cyber-physical systems to identify data leaks

By utilizing federated learning (FL), the proposed architecture aims to improve security and privacy in vehicular CPS scenarios. The architecture is designed to tackle the escalating issues of safety, confidentiality, and effective real-time data processing in vehicular networks, which have become vital parts of intelligent transportation systems.

System model

The system consists of a central federated server, roadside units (RSUs), and several vehicles, as shown in Fig. 5. The main elements and their functions are as follows:

  • Vehicles: Every vehicle in the vehicular network carries sensors that gather information and identify errors. Using the data they have gathered, the vehicles train locally and produce local model updates.

  • Roadside units (RSUs): Acting as intermediaries, these devices relay communication between vehicles and the federated server. They forward the combined local updates from the vehicles to the federated server.

  • Federated server: Using the combined updates from the RSUs, the server updates the global model. It also manages the overall coordination of the federated learning procedure.

Fig. 5. Secure workflow design for vehicle cyber-physical systems.

A strong authentication method protects information transmission and connections within the vehicular network. To associate vehicles with the federated server, cryptographic keys based on Elliptic Curve Cryptography (ECC) are mapped to publicly accessible data (such as vehicle number, type, and colour). The IPP-ROT technique is used to create the ciphertext. Hashing for vehicle identification is performed with the DF-HAVAL algorithm in order to avoid data collisions and transmission delays. A public key and a private key are generated using the ECC technique, and two random numbers (X, Y) are chosen from the ECC equation, Eq. 7:

$$\begin{aligned} B^2 = A^3 + VA + C \end{aligned}$$
(7)

Next, a random number \(\lambda\) is chosen in the range [1, m-1], and the public key is computed as \(V=\beta \cdot L\), where \(\beta\) is the private key and L is a point on the curve. The ciphertext is then generated from the public key V and the private key \(\beta\) using the IPP-ROT technique. Logins are security precautions that prevent unwanted access to private information. A user is not permitted to connect to the server if the login attempt is unsuccessful or if the username and password do not match [27]. In this step, the user logs in to the federated server using the username, password, and ciphertext. If the login succeeds, the user's identity is validated in the subsequent stage. Verification is a crucial process, used primarily to prevent unauthorised people from obtaining vehicle-related data, thereby maintaining the quality of the proposed system. In the next stage, once the user has been validated, some parameters are retrieved.
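The key-generation step, in which a public key is obtained by multiplying a curve point by a private scalar, can be illustrated on a toy curve. The sketch below is not the paper's parameter set and does not implement IPP-ROT: it assumes the small textbook curve \(y^2 = x^3 + 2x + 2\) over GF(17) with base point (5, 1) of order 19, which is far too small for real use, and all function names are hypothetical.

```python
import random

# Toy curve y^2 = x^3 + A*x + B over GF(P_MOD); illustration only.
P_MOD, A, B = 17, 2, 2
BASE_POINT = (5, 1)  # generates a group of prime order 19
ORDER = 19


def inverse(x):
    """Modular inverse in GF(P_MOD) via Fermat's little theorem."""
    return pow(x % P_MOD, P_MOD - 2, P_MOD)


def add(p, q):
    """Add two curve points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p and q are inverses of each other
    if p == q:
        slope = (3 * x1 * x1 + A) * inverse(2 * y1) % P_MOD
    else:
        slope = (y2 - y1) * inverse(x2 - x1) % P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def generate_keypair(rng):
    """Private key: random scalar in [1, ORDER - 1]; public key: scalar * base."""
    private_key = rng.randrange(1, ORDER)
    return private_key, scalar_mult(private_key, BASE_POINT)


private_key, public_key = generate_keypair(random.Random(42))
print(private_key, public_key)
```

Production systems would use a standardised curve through a vetted cryptographic library and a cryptographically secure random source rather than hand-written arithmetic and the `random` module.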

An authentication procedure begins as soon as the vehicle senses data. Hash codes are created to authenticate each vehicle with the RSUs. Because vehicles are moving and pass the RSUs simultaneously, this prevents data collisions and transmission delays [28]. After data sensing, hash codes are therefore produced from the vehicle's geolocation and the sender's node IP address, which is assigned at registration. Next, the hash code is authenticated by the RSUs in order to determine which vehicle falls under which RSU. Authenticating a remote vehicle involves confirming its identity before allowing it to use a service. Finding the appropriate RSU for communication in this way improves the efficiency of data delivery. The proposed work uses the DF-HAVAL algorithm to generate the hash codes. Suppose that the geolocation \(G_a\) and the sender's node address \(NS_p\) are input to the proposed hashing technique as a message with a total length of 256 characters. In the initial stage, the input \(G_a\), \(NS_p\) is divided into blocks \(u_b\) of 2048 bits, where b = 1, 2, 3,...,n and n is the total number of partitioned blocks, which also reflects the length of the input data.
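The block partitioning and the digit-folding idea can be sketched as follows. This is explicitly not DF-HAVAL: HAVAL is not available in Python's standard library, so SHA-256 is used as a stand-in compression step, and the zero padding, the field separator, the fold width of eight digits, and all function names are assumptions made for illustration.

```python
import hashlib

BLOCK_BYTES = 256  # 2048 bits per block, as in the text


def partition(message, block_bytes=BLOCK_BYTES):
    """Zero-pad the message and split it into fixed-size blocks u_1..u_n."""
    n_blocks = max(1, -(-len(message) // block_bytes))  # ceiling division
    padded = message.ljust(n_blocks * block_bytes, b"\x00")
    return [padded[i:i + block_bytes] for i in range(0, len(padded), block_bytes)]


def digit_fold(value, group_digits=8):
    """Digit folding: split the decimal digits into groups and sum them."""
    digits = str(value)
    total = sum(
        int(digits[i:i + group_digits])
        for i in range(0, len(digits), group_digits)
    )
    return total % 10 ** group_digits


def vehicle_hash(geolocation, node_address):
    """Fold a per-block digest into a short code (SHA-256 stands in for HAVAL)."""
    message = f"{geolocation}|{node_address}".encode()
    folded = 0
    for block in partition(message):
        digest = int.from_bytes(hashlib.sha256(block).digest(), "big")
        folded = digit_fold(folded + digest)
    return folded


# Hypothetical inputs: a coordinate pair and a documentation-range IP address.
code = vehicle_hash("28.6139,77.2090", "192.0.2.17")
print(code)
```

A folded code this short is easy to compare at an RSU, but it has far fewer possible values than a full digest, so it should be viewed as an identifier for routing rather than a collision-resistant hash.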

The FedBuff (Federated Learning with Buffered Aggregation) architecture protects private information during federated learning optimization through several mechanisms. First, the server stores every local model it receives in a buffer B and updates the global model once the buffer is full. Because the buffer itself raises privacy concerns, a secure aggregation protocol is used to overcome them, allowing users to transmit their updates in a privacy-preserving way. This protocol enables the server to aggregate the local updates stored in the buffer without gaining access to any other information about an individual user's update.
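One common way to realise this kind of secure aggregation is pairwise additive masking, in which every pair of users shares a random mask that one adds and the other subtracts, so that the masks cancel in the sum. The source does not specify which protocol is used, so the sketch below is an assumed technique shown only to convey the idea. Updates are taken to be integer-quantised, the masks are derived from a fixed string seed instead of a real key agreement, and all names are hypothetical.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def pairwise_mask(i, j, length):
    """Mask shared by users i and j (seeded deterministically for the sketch)."""
    rng = random.Random(f"{min(i, j)}-{max(i, j)}")
    return [rng.randrange(MODULUS) for _ in range(length)]


def mask_update(user, update, all_users):
    """Add the mask for every higher-numbered peer, subtract it for lower ones."""
    masked = list(update)
    for other in all_users:
        if other == user:
            continue
        mask = pairwise_mask(user, other, len(update))
        sign = 1 if user < other else -1
        masked = [(m + sign * k) % MODULUS for m, k in zip(masked, mask)]
    return masked


def server_sum(masked_updates):
    """The server only adds masked vectors; the pairwise masks cancel out."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]


# Hypothetical integer-quantised updates from three vehicles.
updates = {0: [1, 2], 1: [10, 20], 2: [100, 200]}
users = list(updates)
masked = [mask_update(u, updates[u], users) for u in users]
print(server_sum(masked))
```

The server recovers only the sum of the buffered updates, while each individual masked vector looks random to it. A deployable protocol would additionally need real key agreement between users and a way to handle users that drop out.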

Fig. 6. Illustration of the FedBuff architecture showing how local model updates from vehicles are temporarily stored in RSU buffers. Once the buffer threshold is met, updates are aggregated and sent to the central server for global model update and redistribution.

The FedBuff model is used to construct the federated learning architecture, which forms the basis of the proposed design. To maintain confidentiality, every vehicle uses its own dataset to build a local model without sharing the raw data. The RSUs then receive the local model updates, buffer them, and periodically send them to the federated server. As additional updates arrive, the federated server updates the global model asynchronously using buffered aggregation, which enables scalability and effective model training. The vehicles then receive the updated global model for further local training.

Fig. 6 illustrates the buffered aggregation workflow in the FedBuff architecture, where multiple vehicles perform local training and send model updates to a buffer (typically at a roadside unit or edge server). These updates are temporarily stored until a predefined buffer threshold is reached. Once the buffer is full, the buffered updates are aggregated and sent to a central server, which then updates the global model. The newly updated global model is broadcast back to all participating vehicles, enabling continuous learning while preserving data privacy and reducing communication overhead. This approach enhances scalability and efficiency in federated learning for vehicular networks.
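The buffered workflow described above can be sketched as a small class. This is a minimal illustration under stated assumptions, not the authors' or the original FedBuff implementation: updates are treated as model deltas in the form of float lists, the buffer is averaged when it reaches its threshold, and the class name, the `server_lr` parameter, and the example values are hypothetical.

```python
class BufferedAggregator:
    """Collect local updates and apply them to the global model in batches."""

    def __init__(self, global_model, buffer_size, server_lr=1.0):
        self.global_model = list(global_model)
        self.buffer_size = buffer_size  # threshold that triggers aggregation
        self.server_lr = server_lr      # step size applied to the mean update
        self.buffer = []
        self.rounds = 0

    def receive(self, delta):
        """Store one update; aggregate once the buffer is full.

        Returns True if the global model was updated by this call.
        """
        self.buffer.append(list(delta))
        if len(self.buffer) < self.buffer_size:
            return False
        mean_delta = [
            sum(column) / len(self.buffer) for column in zip(*self.buffer)
        ]
        self.global_model = [
            w + self.server_lr * d
            for w, d in zip(self.global_model, mean_delta)
        ]
        self.buffer.clear()
        self.rounds += 1
        return True


# Hypothetical run: updates arrive one at a time, in any order.
aggregator = BufferedAggregator([0.0, 0.0], buffer_size=2)
first = aggregator.receive([1.0, 1.0])    # buffered only
second = aggregator.receive([3.0, -1.0])  # buffer full, model updated
print(first, second, aggregator.global_model)
```

Because the server acts only when the buffer fills, it never waits for a specific vehicle, which is what makes the scheme asynchronous.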

Results and discussions

Dataset

The Canadian Institute for Cybersecurity produced the CICIDS2017 dataset, a popular benchmark for network intrusion detection. It includes a range of attack scenarios that reflect real cybersecurity risks, including DDoS, Brute Force, Heartbleed, Botnet, Web Attacks, and Infiltration from within the network. The dataset, which was gathered over the course of five days, contains both legitimate and malicious network traffic, offering a thorough and well-rounded foundation for developing and testing security breach detection models. It includes the network traffic features needed for comprehensive research and precise model development, such as IP addresses, port numbers, protocols, packet sizes, and timestamps. To enable supervised learning, every data point is labelled either as normal or as belonging to a specific type of attack. Because of its broad coverage and diversity of attacks, the CICIDS2017 dataset is widely used by researchers to evaluate intrusion detection systems, particularly those based on machine learning and deep learning. Because it is publicly available, academics and practitioners who want to create and verify novel cybersecurity solutions frequently choose it.

Experimental hardware setup

To ensure transparency and reproducibility of our results, we provide the details of the hardware environment used for model training, cryptographic evaluation, and simulation. All experiments were conducted using the MATLAB R2022a environment on a high-performance workstation configured as shown in Table 2:

Table 2 Hardware and software configuration used for experiments.

GPU acceleration was selectively used for deep learning models (MLP, CNN, RNN) to optimize training time. Cryptographic operations and hash generation were performed on the CPU, as they were relatively lightweight.

Performance evaluation

The proposed federated learning model for mitigating data leakage is evaluated on the dataset described above. In addition to verifying the proposed model, this research conducts thorough tests to assess the effectiveness of the federated learning approach for security detection and hash code generation. The proposed model has been implemented in the MATLAB environment and evaluated on the CICIDS2017 dataset. In vehicular networks, including vehicle-to-vehicle and vehicle-to-infrastructure communication, WiFi connections use common protocols such as HTTP, HTTPS, FTP, SSH, and email protocols to transmit data from source to destination. The CICIDS2017 dataset contains labelled network flows with source and destination IP addresses and ports, timestamps, and protocols, together with the entire packet payload in PCAP format. Of these data, 80% are used for training and 20% for testing.

Table 3 and Fig. 7 show the time required to generate a hash code using the proposed DF-HAVAL method and compare it with other hashing techniques: Message-Digest 5 (MD5), Secure Hash Algorithm 2 (SHA-2), and HAVAL. The analysis indicates that the proposed method takes less time than the others because digit folding improves HAVAL. Digit folding helps the system produce a condensed digest from a message of any length, and this compression also lowers the hash code generation time. The proposed method has a hash time of 700 ms, which is lower than that of the other approaches. The analysis therefore shows that the proposed approach outperforms the alternatives.

Table 3 Comparison based on hash code generation.

Fig. 7. Comparative analysis of hash code generation.

Table 4 Comparison of encryption performance across different cryptography methods.

Fig. 8. Comparison of encryption performance across different cryptography methods.

Table 5 Comparison of decryption performance across different cryptography methods.

Fig. 9. Comparison of decryption performance across different cryptography methods.

Next, we assessed the encryption and decryption performance of the proposed technique by comparing it with existing cryptosystems, including RSA, DES, and ECC, as shown in Tables 4 and 5. Figs. 8 and 9 compare the encryption and decryption times of the proposed and existing methods. Because public-key operations in conventional ECC are slower, vehicle information is encrypted and decrypted using the proposed technique, which is more efficient and less time-consuming than the existing ECC. Similarly, RSA and other cryptographic techniques take a long time to sign and decrypt data and can be difficult to deploy securely, which slows the system down. With the proposed method, the encryption and decryption times for the data of fifty vehicles are 4045 ms and 4066 ms, respectively.

Table 6 Performance metrics.

Fig. 10. Performance analysis of proposed approach with existing approaches.

Table 6 and Fig. 10 present the empirical evaluation of the proposed intrusion detection system in terms of its performance metrics on the dataset. The proposed federated learning approach is evaluated against machine learning algorithms such as MLP, RNN, and CNN using several quality metrics. Compared with the other models, the proposed method predicts attacks with an accuracy of 98.35%. Likewise, the proposed model produces higher precision and recall values, at 97.52% and 98.47%, respectively. Based on these metrics, the proposed model performs better for attack detection.

Table 7 Comparative performance analysis of federated learning techniques across key evaluation metrics—accuracy, precision, and recall.
Fig. 11. Comparative performance analysis of federated learning techniques across key evaluation metrics: accuracy, precision, and recall. The proposed method (FedBuff + ECC) demonstrates significantly improved performance compared to traditional approaches such as output perturbation, objective perturbation, bit-choosing algorithms, and encryption-based resource allocation.

Fig. 11 and Table 7 compare the accuracy, precision, and recall of the proposed approach with those of four existing federated learning approaches for privacy-preserving intrusion detection. The proposed FedBuff + ECC framework significantly outperforms traditional techniques such as output perturbation (Ref [22]), objective perturbation (Ref [23]), the bit-choosing algorithm (Ref [24]), and encryption-based resource allocation (Ref [25]). While the conventional approaches demonstrate moderate effectiveness, with accuracy ranging from 88% to 92%, the proposed method achieves a notably higher accuracy of 98.35%, along with precision and recall values exceeding 97%. This indicates that the proposed system not only detects intrusions more accurately but also reduces false positives and false negatives, making it a more robust and reliable solution for securing vehicular cyber-physical systems.

Fig. 12. Comparison of False Alarm Rate (FAR) among various federated learning privacy-preserving approaches.

The False Alarm Rate (FAR) comparison graph shown in Fig. 12 highlights the effectiveness of different federated learning techniques in minimizing false positives during intrusion detection. Lower FAR values indicate better performance, as fewer normal events are incorrectly flagged as attacks. Among the methods compared—output perturbation (Ref [22]), objective perturbation (Ref [23]), bit-choosing (Ref [24]), and encryption-based resource allocation (Ref [25])—the FAR ranges from 0.10 to 0.15, reflecting moderate susceptibility to false alerts. In contrast, the proposed FedBuff + ECC approach achieves a significantly lower FAR of 0.02, demonstrating its superior ability to accurately distinguish between normal and malicious behavior. This low FAR enhances the reliability of the system, making it particularly well-suited for real-time applications in vehicular cyber-physical systems where false alarms can lead to unnecessary disruptions or safety concerns.
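For reference, the four metrics discussed in this section can be computed from the entries of a binary confusion matrix, as sketched below. The FAR is taken here to be the usual false positive rate, FP / (FP + TN), which matches the description of normal events incorrectly flagged as attacks. The counts in the example are invented for illustration and are not results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks     fp: normal traffic flagged as attack
    tn: normal traffic passed as normal fn: attacks missed
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "far": fp / (fp + tn),  # false alarm rate
    }


# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch shows why precision and FAR move together: both depend on the number of false positives, although they are normalised by different totals.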

Challenges faced by the proposed model

Many IoT devices are connected to the Internet, which considerably raises the security threats they face. Most security breaches today are carried out online, and they often have a high success rate. In this context, FL systems are exposed to numerous risks to data security and privacy. Fig. 13 summarises the main challenges that the federated learning architecture has to address in order to improve data security in vehicular cyber-physical systems:

Fig. 13

Challenges faced by the proposed model.

  • Challenge: Huge amount of data. Vehicular CPS produce massive amounts of structured and unstructured data. The limited computation and processing capacities of vehicles and RSUs make it difficult to make the best use of the available data. Solution: Federated learning distributes data processing across vehicles and RSUs29. Each vehicle handles its own data locally, which lessens the load on central servers and removes the need for large-scale data transmission. By processing data closer to its source, edge computing and federated learning also improve response times.

  • Challenge: Security concerns. Vehicular CPS handle vehicle operational parameters and other highly sensitive data. Security failures in RSUs and vehicles can lead to data loss, unauthorised access, and serious consequences such as collisions and financial losses30. Solution: Secure data transfer between vehicles, RSUs, and base stations requires strong cryptographic techniques such as ECC and the DF-HAVAL algorithm. Continuous vulnerability assessments and security audits help to identify and mitigate security threats.

  • Challenge: Risks of leakage of information. Content caching and data transfer across vehicles make information leakage more likely31. The leakage can be significant if an RSU or vehicle is compromised. Solution: By adding noise to the data, the Laplace mechanism in federated learning protects privacy and lowers the likelihood of data leaks32. Exposure is reduced further by ensuring that only aggregated model updates, rather than raw data, are communicated.

  • Challenge: Maintaining transparency. It is critical to protect the confidentiality of data during local model updates and global model aggregation33. Solution: Differential privacy measures should be applied during local model updates so that individual data points cannot be linked back to their sources. Secure multi-party computation techniques further improve data safety and confidentiality during model aggregation.

  • Challenge: Adequate model training. Critical and technologically complex applications, such as resource allocation and autonomous mobility, require an efficient model training procedure34. Solution: Buffered asynchronous aggregation allows the global model to be trained and updated efficiently, so that it improves continually without long delays35. Adaptive learning rates improve both the convergence speed and the accuracy of the federated learning process.
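The Laplace mechanism mentioned among the solutions above can be sketched as follows. This is a minimal illustration and not the implementation used in the study: it assumes that each client clips its model update to a bounded L1 norm and then adds Laplace noise with scale sensitivity / ε, where the sensitivity is taken as twice the clipping bound. The function names, the clipping step, and all parameter values are assumptions introduced for illustration.

```python
import random


def clip_l1(update, bound):
    """Scale the update so that its L1 norm is at most `bound`."""
    norm = sum(abs(x) for x in update)
    if norm <= bound or norm == 0.0:
        return list(update)
    factor = bound / norm
    return [x * factor for x in update]


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_update(update, clip_bound, epsilon, rng):
    """Clip a local model update and perturb it with Laplace noise."""
    if epsilon <= 0 or clip_bound <= 0:
        raise ValueError("epsilon and clip_bound must be positive")
    clipped = clip_l1(update, clip_bound)
    # Assumption: any two clipped updates differ by at most 2 * clip_bound
    # in L1 norm, which is used as the sensitivity of the released vector.
    sensitivity = 2.0 * clip_bound
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in clipped]


# Hypothetical local update from one vehicle (values are illustrative).
rng = random.Random(7)
noisy_update = privatize_update([0.8, -1.5, 0.3], clip_bound=1.0,
                                epsilon=1.0, rng=rng)
print(noisy_update)
```

A smaller ε gives stronger privacy but larger noise, so in practice the privacy budget has to be balanced against model accuracy.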

By applying the proposed solutions, the federated learning framework improves the safety, privacy, and efficiency of vehicular CPS. The approach combines reliable communication protocols, efficient resource management, and local data processing to build an adaptable and resilient system that can meet the needs of modern transportation systems.

Use case scenario

The paper outlines several applications of the FL framework that are intended to enhance data security in VCPS. These scenarios show how FL can improve data privacy and security while preserving efficient real-time data analysis for VCPS. The main elements of the proposed methodology are the Laplace mechanism for data confidentiality, model aggregation, local model updates, and global model development. The use cases are explained below and shown in Fig. 14:

Fig. 14

Use case scenarios.

  • The use case scenario centres on the integration of vehicular cyber-physical systems into transportation in order to enhance efficiency, security, and entertainment36. The proposed method uses federated learning to detect instances of data leakage within a VCPS setting. For content caching and data exchange, the system provides V2V, V2R, and V2B connections37.

  • In this scenario, vehicles communicate with base stations, RSUs, and one another. RSUs receive data from the vehicles, store it, and then distribute it to other RSUs and the base station. The main objective is to transform raw data into modified representations in order to reduce the risk of data leakage during content caching and exchange38.

  • Federated learning39 is used to accomplish this: it converts raw data from several sources into learned models while preserving privacy. Each vehicle trains a local model on its own data and shares only the resulting update, never the raw data. The updates are sent to a central server, where they are securely aggregated to update the global model. To keep the procedure efficient and scalable, a buffered asynchronous aggregation mechanism is employed.

  • The advantages of this strategy include improved traffic control, greater security and efficiency in transportation, and the use of advances in 5G, wireless connectivity, and mobile identity to enhance vehicular CPS. Furthermore, secure aggregation methods and federated learning provide data protection and confidentiality.
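The buffered asynchronous aggregation step described above can be sketched as follows. This is a simplified, FedBuff-style illustration under stated assumptions and not the implementation used in the study: the server collects client deltas in a buffer, applies their average once the buffer holds `buffer_size` entries, and discounts stale deltas by 1 / sqrt(1 + staleness). The class name, the staleness discount, and all parameter values are assumptions; secure aggregation and encryption are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class BufferedAsyncServer:
    """Toy server that aggregates client deltas once a buffer is full."""
    weights: list
    buffer_size: int = 3
    server_lr: float = 1.0
    version: int = 0
    _buffer: list = field(default_factory=list, init=False)

    def submit(self, delta, base_version):
        """Accept one client delta; return True if the global model changed.

        `base_version` is the global model version the client trained on.
        """
        if len(delta) != len(self.weights):
            raise ValueError("delta has the wrong dimension")
        staleness = self.version - base_version
        if staleness < 0:
            raise ValueError("client reports a future model version")
        # Assumed staleness discount: older deltas count for less.
        scale = 1.0 / (1.0 + staleness) ** 0.5
        self._buffer.append([scale * d for d in delta])
        if len(self._buffer) < self.buffer_size:
            return False
        self._apply_buffer()
        return True

    def _apply_buffer(self):
        """Average the buffered deltas and take one global step."""
        count = len(self._buffer)
        for i in range(len(self.weights)):
            mean_delta = sum(update[i] for update in self._buffer) / count
            self.weights[i] += self.server_lr * mean_delta
        self._buffer.clear()
        self.version += 1


# Two hypothetical vehicles report deltas computed on model version 0.
server = BufferedAsyncServer(weights=[0.0, 0.0], buffer_size=2)
first = server.submit([1.0, 2.0], base_version=0)   # buffered only
second = server.submit([3.0, 4.0], base_version=0)  # triggers an update
print(first, second, server.weights, server.version)
```

Because the server updates as soon as the buffer is full, slow or disconnected vehicles do not block the other participants, which is the main reason for preferring a buffered asynchronous scheme over a fully synchronous round.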

Together, these use cases and the overall approach demonstrate the potential of federated learning to improve the safety and efficiency of vehicular cyber-physical systems, resulting in safer and more dependable transportation networks.

Conclusion and future scope

Conclusion

In this study, we present a secure VCPS based on a federated learning system with elliptic curve cryptography (ECC). The proposed model has two main stages: attack monitoring and detection. In the monitoring stage, federated learning is combined with machine learning models to recognise and classify attacks, so that the various security risks within the vehicular network are identified precisely. In the detection phase, the ECC and IPP-ROT algorithms handle secure verification and guarantee strong encryption for data transfers. ECC ensures secure data transmission, while IPP-ROT manages identity-based verification and optimised time-bound credential revocation. The proposed method was evaluated on the publicly available CICIDS2017 dataset. To confirm its efficacy, it was put through a series of experimental studies, performance analyses, and comparative analyses using a variety of metrics.

Future scope

The study opens several directions for further investigation. The key areas for future work are:

  • To further improve the privacy of data sharing in vehicular CPS, future research can investigate the integration of more sophisticated cryptographic algorithms. With the emergence of quantum computing, quantum-resistant techniques need to be developed and integrated to guard against potential future threats.

  • To manage a greater number of vehicles and RSUs effectively, research can concentrate on improving the aggregation techniques employed in federated learning. Robustness can be further enhanced by adaptive resource allocation schemes that adjust to changing traffic densities and network conditions.

  • Combining edge AI capabilities with federated learning can improve real-time decision-making and reduce latency. Hybrid architectures that combine cloud and edge computing and balance the load between them could further improve system efficiency.

  • Subsequent investigations may focus on improving differential privacy methods to offer stronger guarantees while preserving model accuracy. Additional privacy protection can be achieved by using homomorphic encryption in the federated learning process, which allows computations to be carried out on encrypted data.

  • Future work can use 5G and investigate next-generation wireless communication technologies (such as 6G) to improve the speed, dependability, and security of information exchange in vehicular CPS. This includes designing flexible communication protocols that react quickly to changing network conditions and vehicle speeds.