Introduction

The Semantic Web has transformed the way data is represented, shared, and integrated across domains. By providing a common framework to express and link data, it has enabled complex networks of interconnected information, fostering innovation and driving decision-making in fields such as healthcare, finance, and education1. However, as it grows in scale and complexity, it is becoming increasingly vulnerable to cyber attacks. The rapid proliferation of interconnected data systems has created new opportunities for malicious actors to exploit vulnerabilities and disrupt the flow of information. The massive scale and complexity of these networks make it highly challenging to identify and respond to cybersecurity breaches in real time2. Moreover, the semantic nature of the interconnected data introduces additional complexity, because attackers can manipulate or distort ontological relationships and data semantics to conceal malicious activities.

Cyber threats are constantly evolving, adopting newer techniques and tactics to evade detection mechanisms. This sophistication poses significant challenges for intrusion detection systems (IDS), making it increasingly difficult to identify and respond to potential threats with the required accuracy3. Undetected intrusions can have major consequences, compromising core security properties of critical systems and services such as data confidentiality, integrity, and availability4,5. To address emerging threats, researchers have proposed multiple IDS methodologies, which can be broadly classified into two types. Signature-Based Intrusion Detection Systems (SIDS) rely on pre-defined patterns or signatures of known attacks. They are effective at detecting threats already present in the pattern databases, but they struggle with zero-day attacks and novel vulnerabilities. Anomaly-Based Intrusion Detection Systems (AIDS) identify unusual or malicious behaviour by its deviation from normal system activity, but they are more prone to false positives and false negatives, depending on the complexity of the network traffic and the difficulty of judging the legitimacy of user behaviour6,7. Against this landscape, the importance of securing the Semantic Web and interconnected data systems cannot be overstated: a single breach or compromise can lead to consequences that include, but are not limited to, theft of sensitive information, disruption of critical services, and erosion of trust in these systems8,9. A lack of effective security measures can hinder innovation, stifle collaboration, and limit the potential benefits of these technologies.
With the increasing need to secure the Semantic Web, novel approaches are required to address the unique challenges posed by these complex systems.

Traditional cyber security strategies are often not well suited to detecting and mitigating threats in interconnected data systems, because doing so requires a deeper understanding of semantic relationships, data semantics, and network architectures. Addressing these challenges calls for a solution that can adapt to the evolving threat landscape and to the characteristics of these systems. This involves novel techniques such as nature-inspired cyber security algorithms that optimize security resource allocation, detect malicious activities, and respond to threats in real time10.

It is known that in the context of securing Semantic Web frameworks, behavior trust11 is of paramount importance because the semantic and interconnected nature of these systems makes them uniquely vulnerable to sophisticated attacks that traditional security measures cannot adequately address. The proposed swarm optimization defense mechanism fundamentally relies on establishing behavior trust by continuously monitoring key performance metrics like Packet Delivery Ratio to establish a baseline of normal node behavior. This continuous calculation is critical for distinguishing between legitimate operations and malicious activities, such as Distributed Denial of Service attacks, which often manifest as subtle deviations in performance rather than obvious signature-based threats. By quantifying the trustworthiness of nodes based on their real-time behavior, the system can dynamically identify compromised or under-performing agents, autonomously reallocate security resources, and mitigate threats in real-time, thereby ensuring the integrity, reliability, and overall resilience of the entire Semantic Web infrastructure.

This research focuses on developing a swarm-optimization-inspired cyber security approach to enhance the resilience of the Semantic Web and interconnected data systems against several types of cyber attacks. The proposed mechanism draws inspiration from the collective behaviour of insects to create a distributed strategy for allocating limited security resources and detecting malicious activities. The goal is therefore a novel, nature-inspired security solution that effectively secures these critical systems and ensures their trustworthiness and reliability12.

Contribution of the Paper: In this paper, we target securing the Semantic Web using a Nature Inspired Cyber Security algorithm based on swarm optimization, proposing a novel approach to detect and respond to anomalous behaviour of nodes in a complex network. The primary contributions of this paper can be summarized as follows: (i) a continuous calculation and monitoring methodology that tracks key network parameters such as Packet Delivery Ratio for the nodes and intermediaries in a network, thereby enabling malicious activities to be detected and reported in real time; (ii) a methodology to determine the nodes that are probably affected by low performance, based on the network parameter readings; (iii) the use of performance metrics to detect attacks such as Distributed Denial of Service (DDoS), which is a critical threat in Semantic Web and knowledge graph (KG) systems; (iv) the design of a novel swarm optimization algorithm that leverages the collective behaviour of insects to optimize resource allocation, detect anomalies, and respond to cyber threats in real time.

The proposed defense framework is integrated into intrusion detection and prevention systems and is evaluated through a dedicated testbed to analyze node behavior under both normal and attack conditions. Experimental analysis of critical metrics, including packet flow and transfer rates, demonstrates that the Artificial Bee Colony (ABC) algorithm effectively performs dynamic resource optimization, thereby enhancing threat mitigation. The incorporation of a distributed adaptive defense mechanism enables Semantic Web nodes to autonomously adapt to evolving attack vectors, addressing significant security gaps and reinforcing both the reliability and integrity of knowledge-driven infrastructures.

Related work

The research leverages an AI-assisted Computer Network Operations testbed that provides a Nature Inspired Cyber Security algorithmic approach, thereby delivering an adaptive defense strategy for robust applications13. Network Intrusion Detection Systems serve as the primary line of defense against cyber threats to the security and integrity of cyber-physical systems. These systems are crucial for identifying, monitoring, and responding to malicious activities that can compromise the functionality of interconnected networks14. Attack scenarios in distributed robust systems begin at the core network layer; anomaly detection therefore becomes a complex operation when developing intrusion detection strategies4,15,16, where the probability of a behavior indicates normal, attack, and defense scenarios17. The work in18 presents a framework model that achieved 99% detection accuracy on real-time datasets. Swarm optimization techniques have demonstrated their effectiveness in optimizing cyber defense strategies within Semantic Web and IoT environments19,20, enhancing the security measures defined on interconnected semantic data. These techniques offer adaptive and intelligent solutions that harden defenses against evolving cyber threats in Semantic Web environments.

Table 1 Summary of related work and the datasets used.

Table 1 provides a summary of the applications of multiple Nature Inspired algorithms for Intrusion Detection and Prevention Systems. Although most works today concentrate on protecting complicated networks, our method introduces a new nature-inspired cyber security algorithm that utilizes swarm intelligence to counter the special vulnerabilities of semantic web and knowledge graph systems. In contrast to traditional approaches based on static or centralized analysis, our approach introduces a real-time response and continuous monitoring framework that utilizes the aggregate behavior of the agents to detect, localize, and counter threats such as Distributed Denial of Service (DDoS) attacks. The primary innovation in our work is in the synergistic blending of real-time monitoring of network parameters with a tailor-made swarm optimization algorithm, allowing for a more responsive and robust security stance against intelligent, coordinated threats.

Table 2 Comparison of related works and the proposed approach.

Table 2 compares our proposed security approach with existing research, highlighting the limitations of current methods and explaining how our work, which focuses on nature-inspired swarm optimization for real-time threat detection in the Semantic Web, offers a solution more closely tailored to this setting.

Swarm optimization algorithms are inspired by the natural behavior of swarms and use their collective intelligence and self-organizing properties to tackle complex optimization challenges27. When applied to securing the Semantic Web, these algorithms provide innovative solutions for managing security risks in interconnected data environments. For instance, nature-inspired algorithms such as the Artificial Bee Colony (ABC) algorithm model the foraging behavior of honey bees, in which the bees work in parallel to find and exploit resources efficiently. In the context of cyber security, this approach helps to dynamically distribute security measures, detect probable threats, and optimize the overall defense strategy, thereby enhancing the protection of the Semantic Web and helping to ensure its integrity and trustworthiness28,29,30.

As indicated in Algorithm 1, the optimization process begins by initializing a population of candidate solutions, herein termed ’agents’. The agents then update their positions, or more generally their states, based on their own adaptive behavior and the collective intelligence of the swarm. The swarm collectively refines its search by adjusting how it explores the search space. The algorithm continues until a stopping criterion is met, for example a maximum number of iterations. The best solution found during the loop is taken as the final outcome of the process.

Framework overview

The architecture as defined in Fig.  1 is intended to capture the complex network behavior of a semantic web setting, allowing for a testbed to be used for security and performance evaluation. It is organized into three main layers: the Routing Layer, the Clustering Layer, and the Host Layer.

The Routing Layer consists of three routers that are interconnected with each other: Router1 (R1), Router2 (R2), and Router3 (R3). These routers are the lifeline of the network, and they provide unhindered data transfer and connectivity between various clusters. The connectivity between them enables us to analyze how routing protocols and traffic management at the core level affect the overall health and security of the network.

The Clustering Layer is comprised of five switches or groups – C1SW, C2SW, C3SW, C4SW, and C5SW. Each group is a local intermediary that handles the connections between the routers and a set of host nodes. This layer plays a key role in the analysis of resource distribution, load distribution, and the encapsulation of abnormal behavior in individual network segments.

Lastly, the Host Layer is home to a diverse set of host nodes. These nodes are not homogeneous; they are designed with different network topologies that reflect the intricate data interactions and semantic links of actual semantic web settings. Such a design makes it possible for a detailed analysis of how various data flow patterns and node distributions influence network performance, security, and the spread of threats. Through the simulation of these particular conditions, the setup enables a full evaluation of how well our suggested security algorithm can detect and react to malicious behaviors in a very complex and networked system.
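The three-layer structure described above can be written down as a small graph, which is often a convenient starting point for experimenting with such a testbed. The following is a minimal sketch only: the assignment of switches to routers, the number of hosts per cluster, the host names, and the star-shaped attachment of hosts to their switch are all assumptions made for illustration and are not taken from the testbed itself.

```python
from itertools import combinations

ROUTERS = ["R1", "R2", "R3"]
SWITCHES = ["C1SW", "C2SW", "C3SW", "C4SW", "C5SW"]


def build_topology(hosts_per_cluster=2, uplinks=None):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    # Assumption: the text does not say which router each switch uplinks to,
    # so switches are spread over the routers round-robin by default.
    if uplinks is None:
        uplinks = {sw: ROUTERS[i % len(ROUTERS)] for i, sw in enumerate(SWITCHES)}

    adjacency = {}

    def link(a, b):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Routing Layer: the three routers are interconnected with each other.
    for a, b in combinations(ROUTERS, 2):
        link(a, b)

    # Clustering Layer: each switch sits between a router and its hosts.
    for switch, router in uplinks.items():
        link(switch, router)

    # Host Layer: hypothetical host names, attached directly to the switch.
    # Real clusters would use tree, mesh, or other intra-cluster topologies.
    for cluster, switch in enumerate(SWITCHES, start=1):
        for h in range(hosts_per_cluster):
            link(switch, f"C{cluster}H{h}")

    return adjacency


topology = build_topology()
```

A graph in this form can be handed to a path-finding or traffic model so that the effect of an attack on one host can be traced through its switch to the routing core.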

Algorithm 1 Generic swarm optimization algorithm.

The presented algorithm illustrates the basic use of swarm-based optimization. A set of agents is initialized with defined parameters such as population size, iteration limit, and stopping criteria. Each agent is assigned an initial position, representing a candidate solution. During the iterative process, the fitness of each agent’s position is evaluated using the objective function. The algorithm continuously compares these fitness values to identify and update the best solution found so far. This process is repeated until the stopping condition is met, at which point the algorithm outputs the position with the highest fitness as the optimized solution.
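The loop described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a reproduction of Algorithm 1: the movement rule (a random step toward the best-known position plus a small Gaussian perturbation), the parameter defaults, and the function name `swarm_optimize` are all hypothetical choices made for the example.

```python
import random


def swarm_optimize(fitness, dim, n_agents=20, max_iter=100,
                   bounds=(0.0, 1.0), seed=0):
    """Maximize `fitness` with a simple best-following swarm."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Initialization: every agent gets a random position (candidate solution).
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = max(agents, key=fitness)[:]
    best_fit = fitness(best)

    # Stopping criterion (assumed): a fixed iteration budget.
    for _ in range(max_iter):
        for pos in agents:
            for d in range(dim):
                phi = rng.random()
                # Move toward the best-known position, plus a small random
                # perturbation so the swarm keeps exploring.
                step = phi * (best[d] - pos[d]) + rng.gauss(0.0, 0.05)
                pos[d] = min(hi, max(lo, pos[d] + step))
            # Evaluate the new position and keep the best solution so far.
            fit = fitness(pos)
            if fit > best_fit:
                best, best_fit = pos[:], fit

    return best, best_fit
```

For example, calling `swarm_optimize(lambda p: -sum((x - 0.7) ** 2 for x in p), dim=2)` searches the unit square for the point closest to (0.7, 0.7). In the intrusion detection setting, the objective function would instead be derived from a network metric such as the packet delivery ratio.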

Problem formulation & fitness function

The proposed methodology focuses on implementing swarm optimization techniques within an Intrusion Detection and Prevention System designed for the Semantic Web. This approach aims to enhance security by leveraging the interconnected and semantic nature of these systems to dynamically monitor and protect network activities. Packet Delivery Ratio (PDR) is a suitable real-time indicator of Distributed Denial of Service (DDoS) attacks in a Semantic Web scenario, since it offers a direct, quantifiable measure of the impairment of network performance. During a DDoS attack, malicious traffic floods the network and, consequently, genuine data packets are dropped. This results in an abrupt and measurable reduction in PDR across the affected nodes and paths. Real-time monitoring of PDR enables prompt identification of this drop, which serves as an early warning sign of an ongoing attack. Our swarm optimization approach has a definite goal: maximizing the network-wide PDR through ongoing identification and blocking of the malicious nodes responsible for the PDR drop, thereby restoring the network to its original optimal state.
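To make the idea of PDR as an early warning sign concrete, the sketch below compares each new PDR reading with a rolling baseline of recent readings and raises an alarm on an abrupt drop. It is a minimal illustration only: the class name `PdrMonitor`, the window length, and the drop threshold are assumptions chosen for the example and are not values reported in this work.

```python
from collections import deque


class PdrMonitor:
    """Flag abrupt PDR drops relative to a rolling baseline."""

    def __init__(self, window=5, drop_threshold=0.2):
        # Both values are illustrative assumptions, not tuned parameters.
        self.history = deque(maxlen=window)
        self.drop_threshold = drop_threshold

    def observe(self, pdr):
        """Record one PDR reading; return True if it signals a sharp drop."""
        alarm = False
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            alarm = (baseline - pdr) > self.drop_threshold
        # Readings taken during an alarm are not added to the baseline,
        # so an ongoing attack does not redefine what counts as normal.
        if not alarm:
            self.history.append(pdr)
        return alarm
```

Feeding the monitor a stable sequence of readings followed by a sudden low value triggers the alarm only on the low value. Excluding alarmed readings from the baseline is a design choice that keeps a sustained attack from gradually being accepted as normal behavior.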

Equation 1 determines the solution set from the optimal fitness value, obtained by evaluating the packet delivery ratio \(pdr_n\) of the nodes in the interconnected network over a time period t, where n is the number of agents present in the swarm. The update is performed as given in Equation 2: the packet delivery ratio is moved toward the optimal solution by a random factor \(\phi\) applied to the difference between the optimal and the current value, and the result is scaled by the fitness values evaluated from the pdr values of the swarm.

$$sol = optimal(pdr_n) \in time(t) \tag{1}$$
$$pdr_n^{(t+1)} = \left( pdr_n^{(t)} + \phi \cdot \left( pdr_{\text{optimal}, n} - pdr_n^{(t)} \right) \right) \cdot \sum_{pdr \in swarm} fit_{pdr}^{swarm} \tag{2}$$
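As a worked illustration of the update in Equation 2, the following sketch evaluates one step for a single node. It assumes that \(\phi\) is drawn uniformly from [0, 1) when not supplied and that the swarm fitness values are passed in as a plain list; both the function name `update_pdr` and these conventions are assumptions made for the example, since the text does not fix them.

```python
import random


def update_pdr(pdr_t, pdr_optimal, swarm_fitness, phi=None, rng=random):
    """One application of Equation 2 for a single node.

    pdr_t         -- current value pdr_n^(t)
    pdr_optimal   -- best-known value pdr_(optimal, n)
    swarm_fitness -- iterable of fit_pdr^swarm values (assumed given)
    phi           -- random factor; drawn from [0, 1) if omitted (assumption)
    """
    if phi is None:
        phi = rng.random()
    # Move the current value toward the optimum by the factor phi ...
    moved = pdr_t + phi * (pdr_optimal - pdr_t)
    # ... then scale by the summed fitness of the swarm.
    return moved * sum(swarm_fitness)
```

With `pdr_t = 0.5`, `pdr_optimal = 0.9`, `phi = 0.5`, and fitness values that sum to 1, the inner term is 0.5 + 0.5 × 0.4 = 0.7 and the updated value is 0.7. If the fitness values sum to anything other than 1, the result is rescaled accordingly, so in practice the fitness terms would need to be normalized to keep the updated value within [0, 1].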

The average Packet Delivery Ratio (PDR) is determined by analyzing the packet size and the rate of flow of packets through the nodes and intermediaries within the Semantic Web network. Taking the semantic relationships and data dependencies into account, this measurement provides insight into the efficiency and reliability of data transmission across the interconnected nodes, which helps to identify potential disruptions or security vulnerabilities within the Semantic Web infrastructure.
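For reference, the sketch below computes a per-node and an average PDR from packet counters. It assumes the commonly used definition of PDR as packets received divided by packets sent; the counter layout and the node names are hypothetical, and the exact way packet size and flow rate enter the measurement in the testbed is not reproduced here.

```python
def packet_delivery_ratio(sent, received):
    """PDR under the usual definition: received packets / sent packets."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must lie between 0 and sent")
    return received / sent


def average_pdr(counters):
    """Average PDR over nodes; `counters` maps node -> (sent, received)."""
    if not counters:
        raise ValueError("no nodes to average over")
    ratios = [packet_delivery_ratio(s, r) for s, r in counters.values()]
    return sum(ratios) / len(ratios)


# Hypothetical counters for two hosts, for illustration only.
example = {"hostA": (100, 90), "hostB": (200, 100)}
example_average = average_pdr(example)  # (0.9 + 0.5) / 2
```

Averaging per-node ratios, rather than dividing total received by total sent, gives every node equal weight regardless of how much traffic it carries, which makes a single badly affected host more visible.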

Implementation and experimental setup

Figure 1 shows three interconnected routers: Router1 (R1), Router2 (R2), and Router3 (R3). The routers are connected to five switches, or clusters: C1SW, C2SW, C3SW, C4SW, and C5SW. The switches connect the routers to the host nodes. The host nodes are arranged in different network topologies, as discussed in13. They encapsulate distinct semantic relationships and data interactions, which reflect the complex web of connections in Semantic Web environments. This configuration makes it possible to explore different network dynamics and topologies and to assess their impact on data flow, security, and overall network performance within the Semantic Web framework. The algorithmic code base can be accessed from https://github.com/chirag-ganguli/Nature-Inspired-Swarm-Optimization-Paradigms-for-Securing-Semantic-Web.

Fig. 1 Network architecture (including Malicious Nodes).

Algorithm 2 Improved swarm optimization technique in semantic web.

Algorithm 2 introduces a modified swarm optimization algorithm for Intrusion Detection Systems, in which a network parameter, the Packet Delivery Ratio (PDR), determines the fitness of the swarm agents present in the interconnected network. The optimal PDR value used to calculate the network’s fitness provides a way to identify malicious nodes that are likely to have attached to the network, since such nodes reduce the fitness value of the agents they attach to.
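The idea of using PDR-based fitness to single out affected nodes can be illustrated with a short sketch. This is not Algorithm 2 itself: it assumes that fitness is the node's PDR relative to the best PDR currently observed, and that a node is suspect when its fitness falls more than a fixed tolerance below that optimum. The function name, the tolerance, and the readings are all hypothetical.

```python
def flag_suspect_nodes(node_pdr, tolerance=0.25):
    """Score nodes by PDR relative to the best node and flag the outliers.

    node_pdr  -- mapping node name -> current PDR in [0, 1]
    tolerance -- allowed shortfall from the optimum (assumed value)
    """
    optimal = max(node_pdr.values())
    fitness = {
        node: (pdr / optimal if optimal > 0 else 0.0)
        for node, pdr in node_pdr.items()
    }
    suspects = sorted(n for n, f in fitness.items() if f < 1.0 - tolerance)
    return fitness, suspects


# Illustrative readings only; not measurements from the testbed.
readings = {"node_a": 0.95, "node_b": 0.92, "node_c": 0.35}
fitness, suspects = flag_suspect_nodes(readings)
```

Here `node_c` is flagged because its PDR is far below that of its peers, while `node_b` stays within tolerance. A relative score of this kind adapts to overall network conditions, but it cannot detect an attack that degrades every node equally, which is one reason to combine it with a time-based baseline.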

Network architecture

In the network, three interconnected routers (Router 1, Router 2, Router 3) are attached to five network intermediaries (clusters), which are in turn attached to host nodes configured in multiple network topologies, thereby providing variation for evaluating the proposed defense strategy.

Table 3 Cluster Information of the testbed network.

Table 3 showcases that the clusters present in the interconnected network are attached to 10 nodes (host devices). To visualize the attack scenarios in our presented architecture, an internal network is established between the malicious nodes from the external environment and the network cluster host nodes. The mentioned testbed has been adapted to align with the current working scenario of this research.

Attack simulation

To emulate the DDoS attack, we used a custom script within the NS-2 (Network Simulator 2) platform. The script produces a large number of TCP SYN packets from several compromised nodes directed at a target node in our network simulation. The attack was launched by configuring a high packet generation rate at every attacking node, with the overall attack lasting multiple seconds to generate the results. This controlled simulation reproduces the behavior of a real-world DDoS attack, flooding the target’s resources and causing a drastic decrease in its PDR. Using NS-2 allowed precise control over the simulation parameters, which provided a repeatable and stable environment in which to obtain performance measures and validate our algorithm’s effectiveness.
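The effect of such a flood on PDR can be approximated with a very simple capacity model, shown below. This is a back-of-the-envelope sketch in Python and not the NS-2 script used in the experiments: it assumes the target can process a fixed number of packets per second and that, once overloaded, it drops legitimate and attack packets in equal proportion. All rates are hypothetical.

```python
def legitimate_pdr(capacity, legit_rate, attack_rate, n_attackers):
    """Approximate PDR of legitimate traffic at a flooded node.

    capacity    -- packets per second the target can process (assumed)
    legit_rate  -- legitimate packets per second offered to the target
    attack_rate -- packets per second generated by each attacker
    n_attackers -- number of attacking nodes
    """
    offered = legit_rate + attack_rate * n_attackers
    if offered <= capacity:
        return 1.0  # no overload, nothing is dropped
    # Assumption: drops are shared proportionally between all packets.
    return capacity / offered


normal = legitimate_pdr(capacity=1000, legit_rate=200, attack_rate=0, n_attackers=5)
attack = legitimate_pdr(capacity=1000, legit_rate=200, attack_rate=960, n_attackers=5)
```

In this example the node delivers all legitimate traffic under normal load, while five attackers sending 960 packets per second each raise the offered load to 5000 packets per second and reduce the PDR to 0.2. A packet-level simulator such as NS-2 additionally captures queueing, TCP handshake state, and timing effects that this model ignores.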

Key parameters used for swarm algorithm

For our swarm optimization algorithm, several key parameters were used to govern the behavior of the agents and the optimization process. These parameters, derived from the output of our custom NS-2 script, are detailed in Table 4. They include the Swarm Size (N), which determines the number of agents participating in the search; the Inertia Weight (w), which balances the agents’ exploration and exploitation tendencies; the Cognitive Coefficient (\(c_1\)) and Social Coefficient (\(c_2\)), which dictate the influence of an agent’s individual best performance and the swarm’s best performance, respectively; the Maximum Velocity (\(V_{max}\)), which constrains the speed of the agents31; and the Maximum Iterations (\(Iter_{max}\)), which sets the stopping condition for the optimization process. These parameters were meticulously tuned based on the network readings captured by our NS-2 simulation to ensure an optimal balance between efficient threat detection and effective resource allocation.
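The parameters listed above (inertia weight, cognitive and social coefficients, maximum velocity) correspond to the terms of the canonical particle swarm velocity update, which is sketched below for a single one-dimensional agent. The numerical defaults are placeholders chosen for illustration and are not the values of Table 4; the function name `pso_step` is likewise hypothetical.

```python
import random


def pso_step(x, v, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, v_max=0.2, rng=random):
    """One canonical particle-swarm update for a scalar position.

    w     -- inertia weight: how much of the previous velocity is kept
    c1    -- cognitive coefficient: pull toward the agent's own best
    c2    -- social coefficient: pull toward the swarm's best
    v_max -- maximum velocity: clamps the step size
    (All default values are illustrative placeholders.)
    """
    r1, r2 = rng.random(), rng.random()
    v_new = (w * v
             + c1 * r1 * (personal_best - x)
             + c2 * r2 * (global_best - x))
    # Constrain the speed of the agent to [-v_max, v_max].
    v_new = max(-v_max, min(v_max, v_new))
    return x + v_new, v_new
```

A larger inertia weight favors exploration because agents keep more of their momentum, whereas a larger social coefficient pulls the swarm more quickly toward the current best solution. Repeating this step for every agent until the maximum number of iterations is reached gives the overall optimization loop.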

Table 4 Swarm algorithm parameters.

Results and discussion

Case study

Based on the proposed methodology, experimental analysis was performed by connecting five malicious nodes to each of several host nodes present in the network. The malicious nodes bombard the connected host nodes with a huge volume of packets, thereby disrupting the availability of the node; this is referred to as a Distributed Denial of Service attack. The connections of the malicious nodes are listed in Table 5.

Table 5 Connection of malicious nodes.

Case 1 - DDoS on Cluster 4 (Tree Topology): Five malicious nodes (Mal0, Mal1, Mal2, Mal3, and Mal4) are attached to Cluster 4 Host Node 7, which plays a critical role in processing and forwarding semantic data packets through the Cluster 4 switch within the Semantic Web network. The malicious nodes are configured to generate a huge volume of traffic directed at the target node, thereby overwhelming its allocated resources and disrupting the normal flow of semantic data. The analyzed strategy involves injecting a high volume of data packets into the network, exploiting the semantic relationships and connections to bypass traditional defenses. Figure 2 shows the average packet delivery ratio in the normal, attack, and proposed defense scenarios.

Fig. 2 Case 1: Average PDR - Normal, Attack, and Defense.

Case 2 - DDoS on Cluster 5 (Mesh Topology): Five malicious nodes (Mal0, Mal1, Mal2, Mal3, and Mal4) are connected to Cluster 5 Node 2 in the same manner as in Case 1. The malicious packets can disrupt the normal workflow of the semantic network, resulting in a Distributed Denial of Service attack. The proposed defense, when applied to the targeted host node, dynamically reallocates security resources and detects anomalies within the Semantic Web’s interconnected environment. This enhances the network’s resilience against similar distributed attacks. Figure 3 shows the average packet delivery ratio in the normal, attack, and proposed defense scenarios.

Fig. 3 Case 2: Average PDR - Normal, Attack, and Defense.

The defense performed slightly better in the Tree topology (Case 1) compared to the Mesh topology (Case 2). This can be attributed to the inherent redundancy/path complexity of the mesh network, which the swarm algorithm was able to leverage more effectively.

Conclusion and future work

The proposed study improves the security of the Semantic Web by analyzing the behavior of network intermediaries and host nodes across different topologies and application environments. Using a base testbed13, malicious nodes were introduced into defined network clusters to evaluate system performance under both normal and attack conditions. Key security metrics, including packet counts and transfer rates from malicious nodes to targeted hosts, were assessed to understand their impact on the overall functionality of the network. The prime objective of the proposed research is to integrate nature-inspired cyber defense mechanisms within intrusion detection and prevention systems tailored for the Semantic Web. The experimental results, obtained by applying swarm optimization techniques such as the Artificial Bee Colony (ABC) algorithm, which is modeled on the foraging behavior of bees, demonstrate significant improvements in the security infrastructure. The ABC algorithm has shown strong capabilities in solving complex optimization problems, including function optimization, feature selection, and clustering, owing to its high adaptability and ease of integration. Swarm optimization algorithms, through their collective intelligence and distributed learning, offer a powerful and adaptive framework to explore vast solution spaces and design defense strategies10,32. Their application improves the reliability, integrity, and resilience of Semantic Web systems against evolving cyber threats.

Future work may extend the proposed approach to hybrid swarm intelligence frameworks by integrating multiple optimization techniques for improved detection accuracy. Researchers may also focus on combining machine learning and deep learning models with swarm-based defense mechanisms to adapt more effectively to dynamic attack patterns.