Introduction

Underwater Acoustic Networks (UANs) have emerged as critical enablers in marine science and technology, with applications ranging from marine engineering and environmental monitoring to national defense and disaster response. Given that oceans cover over 70% of the Earth’s surface, UANs—leveraging acoustic communication between sensors and autonomous underwater vehicles—play a pivotal role in resource exploration and large-scale underwater deployments1,2,3. Foundational advancements in areas such as underwater communication modems, routing protocols, Media Access Control (MAC) mechanisms, and localization methods have significantly enhanced UAN capabilities4,5. However, as the demand for extensive Underwater Wireless Sensor Networks (UWSNs) increases, several challenges remain unresolved.

Acoustic modem technology, the backbone of UWSNs, faces significant constraints in enabling high-throughput, long-term monitoring. Multi-hop routing schemes, while promising, often suffer from imbalanced energy consumption, unreliable links, and excessive delays, which degrade network performance under dynamic underwater conditions6,7. Moreover, underwater environments amplify challenges such as signal attenuation, variable propagation delays, and limited bandwidth, all of which constrain the scalability and reliability of UWSNs8. These limitations underscore the necessity for innovative, energy-efficient network protocols to support sustainable, long-term operations in underwater environments.

In addition to environmental constraints, the open nature of underwater channels exposes UWSNs to significant security vulnerabilities, including data breaches and unauthorized access9,10. Existing cryptographic techniques are computationally expensive and poorly suited to resource-limited UWSNs, which creates a need for lightweight and efficient protection approaches.

To overcome these pressing constraints, this study introduces a dynamic and trustworthy networking framework aligned with the vision of future communication paradigms and 6G technology, which emphasizes self-sustainability and intelligent networking. The proposed framework combines energy-neutral multi-hop routing with blockchain-based technology to improve security, make data transmission more effective, and extend network lifetime. Blockchain-based cryptographic methods such as hashing and digital signatures are embedded in a Multi-Agent System (MAS) to support intelligent, collaborative decision-making, from low-level oceanographic features to high-level cognitive behavior, in complex underwater environments11,12.

UWSNs face unique challenges, including bandwidth limitations, interference, energy scarcity, and high-security risks, all of which restrict their operational viability as shown in Fig. 1. The underwater acoustic medium, characterized by high signal attenuation and substantial delays, hinders real-time data exchange, which is essential for applications like marine monitoring and disaster management4.

Fig. 1. Abstract view of novelty.

Dynamic underwater conditions, influenced by factors such as marine topography and ambient noise, lead to frequent disruptions in communication, necessitating adaptive strategies for reliable data transmission5. Furthermore, the absence of feasible battery recharging or replacement options in remote underwater deployments places a premium on energy-efficient protocols to ensure prolonged network operation6,7.

Our proposed system introduces an adaptive, multi-hop routing framework that employs intelligent node agents capable of autonomous, proactive decision-making. These agents dynamically assess link quality and adjust power levels to enhance communication efficiency and network lifespan. The framework incorporates robust cryptographic mechanisms, including hashing and digital signatures, alongside stringent access control protocols, to ensure comprehensive security against unauthorized access.

By integrating real-time adaptive strategies and robust security measures within a lightweight, energy-efficient framework, this work contributes to the broader goals of self-sustainability and intelligent networking, not only for underwater networks but also as a specialized extension of 6G paradigms. The system is particularly well-suited for critical applications such as environmental monitoring and disaster response, offering reliable, long-term operational capabilities in demanding underwater environments.

UANs have gained attention in marine science and technology for their applications in marine engineering, national defense, and environmental monitoring. With oceans covering over 70% of the Earth’s surface, UANs, which use acoustic communication among sensors and autonomous underwater vehicles, are crucial for resource exploration, disaster response, and other large-scale deployments1,2. Foundational research over recent decades has advanced components such as underwater acoustic communication modems, routing protocols, MAC protocols, and localization methods4,5. Yet, as demand for extensive UWSNs grows, significant challenges remain.

Related work

UWSNs face unique challenges, including limited bandwidth, high power consumption, and harsh environmental conditions. Recent studies have introduced packet optimization, improved communication reliability, and the integration of IoT security with AI for enhancing underwater systems13. Blockchain has also been applied to UWSNs to secure data transmission and improve operational efficiency, though these solutions often lack integrated approaches for real-time adaptation, security, and energy efficiency14.

AI techniques, such as machine learning, have been widely used in UWSNs to optimize communication parameters, yet often without blockchain integration, leaving vulnerabilities in security15. Reinforcement learning has been explored for enhancing communication efficiency but lacks robust security frameworks, critical for resource-limited underwater systems16. Other studies focus on AI-based management of UWSNs, though they frequently overlook energy efficiency and adaptability in dynamic underwater environments17. Moreover, while AI and edge computing have been applied to applications like ocean monitoring, they fail to address inherent security risks, a gap our approach aims to bridge by combining AI with blockchain.

Security is a primary concern in UWSNs due to the open nature of acoustic channels and limited resources for cryptographic processing. Blockchain technology shows promise for improving data integrity and privacy but often faces computational demands incompatible with real-time processing18. Machine learning has been explored to enhance security in UWSNs, yet lacks adaptive mechanisms that simultaneously optimize security and energy efficiency. Our approach integrates blockchain with a MAS, creating an adaptive, secure framework that addresses limitations in traditional blockchain and AI security methods15.

Energy efficiency remains a pressing issue, as recharging nodes in UWSNs is challenging. Some studies introduce low-power blockchain frameworks, yet these fail to adapt dynamically to environmental changes19. Although reinforcement learning offers promise for energy optimization, it does not incorporate blockchain for secure data management16. The proposed framework utilizes MAS to dynamically adjust power levels while maintaining secure blockchain-based communication, effectively extending network lifespan.

The proposed protocol introduces several key advancements over existing AI-based and blockchain-integrated routing protocols, making it more efficient and adaptable for UWSNs. One of the main distinguishing features of this protocol is its integrated approach, which combines MAS, blockchain technology, and acoustic communication within a unified framework. This integration allows the protocol to simultaneously address multiple challenges, including energy efficiency, security, and real-time adaptability, which are critical for UWSNs operating in dynamic and resource-constrained environments. Another significant aspect is the incorporation of cognitive intelligence through BDI reasoning, which enables nodes to make autonomous, context-aware decisions. This enhances the adaptability of the network by allowing each node to dynamically adjust its routing strategy based on real-time conditions, optimizing both performance and resource utilization.

To ensure efficiency in blockchain implementation, the protocol adopts a lightweight consensus mechanism, which significantly reduces computational overhead. Traditional blockchain-based security mechanisms often require high processing power and energy consumption, making them unsuitable for underwater environments. By optimizing the consensus process, the proposed protocol maintains security while remaining feasible for resource-constrained UWSNs. Additionally, the literature review has been expanded to provide a comprehensive comparison with state-of-the-art methodologies. This comparison highlights the existing gaps in the field and demonstrates how the proposed protocol effectively addresses these limitations, offering an innovative solution that enhances network longevity, security, and adaptability in underwater communication networks.

Intelligent routing protocol designed for the smart UCN

We deploy an intelligent routing mechanism in an underwater acoustic sensor network to provide a reliable system for 3–4 target applications. The design of the proposed system stems from the challenges identified in underwater communication systems. Building on the contributions outlined in the preceding section, the proposed system is designed to address the many intricacies of underwater environments, as shown in Fig. 2. The intelligent routing mechanism proactively manages bandwidth and interference challenges. Node agents assess the available communication links before data transmission and take preemptive measures to mitigate collision risks, optimize bandwidth utilization, and avert potential link failures. This approach improves overall network efficiency and keeps data transmission reliable in dynamically changing underwater conditions.

Fig. 2. Abstract view of the proposed system.

The proposed system adopts a multi-hop model and energy-aware methods to enable long-distance communication. Node agents continuously assess the energy levels of adjacent nodes, allowing for prudent data transmission and reducing the likelihood of packet loss in transit. The combination of these methods leads not only to more robust networks but also prolongs the lifetime of underwater communication systems.

A distance-aware network longevity paradigm further contributes to the proposed system’s sustainability. Node agents, equipped with the capability to gauge inter-nodal distances, modulate power levels judiciously to prevent network fatigue. This meticulous approach ensures sustained interoperability, addressing the imperative of prolonged network longevity.

To strengthen security, the proposed system integrates blockchain technology, applying hashing and digital signatures in the data transmission process. Prior to data transmission, each node initiates a blockchain-based security protocol in which the data is hashed (\(HashedData_i\)) and digitally signed (\(Signature_i\)).

This process ensures the integrity and authenticity of the transmitted data, safeguarding against tampering or unauthorized alterations.

Furthermore, the proposed system implements a node registration mechanism through blockchain. Each node is registered before deployment into the network, and a unique identity (\(NodeID_i\)) is assigned.

The data created by this registration provides the network with knowledge of authorized nodes and allows it to prevent access to unauthorized ones. This acts as a shield against malicious entities aiming to exploit protocol vulnerabilities.

This layer of protection covers internal operation and also guards the network against external threats through blockchain security, yielding a robust underwater communication method. This combined approach is in keeping with the system's overarching goal of addressing the problems and complications of underwater settings.

Autonomous reasoning intelligent routing mechanism

The core of this system is an intelligent routing mechanism that uses autonomous, adaptive node agents with high-level decision-making abilities. Inspired by the principles of proactivity, autonomy, and adaptability, these agents are more responsive to the dynamic aquatic environment. As underwater communication challenges grow more complex, the integration of mobility, intelligence, autonomy, recursiveness, asynchronicity, and collaboration enables these agents to adapt and negotiate, combining components for distributed problem-solving. Building on the situation-awareness paradigm, we apply a Belief-Desire-Intention (BDI) reasoning mechanism to improve the cognitive intelligence of node agents.

Intelligent node sensors

The intelligence of each node sensor (Ni) is modeled as a dynamic system with adaptive rules described by the equation:

$$\dot{x}_{i}(t) = A(N_{i}, t) \cdot x_{i}(t) + B(N_{i}, t) \cdot u_{i}(t)$$
(1)

where \(x_{i}(t)\) represents the internal state of node \(N_i\), \(\dot{x}_{i}(t)\) is its rate of change, and \(u_{i}(t)\) denotes the external stimuli at time \(t\). The adaptive rules \(A\) and \(B\) dynamically adjust the node’s cognitive state based on environmental stimuli.
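To make the behavior of Eq. 1 concrete, the following is a minimal sketch that integrates the state equation with a forward-Euler step. It assumes a scalar state and treats \(A\) and \(B\) as plain functions of time; the constant values used in the example (decay \(A = -1\), gain \(B = 1\), stimulus \(u = 1\)) and the function name `simulate_state` are illustrative assumptions, not values from the paper.

```python
def simulate_state(a, b, u, x0, dt, steps):
    """Forward-Euler integration of dx/dt = a(t) * x + b(t) * u(t) (scalar form of Eq. 1)."""
    x = x0
    t = 0.0
    trajectory = [x]
    for _ in range(steps):
        dx = a(t) * x + b(t) * u(t)  # right-hand side of the state equation
        x = x + dt * dx              # Euler update
        t += dt
        trajectory.append(x)
    return trajectory


# Hypothetical rules: constant decay A = -1, unit gain B = 1, constant stimulus u = 1.
traj = simulate_state(lambda t: -1.0, lambda t: 1.0, lambda t: 1.0,
                      x0=0.0, dt=0.01, steps=1000)
# The state rises from 0 toward the equilibrium x = 1, where A*x + B*u = 0.
```

With time-varying `a` and `b`, the same loop shows how a change in the adaptive rules shifts the state the node settles into.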

Definition 1: acquisition of context

Whenever an intelligent node collects information, its primary objective is to interpret that data. For instance, when node \(i\) wants to transmit data, it checks the predefined rules \(\gamma\) against the external context \(u_{i}(t)\) within its internal state \(x_{i}(t)\), where the rules are stored in the node's belief system \(\beta^{i}\). The beliefs regarding the context \(u_{i}(t)\) follow from Eq. 1 and are made adaptive as in the equations below:

$$\text{Set-of-Beliefs} = \beta^{i} \in \left( \dot{x}_{i}(t) \right)$$
(2)

Where,

$$\text{Set-of-Beliefs} = \beta^{i} \in \left\{ u_{i}(t) \right.$$
(3)

External context \(\:{u}_{i}\left(t\right)\) represents a different level of internal states \(\:{x}_{i}\left(t\right)\) where intelligent nodes adapt their behavior regarding the dynamic nature of the context.

Definition 2: interpret the data

Likewise, interpreting the data with the acquired information determines which actions are taken to achieve the optimal outcome. Once the internal state \(x_{i}(t)\) of node \(i\) holds, the node has rules or intentions \(\gamma\) to perform (\(\theta\)) certain actions \(\alpha\):

$$\gamma = \gamma^{i}\left( \theta\left( \alpha^{i} \right) \right)$$
(4)

Definition 3: optimal solution

After taking these actions, the intelligent reasoning agents focus on reaching the desired goal. Every predefined rule for handling the complexity of a situation has an optimal goal to achieve.
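The three definitions above map onto a simple reasoning cycle: fold the context into beliefs, select the actions whose rules fire, and check the goal. The sketch below is one possible reading of that cycle, not the authors' implementation. The class `NodeAgent`, its method names, and the `link_quality` belief are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeAgent:
    node_id: str
    beliefs: dict = field(default_factory=dict)  # beta^i: what the node currently holds true

    def acquire_context(self, context):
        """Definition 1: fold the external context u_i(t) into the belief set."""
        self.beliefs.update(context)

    def deliberate(self, rules):
        """Definition 2: return the action of every rule whose condition holds on the beliefs."""
        return [action for condition, action in rules if condition(self.beliefs)]

    def goal_reached(self, goal):
        """Definition 3: check whether the desired state holds in the current beliefs."""
        return goal(self.beliefs)


# Hypothetical rule set: transmit on a good link, otherwise wait.
rules = [
    (lambda b: b.get("link_quality", 0.0) >= 0.5, "transmit"),
    (lambda b: b.get("link_quality", 0.0) < 0.5, "wait"),
]
agent = NodeAgent("n1")
agent.acquire_context({"link_quality": 0.8})
actions = agent.deliberate(rules)
```

Keeping rules as (condition, action) pairs means new behaviors can be added without touching the agent itself, which matches the idea of predefined rules \(\gamma\) selecting actions \(\alpha\).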

Communication model

In the proposed communication model, a sophisticated intelligent reasoning mechanism routing protocol is meticulously formulated to surmount critical challenges inherent in underwater communication. The conceptual model introduces node sensors endowed with heightened cognitive capacities, enabling adaptive responses to the dynamic conditions prevailing during packet transfers as shown in Fig. 3.

Definition 1: acquisition of context

In the common case of communication among nodes as shown in algorithm 1, if the node wants to communicate in a network it has to be registered in the network, receiving a unique cryptographic key denoted as \(\:{K}_{i}\).

$$KB=\beta \in \left\{ {\begin{array}{*{20}{c}} {Key\,\,{K_i},} \\ {Sign\,\left( {{P_i},\,{K_i}} \right)} \end{array}} \right.\,\,\,\,\,\begin{array}{*{20}{c}} {if\,\,the\,\,key\,\,is\,\,valid} \\ {if\,\,signing\,\,is\,\,required} \end{array}$$
(5)

Definition 2: interpret the data

In the secure communication phase, the intelligent nodes in Algorithm 1 continuously monitor (\(\theta\)) attempts at secure communication. If a registered node attempts communication, the intelligent node uses its cryptographic key to encrypt the message (\(C \leftarrow Encrypt(message, K_{i})\)).

Once the internal state \(x_{i}(t)\) of node \(i\) is acquired, certain actions \(\alpha\) are performed to achieve secure communication, as shown in the equation below:

$$\gamma =\,{\gamma ^i}\,\left( {\theta \left( {{\alpha ^i}} \right)} \right) \in \,\left\{ {\begin{array}{*{20}{c}} {monitors\,\,communication} \\ {{K_i}} \\ {registered\,\, \in \,\,accepted} \\ {unregistered\,\, \notin \,\,accepted} \end{array}} \right.$$
(6)
Algorithm 1: Advanced Node Registration and Communication

Definition 3: optimal solution

In the inter-node communication phase, the algorithm watches communication attempts between nodes. If both nodes involved are registered, authentication occurs using their cryptographic keys (\(\:\text{Authenticate(}{K}_{i},{K}_{j})\)). If authentication is successful, secure communication is allowed; otherwise, the connection is denied.

  • \(M,s,n \vDash F\left(\exists\, b_{i}\left(u_{i}\right)\right)\), where the internal state of node \(n_{i}\) satisfies \(x_{i} \in u_{i}\), that is, it belongs to the external context of the environment. In the model \(M\), at state \(s\) with \(n = n_{1}, n_{2}, \dots\), there exists a future state in which \(u_{i}\) holds the rules correctly. Therefore, in the very next state \(s_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) related to Eqs. 1–2.

  • \(M,s,n \vDash X\left(g_{i}\left(q\left(\neg u_{i}\right)\right)\right)\), where \(x_{i} \notin u_{i}\). In the model \(M\), at state \(s\) with \(n = n_{1}, n_{2}, \dots\), there exists a future state in which \(u_{i}\) does not hold the rules correctly. Therefore, in the very next state \(s_{i+1}\), node \(i\) has the intention to perform a certain action \(\alpha\), namely denying the connection.
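The registration and authentication flow described for Algorithm 1 can be sketched as follows. This is a simplified illustration under stated assumptions: an in-memory dictionary stands in for the blockchain-backed registry, keys are 32 random bytes, and the names `Registry` and `allow_communication` are hypothetical rather than taken from the paper.

```python
import hmac
import secrets


class Registry:
    """Toy stand-in for the blockchain-backed registry of authorized nodes."""

    def __init__(self):
        self._keys = {}

    def register(self, node_id):
        """Register a node and return its key K_i (32 random bytes in this sketch)."""
        key = secrets.token_bytes(32)
        self._keys[node_id] = key
        return key

    def is_registered(self, node_id):
        return node_id in self._keys

    def authenticate(self, node_id, presented_key):
        """Accept only a registered node that presents the key issued to it."""
        stored = self._keys.get(node_id)
        # compare_digest avoids leaking information through comparison timing
        return stored is not None and hmac.compare_digest(stored, presented_key)


def allow_communication(registry, sender, sender_key, receiver, receiver_key):
    """Inter-node phase: both ends must authenticate, otherwise the connection is denied."""
    return (registry.authenticate(sender, sender_key)
            and registry.authenticate(receiver, receiver_key))
```

An unregistered node has no stored key, so `authenticate` fails and the connection is denied, which corresponds to the second case above.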

Digital signature and hashing paradigm

The digital signature \(\:{\sigma\:}_{i}\) and hashing of each packet \(\:{P}_{i}\) are mathematically expressed as:

$$\:{\sigma\:}_{i}=Sign({P}_{i},{K}_{i})$$
(7)
$$\:H\left({P}_{i}\right)=Hash\left({P}_{i}\right)$$
(8)

where \(\text{Sign}\) represents the signing function with a private key \(K_{i}\), and \(\text{Hash}\) denotes the cryptographic hash function. Because a cryptographic hash is one-way, the receiver verifies packet integrity by recomputing \(H(P_{i})\) from the received packet and checking the signature \(\sigma_{i}\) against it, rather than by evaluating an inverse \(H^{-1}(\sigma_{i})\).
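A minimal sketch of Eqs. 7 and 8 is given below. Two substitutions are assumptions of this example and should be read as such: SHA-256 is used as the hash function, and, because Python's standard library has no asymmetric signature primitive, an HMAC-SHA256 tag with a shared key stands in for \(\text{Sign}\). A deployment following the paper would use a private/public key pair instead. The function names are hypothetical.

```python
import hashlib
import hmac


def hash_packet(packet: bytes) -> bytes:
    """H(P_i): SHA-256 digest of the packet (the hash choice is an assumption)."""
    return hashlib.sha256(packet).digest()


def sign_packet(packet: bytes, key: bytes) -> bytes:
    """sigma_i: HMAC-SHA256 tag over H(P_i), standing in for a digital signature."""
    return hmac.new(key, hash_packet(packet), hashlib.sha256).digest()


def verify_packet(packet: bytes, signature: bytes, key: bytes) -> bool:
    """Receiver side: recompute the tag from the received packet and compare."""
    return hmac.compare_digest(sign_packet(packet, key), signature)
```

Any change to the packet in transit changes its digest, so the recomputed tag no longer matches and the packet is rejected.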

Algorithm 2: Compact Blockchain-based Data Protection

Algorithm 2 is designed for secure data transmission using blockchain technology. The system initializes by generating a blockchain structure for secure data storage.

Definition 1: acquisition of context

In this case, a node that intends to communicate must follow certain rules, namely digital signing and hashing. The belief set \(\beta\) of node \(i\) holds the acquired knowledge of the network regarding the packet hash \(\text{Hash}(P_{i})\) and the digital signature \(\text{Sign}(P_{i}, K_{i})\).

$$KB = \beta^{i} \in \left\{ \begin{array}{c} \mathrm{Hash}(P_{i}) \\ \mathrm{Sign}(P_{i}, K_{i}) \end{array} \right.$$
(9)

Definition 2: interpret the data

Once the internal state \(x_{i}(t)\) of node \(i\) is acquired, certain actions \(\alpha\) are performed to achieve secure communication, as shown in the equation below:

$$\gamma = \gamma^{i}\left( \theta\left( \alpha^{i} \right) \right) \in \left\{ \begin{array}{c} DataEncrypt \\ Blockchain\,consensus \\ Hash \\ ZeroKnowledgeProof \\ Sign \\ Timestamp \\ Transmit \end{array} \right.$$
(10)

Data transmission: if a node wants to send data, the data is encrypted (\(ED\)) using a session key (\(\text{SessionKey}\)) for confidentiality, as shown in the equation below:

$$\:\begin{array}{c}ED\leftarrow\:Encrypt(data,SessionKey)\end{array}$$
(11)

Blockchain consensus is achieved by:

$$\:BC\leftarrow\:BlockchainConsensus\left(\right)$$
(12)

The encrypted data is hashed (\(HD\)) with the previous block’s hash from the blockchain (\(BC.PrevHash\)) to create a fixed-size representation, as shown below:

$$\:\begin{array}{c}HD\leftarrow\:Hash(ED,BC.PrevHash)\end{array}$$
(13)

A zero-knowledge proof is performed by the following Eq. 14:

$$\:\begin{array}{c}ZKP\leftarrow\:ZeroKnowledgeProof\left(HD\right)\end{array}$$
(14)

If both blockchain consensus and the zero-knowledge proof are successful, the hash is digitally signed:

$$\:\begin{array}{c}SD\leftarrow\:Sign(HD,SecretKey)\end{array}$$
(15)

The blockchain timestamp is included:

$$\:\begin{array}{c}BD\leftarrow\:IncludeTimestamp(SD,BC)\end{array}$$
(16)

The securely packaged data is transmitted as follows:

$$\:\begin{array}{c}TP\leftarrow\:Transmit\left(BD\right)\end{array}$$
(17)

Definition 3: optimal solution

  • \(M,s,n \vDash F\left(\exists\, b_{i}\left(u_{i}\right)\right)\), where the internal state of node \(n_{i}\) satisfies \(x_{i} \in u_{i}\), that is, it belongs to the external context of the environment. In the model \(M\), at state \(s\) with \(n = n_{1}, n_{2}, \dots\), there exists a future state in which \(u_{i}\) holds the rules correctly. Therefore, in the very next state \(s_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) related to Eqs. 6–13.

  • \(M,s,n \vDash X\left(g_{i}\left(q\left(\neg u_{i}\right)\right)\right)\), where \(x_{i} \notin u_{i}\). In the model \(M\), at state \(s\) with \(n = n_{1}, n_{2}, \dots\), there exists a future state in which \(u_{i}\) does not hold the rules correctly. Therefore, in the very next state \(s_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\), namely dropping the data.

This compact algorithm ensures secure data transmission by incorporating blockchain consensus, encryption, hashing, and digital signatures, with a conditional mechanism to drop data in case of unsuccessful verification.
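The sequence of steps in Eqs. 11 to 17, including the conditional drop, can be sketched as below. Several parts are placeholders and are assumptions of this example: the cipher is a toy XOR keystream that is not secure, an HMAC tag stands in for the digital signature, and the consensus and zero-knowledge checks are passed in as callables because the paper does not specify them here. All function and field names are hypothetical.

```python
import hashlib
import hmac
import time


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(c ^ k for c, k in zip(chunk, block))
    return bytes(out)


def protect_and_send(data, session_key, secret_key, prev_hash,
                     consensus_ok, zkp_ok, transmit):
    """Follow Eqs. 11-17; return the transmitted packet, or None if the data is dropped."""
    ed = keystream_xor(data, session_key)                   # Eq. 11: encrypt
    if not consensus_ok():                                  # Eq. 12: consensus (placeholder)
        return None
    hd = hashlib.sha256(ed + prev_hash).digest()            # Eq. 13: hash with previous block hash
    if not zkp_ok(hd):                                      # Eq. 14: zero-knowledge proof (placeholder)
        return None
    sd = hmac.new(secret_key, hd, hashlib.sha256).digest()  # Eq. 15: sign (HMAC stand-in)
    bd = {"payload": ed, "hash": hd, "signature": sd,
          "timestamp": time.time()}                         # Eq. 16: attach timestamp
    transmit(bd)                                            # Eq. 17: transmit
    return bd
```

Chaining each hash to `prev_hash` is what ties a packet to its position in the ledger: replaying or reordering a packet produces a hash that no longer matches the chain.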

Fig. 3. Autonomous agents communication model.

Multi-hop topology for energy-distance-aware

Algorithm 3 is designed to optimize data transfer in an intelligent sensor network by considering both the distance between each node and the sink node and the energy levels of individual nodes. It combines distance-aware and energy-aware multi-hop routing.

$$KB={\beta ^i}\, \in \,\left\{ {\begin{array}{*{20}{c}} {\frac{{d{E_i}}}{{dt}}} \\ {D\left( {i,j} \right)} \end{array}} \right.$$
(18)

Definition 1: acquisition of context

When an intelligent node wants to communicate, it evaluates the distance function \(D(i,j)\), which is defined in terms of spatial coordinates as:

$$D(i,j)=\sqrt{(x_{i}-x_{j})^{2}+(y_{i}-y_{j})^{2}+(z_{i}-z_{j})^{2}}$$
(19)

where \(\:({x}_{i},{y}_{i},{z}_{i})\) and \(\:({x}_{j},{y}_{j},{z}_{j})\) represent the 3D coordinates of nodes i and j. The communication model dynamically selects a multi-hop topology when \(\:D(i,j)>T\), with T being the predefined threshold distance.

Additionally, energy assessment for each node (Ei) is modeled as a dynamic parameter evolving over time:

$$\:\frac{d{E}_{i}}{dt}=-{\alpha\:}_{i}\cdot\:{I}_{i}\left(t\right)$$
(20)

where \(\:{I}_{i}\left(t\right)\) is the energy consumption rate at time t, and \(\:{\alpha\:}_{i}\) represents the energy dissipation coefficient. The energy-aware routing strategy selects nodes with \(\:{E}_{i}>{E}_{\text{threshold}}\) to optimize energy consumption during data transfer.
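As a worked illustration of Eq. 20, assume the consumption rate is constant, \(I_{i}(t) = \bar{I}_{i}\) (a simplifying assumption made only for this example). Integrating Eq. 20 from \(0\) to \(t\) then gives a linear decay:

$$E_{i}(t) = E_{i}(0) - \alpha_{i}\,\bar{I}_{i}\,t$$

so the node remains eligible for relaying, \(E_{i}(t) > E_{\text{threshold}}\), until

$$t^{*} = \frac{E_{i}(0) - E_{\text{threshold}}}{\alpha_{i}\,\bar{I}_{i}}$$

With illustrative values \(E_{i}(0) = 100\) J, \(E_{\text{threshold}} = 20\) J, \(\alpha_{i} = 1\), and \(\bar{I}_{i} = 0.5\) J/s, this gives \(t^{*} = 80 / 0.5 = 160\) s. These numbers are hypothetical and are not taken from the paper's experiments.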

Definition 2: interpret the data

Specifically, the key components of Algorithm 3 are explained below:

Input:

- Node position \(\:{P}_{i}\), Sink position \(\:{P}_{\text{sink}}\), Threshold distance \(\:{D}_{\text{thresh}}\), Node energy \(\:{E}_{i}\).

Once the internal state \(x_{i}(t)\) of node \(i\) is acquired for Eqs. 16–18, certain actions \(\alpha\) are performed to adapt the topology, as shown in the equation below:

$$\gamma = \gamma^{i}\left( \theta\left( \alpha^{i} \right) \right) \in \left\{ \begin{array}{c} P_{\text{sink}} \\ P_{i} \\ D_{\text{thresh}} \\ E_{i} \end{array} \right.$$
(21)

Communication Check:

If the distance between the node (\(P_{i}\)) and the sink node (\(P_{\text{sink}}\)) exceeds the defined threshold (\(D_{\text{thresh}}\)), the multi-hop topology is adopted; otherwise, the direct topology is used.

Definition 3: optimal solution

The desired state is a topology, either direct or multi-hop, adapted to enhance network reliability.

Multi-Hop Mode:

  • For each neighboring node (\(N_{j}\)), if the neighbor's energy (\(E_{j}\)) is above a defined threshold (\(E_{\text{thresh}}\)), transmit data to that neighboring node (\(N_{j}\)).

Direct Mode:

  • If the distance is within the threshold, transmit data directly to the sink node.

  • \(M,s,n \vDash F\left(\exists\, b_{i}\left(u_{i}\right)\right)\), where the internal state of node \(n_{i}\) satisfies \(x_{i} \in u_{i}\), that is, it belongs to the external context of the environment. In the model \(M\), at state \(s\) with \(n = n_{1}, n_{2}, \dots\), there exists a future state in which \(u_{i}\) holds the rules correctly. Therefore, in the very next state \(s_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) related to Eqs. 16–18.

Algorithm 3: Energy-Efficient Multi-Hop Communication

This algorithm allows nodes to intelligently choose between direct and multi-hop communication modes based on their distance from the sink node. In multi-hop mode, nodes consider the energy levels of neighboring nodes to optimize data transfer.
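The mode selection just described can be sketched as follows. The sketch assumes that each node knows the position and residual energy of its neighbors, and it adds one tie-breaking rule that the text does not specify: among eligible neighbors, prefer the one closest to the sink. The function names and the `neighbors` mapping are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two 3D points, as in the distance function D(i, j)."""
    return math.dist(p, q)


def choose_next_hop(node_pos, sink_pos, d_thresh, neighbors, e_thresh):
    """Return ("direct", "sink"), ("multi-hop", neighbor_id), or None if no relay qualifies.

    neighbors maps a neighbor id to a (position, residual_energy) pair.
    """
    if distance(node_pos, sink_pos) <= d_thresh:
        return ("direct", "sink")
    # Keep only neighbors whose residual energy is above the threshold.
    eligible = {nid: pos for nid, (pos, energy) in neighbors.items() if energy > e_thresh}
    if not eligible:
        return None
    # Tie-break assumption: prefer the eligible neighbor closest to the sink.
    best = min(eligible, key=lambda nid: distance(eligible[nid], sink_pos))
    return ("multi-hop", best)
```

Returning `None` when no neighbor has enough energy leaves the retry or buffering policy to the caller, since the text does not say what a node should do in that case.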

Network model

In the network model shown in Fig. 4, we address pivotal challenges through the integration of an intelligent node routing mechanism. This mechanism is designed not only to handle security parameters but also to detect and counter attacks occurring at the physical layer, ensuring the integrity of the communication. Additionally, the model employs registration mechanisms to prevent unauthorized access, extends network operation time through power adjustment, and prioritizes information flow for critical applications. The intelligent routing protocol we design has three layers of decision mechanism: first, situation acquisition; second, an intelligent reasoning mechanism in which rules are deployed to take certain actions; and third, the desired state in which optimal solutions are achieved.

Fig. 4. Autonomous agents network model.

The intelligent node routing mechanism in a network model is characterized by a set of dynamic equations governing the decision-making process:

$$\:{\text{Decision}}_{i}\left(t\right)=Intelligence({N}_{i},t)$$
(22)

where \(\:{\text{Decision}}_{i}\left(t\right)\) represents the decision made by node \(\:i\) at time \(\:t\) based on its intelligence, encapsulating security, attack detection, and communication prioritization.

Node registration is modeled as a registration function:

$$\:{\text{Registration}}_{i}=Register\left({N}_{i}\right)$$
(23)

This ensures that each node \(i\) is registered before initiating data transmission. Access control is implemented through an access function:

$$\:{\text{Access}}_{i}=Authorize\left({N}_{i}\right)$$
(24)

where \(\:{\text{Access}}_{i}\) indicates the authorization status of node \(\:i\). Unauthorized nodes are denied access, creating a secure network environment. The blockchain layer uses a lightweight consensus mechanism based on Proof of Authority (PoA), where only authorized nodes participate in validation. This reduces computational overhead while ensuring data integrity.
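To illustrate why Proof of Authority is lightweight, the sketch below accepts a block only when it is proposed by an authorized validator on its turn, with no mining work involved. The round-robin turn order, the block fields `height` and `proposer`, and both function names are assumptions of this example; the text states only that authorized nodes participate in validation. Signature checking on the block is omitted for brevity.

```python
def expected_validator(validators, height):
    """Round-robin turn order (an assumption; the source does not specify scheduling)."""
    return validators[height % len(validators)]


def accept_block(block, validators):
    """Accept a block only if an authorized validator proposed it on its turn."""
    proposer = block["proposer"]
    return (proposer in validators
            and proposer == expected_validator(validators, block["height"]))


# Hypothetical validator set made of registered, authorized nodes.
validators = ["n1", "n2", "n3"]
ok = accept_block({"height": 4, "proposer": "n2"}, validators)
```

The check costs a list lookup and a modulo, which is the kind of overhead a battery-powered node can afford, in contrast to proof-of-work schemes.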

Attack detection and response

Definition 1: acquisition of context

In the case of attack detection, an intelligent node is formulated as a continuous monitoring process at the physical layer:

$$KB = \beta^{i} \in \left\{ \mathrm{Attack}_{i}(t) = Monitor(N_{i}, t) \right.$$
(25)

where \(\:{\text{Attack}}_{i}\left(t\right)\) is a binary variable indicating the occurrence of an attack at node i at time t. The response to attacks involves an adaptive countermeasure:

$$KB = \beta^{i} \in \left\{ \mathrm{Response}_{i}(t) = Adapt(N_{i}, t) \right.$$
(26)

with \(\:{\text{Response}}_{i}\left(t\right)\) representing the countermeasure initiated by node \(\:i\) in response to an attack.

Definition 2: interpret the data

Algorithm 4 is designed to enhance security at the physical layer by dynamically responding to various types of attacks. During initialization, the system deploys dynamic security measures to adapt to potential threats.

In the monitoring and response phase, the intelligent routing algorithm employs specific actions for different attack scenarios:

Jamming attack

In the case of a jamming attack, the algorithm activates adaptive frequency hopping (\(\:AFH\)) to mitigate the impact. An alert (\(\:A\)) is generated, indicating the detection of a jamming attack and the activation of adaptive frequency hopping.

Once the external state \(u_{i}(t)\) of the network \(N\) is acquired, certain actions \(\alpha\) are performed, and the alert is then broadcast (\(B\)) to inform the entire network.

$$\gamma = \gamma^{i}\left( \theta\left( \alpha^{i} \right) \right) \in \left\{ \begin{array}{c} AFH \\ B \end{array} \right.$$
(27)

MITM attack

For a Man-in-the-Middle (MITM) attack, the algorithm establishes secure channels (\(\:SC\)) using public-key cryptography. An alert (\(\:A\)) is generated, signaling the detection of a MITM attack and the establishment of secure channels. The alert is broadcast (\(\:B\)) to notify the network.

Once the external state \(u_{i}(t)\) of the network \(N\) is acquired, certain actions \(\alpha\) are performed, and the alert is then broadcast (\(B\)) to inform the entire network, as shown in the equation below:

$$\gamma ={\gamma ^i}\left( {\theta \left( {{\alpha ^i}} \right)} \right)\, \in \,\left\{ {\begin{array}{*{20}{c}} {SC} \\ B \\ A \end{array}} \right.$$
(28)

Eavesdropping attack

In the event of an eavesdropping attack, the algorithm applies artificial noise injection (\(\:ANI\)) to obfuscate the eavesdropped data. An alert (\(\:A\)) is generated, indicating the detection of an eavesdropping attack and the application of artificial noise injection.

Once the external state \(u_{i}(t)\) of the network \(N\) is acquired, certain actions \(\alpha\) are performed, and the alert is then broadcast (\(B\)) to inform the entire network.

$$\gamma ={\gamma ^i}\left( {\theta \left( {{\alpha ^i}} \right)} \right)\, \in \,\left\{ {\begin{array}{*{20}{c}} {ANI} \\ B \\ A \end{array}} \right.$$
(29)
Algorithm 4: Advanced Physical Layer Security and Attack Detection

Relay attack

For relay attacks, the algorithm validates signal timing (\(\:VST\)) to detect and mitigate the attack. An alert (\(\:A\)) is generated, signaling the detection of a relay attack and the initiation of signal timing validation. The alert is broadcast (\(\:B\)) to inform the entire network.

After acquiring the external state \({u}_{i}\left(t\right)\) of the network \(N\), the node performs the corresponding actions \(\alpha\), and the alert is then broadcast (\(B\)) to inform the entire network.

$$\gamma ={\gamma ^i}\left( {\theta \left( {{\alpha ^i}} \right)} \right)\, \in \,\left\{ {\begin{array}{*{20}{c}} {VST} \\ B \\ A \end{array}} \right.$$
(30)

Sybil attack

Sybil attacks are handled by verifying node identity and alerting the network. When a Sybil attack is suspected, the system calls VerifyIdentity() to check whether the node is legitimate and stores the result in VI (Verification Information). If VI confirms a fake identity, an alert is generated (\(A\leftarrow Alert(Sybil,VI)\)) and broadcast (\(B\leftarrow Broadcast\left(A\right)\)) to all network nodes. VI relies on cryptographic verification to detect identity spoofing accurately. This approach enhances security by enabling real-time attack detection with minimal false positives, so that only legitimate nodes participate in the network.
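The attack-specific responses described above share one pattern: apply a countermeasure, generate an alert \(A\), and broadcast it (\(B\)). The following is a minimal sketch of that pattern, not the authors' Algorithm 4. The labels SC, ANI, VST and VI come from the text; the lookup table, the `Response` type and the `respond` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Countermeasure labels are taken from the text; the table itself is an
# illustrative assumption, not the authors' implementation.
COUNTERMEASURES = {
    "mitm": "SC",            # secure channels via public-key cryptography
    "eavesdropping": "ANI",  # artificial noise injection
    "relay": "VST",          # signal timing validation
    "sybil": "VI",           # identity verification result
}


@dataclass(frozen=True)
class Response:
    countermeasure: str  # action applied by the detecting node
    alert: str           # A: alert naming the attack and the countermeasure
    broadcast: bool      # B: whether the alert is sent to the whole network


def respond(attack: str) -> Response:
    """Map a detected attack type to (countermeasure, A, B)."""
    if attack not in COUNTERMEASURES:
        raise ValueError(f"unknown attack type: {attack}")
    action = COUNTERMEASURES[attack]
    alert = f"Alert({attack}, {action})"
    return Response(countermeasure=action, alert=alert, broadcast=True)
```

A table-driven dispatch keeps the per-attack logic in one place, so adding a new attack type only requires a new entry.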

Definition 3: optimal solution

This intelligent routing algorithm provides a dynamic and adaptive response to attacks, ensuring the network is promptly informed and countermeasures are applied accordingly.

  • \(M,s,n\vDash F(\exists {b}_{i}(\text{Attackdetected}))\), where the internal state \(x_{i} \in u_{i}\) of node \({n}_{i}\) belongs to the external context of the environment. In the model \(M\), at state \(s\), where \(n={n}_{1},{n}_{2},\dots\), \(\text{Attackdetected}\) holds according to the rules in some future state. Therefore, in the very next state \({s}_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) related to Eqs. 22–25.

Power adjustment for network operation time enhancement

Definition 1: acquisition of context

The power adjustment mechanism ensures efficient network operation and enhanced lifetime:

$$\:\beta \:^{i} \in {\text{Power}}_{i} \left( t \right) = AdjustPower(N_{i} ,t)$$
(31)

where \(\:{\text{Power}}_{i}\left(t\right)\) represents the power level of node \(\:i\) at time \(\:t\). The adjustment is based on communication proximity, preventing network saturation and optimizing overall network lifetime.

Definition 2: interpret the data

Algorithm 5 is designed to optimize data transmission in a network by adapting the transmit and receive power of nodes based on the distance between them.

Additionally, it considers the states of links, including idle, running, and busy. The key components on which the actions are based are defined below:

- Node position \(\:{P}_{i}\), Threshold distance \(\:{D}_{\text{thresh}}\), Transmit power \(\:TxPowe{r}_{i}\), Receive power \(\:RxPowe{r}_{i}\), Link states \(\:LinkState{s}_{i}\).

Likewise, interpreting the data using the acquired information triggers certain actions to achieve the optimal outcome. In Eq. 32, once \({x}_{i}\left(t\right)\) holds for node \(i\), the node has rules or intentions \(\gamma\) to perform (\(\theta\)) certain actions \(\alpha\).

$$\gamma ={\gamma ^i}\left( {\theta \left( {{\alpha ^i}} \right)} \right)\, \in \,\left\{ {\begin{array}{*{20}{c}} {{P_i}} \\ {{D_{thresh}}} \\ {TxPowe{r_i}} \\ {RxPowe{r_i}} \\ {LinkState{s_i} \in \left\{ {\begin{array}{*{20}{c}} {Idle} \\ {Busy} \\ {Running} \end{array}} \right.} \end{array}} \right.$$
(32)

Definition 3: optimal solution

The desired state is an adjusted power level and the selection of a high-quality link for efficient data transmission.

Data transmission:

For each neighboring node \({N}_{j}\):

- If the distance between the nodes (\(\text{Distance}({P}_{i},{P}_{j})\)) is less than the threshold distance (\({D}_{\text{thresh}}\)), adjust transmit (\(TxPowe{r}_{i}\)) and receive (\(RxPowe{r}_{i}\)) power based on the distance to optimize energy consumption and avoid network overheating.

- If the link state (\(LinkState{s}_{i}\left[{N}_{j}\right]\)) is idle or running, select \({N}_{j}\) as the target node for data transmission and transmit data through the selected link.

- If the link state is busy, skip data transmission for now to prevent interference and conserve resources.

  • \(M,s,n\vDash F(\exists {b}_{i}({u}_{i}))\), where the internal state \(x_{i} \in u_{i}\) of node \({n}_{i}\) belongs to the external context of the environment. In the model \(M\), at state \(s\), where \(n={n}_{1},{n}_{2},\dots\), in some future state the distance \(\text{Distance}({P}_{i},{P}_{j})\) is less than the threshold \({D}_{\text{thresh}}\) set by the predefined rule. Therefore, in the very next state \({s}_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) regarding \(TxPowe{r}_{i}\) and \(RxPowe{r}_{i}\).

  • \(M,s,n\vDash F(\exists {b}_{i}({u}_{i}))\), where the internal state \(x_{i} \in u_{i}\) of node \({n}_{i}\) belongs to the external context of the environment. In the model \(M\), at state \(s\), where \(n={n}_{1},{n}_{2},\dots\), in some future state \(LinkState{s}_{i}\left[{N}_{j}\right]\) satisfies the predefined rules. Therefore, in the very next state \({s}_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) regarding link selection. However, if \(LinkState{s}_{i}\left[{N}_{j}\right]\) does not satisfy the predefined rules, the transmission is skipped for now.
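The data transmission procedure above can be sketched as follows. This is an illustrative sketch, not the authors' Algorithm 5. It assumes that only neighbors within \({D}_{\text{thresh}}\) are considered for transmission and that transmit power scales linearly with distance; both the linear rule (`scaled_power`) and the function and parameter names are hypothetical.

```python
import math

IDLE, RUNNING, BUSY = "idle", "running", "busy"


def scaled_power(d, d_thresh, max_power):
    """Hypothetical rule: power grows linearly with distance, capped at max_power."""
    return max_power * min(d / d_thresh, 1.0)


def select_links(p_i, neighbors, link_states, d_thresh, max_power):
    """Return (neighbor_id, tx_power) pairs chosen for transmission.

    neighbors:   dict of neighbor id -> 3D position
    link_states: dict of neighbor id -> IDLE / RUNNING / BUSY
    """
    selected = []
    for n_j, p_j in neighbors.items():
        d = math.dist(p_i, p_j)  # Euclidean distance between positions
        if d >= d_thresh:
            continue  # assumption: out-of-range neighbors are not considered
        tx_power = scaled_power(d, d_thresh, max_power)
        if link_states[n_j] in (IDLE, RUNNING):
            selected.append((n_j, tx_power))
        # BUSY links are skipped for now to avoid interference
    return selected
```

Any monotonic power rule could replace the linear one; the point is that nearby neighbors are reached with less transmit power, and busy links are deferred rather than contended for.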


Algorithm 5: Adaptive Power and Link Selection with Link States

Prioritized information flow for critical applications

Definition 1: acquisition of context

Information prioritization is governed by a set of parameters for critical applications, as shown in the equation below:

$$\beta^{i} \in {\text{Priority}}_{i}\left(t\right) = Prioritize(N_{i},t)$$
(33)

where \({\text{Priority}}_{i}\left(t\right)\) indicates the priority level assigned to information at node \(i\) at time \(t\). This prioritization ensures that data related to early warning systems, surveillance information, object tracking, etc., are expedited to the destination node for timely action.

Definition 2: interpret the data

Algorithm 6 is designed to handle various scenarios in an underwater sensor network, including unauthorized activities, natural disasters, and node status issues. The algorithm relies on mathematical formulations as given below to determine when to generate alerts and forward relevant data to a central sink node.

$$\gamma ={\gamma ^i}\left( {\theta \left( {{\alpha ^i}} \right)} \right)\, \in \,\left\{ {\begin{array}{*{20}{c}} {{D_{received}}\left\{ {\begin{array}{*{20}{c}} {Missile\,Launch} \\ {Oil} \\ {Object\,tracking} \\ {Earthquake} \end{array}} \right.} \\ {{S_i}\,\,node\,status \in \,\left\{ {Node\,\,attacks} \right.} \end{array}} \right.$$
(34)

Alert Generation:

  • If unauthorized activity is detected in \(\:{D}_{\text{received}}\) through \(\:\text{IsUnauthorizedActivity}\text{(}{D}_{\text{received}})\):

$$\:{A}_{i}\leftarrow\:GenerateUnauthorizedActivityAlert\left({D}_{\text{received}}\right)$$
(35)
  • Similarly, the algorithm checks for other specific events such as underwater earthquakes, oil leaks, missile launches, unauthorized object tracking, and imminent node failure. Each event is detected using corresponding functions (IsUnderwaterEarthquake, IsOilLeak, IsMissileLaunch, IsUnauthorizedObjectTracking, \(\:\text{IsImminentFailure}\)), and alerts are generated accordingly.

  • The generated alerts are then sent to the sink node (\(\:{S}_{\text{sink}}\)) for further analysis and appropriate actions.
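The alert generation steps above can be sketched as follows. This is a simplified illustration, not the authors' Algorithm 6. It assumes that the received data \({D}_{\text{received}}\) already carries one boolean flag per event, standing in for the detector functions named in the text, and that the sink is modelled as a plain list. The names `EVENT_FLAGS`, `generate_alerts` and `forward_to_sink` are hypothetical.

```python
# Event names mirror the detector functions mentioned in the text
# (IsOilLeak, IsMissileLaunch, ...). Representing detection results as
# boolean flags in the received data is an assumption made for brevity.
EVENT_FLAGS = (
    "UnauthorizedActivity",
    "UnderwaterEarthquake",
    "OilLeak",
    "MissileLaunch",
    "UnauthorizedObjectTracking",
)


def generate_alerts(node_id, d_received, imminent_failure=False):
    """Build one alert per event flagged in d_received, plus node status."""
    alerts = [
        {"node": node_id, "event": event}
        for event in EVENT_FLAGS
        if d_received.get(event, False)
    ]
    if imminent_failure:
        alerts.append({"node": node_id, "event": "ImminentFailure"})
    return alerts


def forward_to_sink(alerts, sink_queue):
    """Append the alerts to the sink's queue and return how many were sent."""
    sink_queue.extend(alerts)
    return len(alerts)
```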

Definition 3: optimal solution

The desired state or goal in this case is to generate an alert for unauthorized activities, node status, or early warning systems.

This algorithm provides a systematic and formulaic approach to alert generation and data forwarding, enhancing the efficiency of an underwater sensor network. It ensures that alerts are generated based on mathematical conditions, allowing for precise detection of unauthorized activities, disasters, and node status issues.

  • \(M,s,n\vDash F(\exists {b}_{i}({u}_{i}))\), where the internal state \(x_{i} \in u_{i}\) of node \({n}_{i}\) belongs to the external context of the environment. In the model \(M\), at state \(s\), where \(n={n}_{1},{n}_{2},\dots\), in some future state \({D}_{\text{received}}\) and/or \({S}_{i}\) satisfy the predefined rules. Therefore, in the very next state \({s}_{i+1}\), node \(i\) has an intention to perform a certain action \(\alpha\) regarding \({A}_{i}\).

The computational complexity of the proposed algorithms is analyzed using Big-O notation. The key operations include:

  • Routing Decision-Making: O(n), where n is the number of neighboring nodes.

  • Blockchain Verification: O(m), where m is the number of transactions in the blockchain.

  • Attack Detection and Response: O(k), where k is the number of monitored parameters.

The overall complexity is O(n + m + k), which is efficient for resource-constrained UWSNs.


Algorithm 6: Alert Generation for Unauthorized Activities, Disasters, and Node Status

Experimental setup and results

This section presents the system requirements and a simulation-based performance analysis of the UASN. The experiments run on Ubuntu 22.04.2 LTS with an Intel Core i7 CPU @ 2.90 GHz × 16. We use the NS-3 network simulator, version 3.37, to run the experimentation framework for the UASN. We run hundreds of simulations, generating thousands of data points.

This paper presents the implementation of the proposed situation-aware link scheduling energy optimizing routing protocol using the NS-3 simulation platform. The given framework facilitates experimentation in an underwater environment.

Table 1 Defined parameters to use in simulations.

Table 1 shows the parameters we use in the simulation. In order to transmit the data to the sink node, a network configuration consisting of 100 nodes is employed as shown in Fig. 5.

Fig. 5 Nodes in simulation setup.

Theoretical analysis

In the realm of three-dimensional underwater wireless sensor networks (3D UWSNs), addressing the persistent challenge of extending operational network duration and minimizing data transmission time has prompted the development of innovative routing protocols. The Geographic and Opportunistic Routing Protocol (GCORP) stands out as one such solution, integrating principles from geographic and opportunistic routing to enhance various network metrics, including energy consumption, end-to-end delay, and overall network lifetime. Researchers have also introduced the Power-Efficient Routing (PER) protocol20. This protocol operates in two key phases: the selection of forwarding nodes based on factors such as distance, angle between neighboring nodes, and current residual energy, followed by a trimming mechanism applied to the forwarding tree to reduce surplus energy consumption and prevent excess packet forwarding. Notably, PER optimizes resource usage by eliminating the need to gather data from all neighboring nodes for selecting forwarding nodes, thereby minimizing additional memory usage and communication overhead. Meanwhile, MARL-MC and MLAR are geared towards finding optimal routes in underwater communication network systems, contributing to the ongoing efforts to enhance the efficiency and performance of 3D UWSNs21,22.

The conventional equations used to evaluate energy, E2E delay, and network lifetime for an efficient underwater network are discussed as follows20,21,23,24:

Energy consumption

After running several rounds, we obtained the energy consumption performance of CI, GCORP, PER, MARL-MC, and MLAR. Energy consumption is usually computed as follows:

  • Initialization:

$$\:{E}_{\text{residual}}^{\left(0\right)}={E}_{\text{initial}}$$
(36)

Here, \(\:{E}_{\text{residual}}^{\left(0\right)}\) represents the initial energy of the node.

  • Iterative update after each transmission:

$$\:{E}_{\text{residual}}^{(t+1)}={E}_{\text{residual}}^{\left(t\right)}-{E}_{\text{consumption}}^{\left(t\right)}$$
(37)

After each data transmission, the remaining energy is updated by subtracting the energy consumed during that transmission, \(\:{E}_{\text{consumption}}^{\left(t\right)}\). This approach ensures that the residual energy is updated iteratively, providing a more accurate representation of the energy available at each step.
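The iterative update in Eqs. 36 and 37 can be traced with a few lines of code. This is a minimal sketch; the function name and the per-transmission consumption values in the usage note are illustrative assumptions, not simulation outputs.

```python
def residual_energy(e_initial, consumption_per_tx):
    """Return the residual energy after each transmission (Eqs. 36-37).

    e_initial:          initial energy E_initial of the node, in joules
    consumption_per_tx: energy consumed by each successive transmission
    """
    e_res = e_initial          # Eq. 36: E_residual^(0) = E_initial
    trace = [e_res]
    for e_cons in consumption_per_tx:
        e_res = e_res - e_cons  # Eq. 37: subtract the energy just consumed
        trace.append(e_res)
    return trace
```

For example, `residual_energy(1000.0, [120.0, 120.0, 100.0])` yields the sequence 1000, 880, 760, 660 J for a hypothetical node.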

In Fig. 6, the energy consumption of PER is above 210 J per packet transmission, close to that of GCORP, which stays between 200 and 210 J. We also observe the energy consumption of MARL-MC and MLAR. Both systems are based on reinforcement learning and are used to find optimal routing paths; their energy consumption lies between 150 and 180 J. The proposed protocol outperforms each of the techniques mentioned above: its energy consumption per transmission is less than 120 J, as shown in Fig. 6.

Fig. 6 Theoretical analysis vs. proposed approach for energy.

Latency performance

In addition, we conduct a comparison of latency performances of GCORP, PER, MARL-MC, MLAR, and CI as shown in Fig. 7a.

$$\:{D}_{\text{end-to-end}}={D}_{\text{prop}}+{D}_{\text{trans}}+{D}_{\text{proc}}+{D}_{\text{queue}}$$
(38)

\(\:{D}_{\text{end-to-end}}\) is the total end-to-end delay, encompassing propagation delay, transmission delay, processing delay at intermediate nodes, and queuing delay. It quantifies the total time taken for a packet to traverse the multi-hop path from source to destination.
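As a worked illustration of Eq. 38, the sketch below computes the delay of each hop and sums over a multi-hop path. It assumes a nominal sound speed of 1500 m/s in seawater and models propagation delay as distance divided by sound speed and transmission delay as packet size divided by bit rate. The function names and all parameter values are hypothetical, and summing per-hop delays is one interpretation of how Eq. 38 applies to a multi-hop path.

```python
SOUND_SPEED_M_S = 1500.0  # nominal speed of sound in seawater (assumption)


def hop_delay(distance_m, packet_bits, bitrate_bps, d_proc=0.0, d_queue=0.0):
    """Delay of one hop: D_prop + D_trans + D_proc + D_queue (Eq. 38)."""
    d_prop = distance_m / SOUND_SPEED_M_S  # acoustic propagation delay
    d_trans = packet_bits / bitrate_bps    # time to push the packet onto the link
    return d_prop + d_trans + d_proc + d_queue


def end_to_end_delay(hops):
    """Sum the per-hop delays along a path; each hop is a dict of hop_delay arguments."""
    return sum(hop_delay(**hop) for hop in hops)
```

With these assumptions, a single 1500 m hop carrying an 8000-bit packet at 8000 bit/s takes 1 s of propagation plus 1 s of transmission. This shows why propagation delay dominates in acoustic links compared with radio links.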

The latency of MARL-MC and GCORP is 7 to 8 s per transmission, whereas the latency of PER is 6 s per transmission. The latencies of MLAR and CI are close to each other: MLAR lies between 2 and 3 s, and CI lies under 2 s.

However, the single-hop and multi-hop variants of the CI method show different results. With single hop, the CI latency rises up to 4 s as the distance increases, whereas with multi-hop it rises only up to 2 s. The multi-hop technique therefore appears promising for CI with regard to latency and energy consumption, as shown in Fig. 7b.

PDR ratio and network lifetime

The network lifetime is the period of time until the first node of the network exhausts all of its energy reserves. The average network lifetime is computed as:

$$NL{T}_{Avg.}=\frac{\sum_{n=1}^{K}(F{T}_{n}-S{T}_{n})}{K}$$
(39)

where:

  • \(K\) is the total number of nodes in the network.

  • \(S{T}_{n}\) represents the starting time of node \(n\).

  • \(F{T}_{n}\) represents the failure time of node \(n\), i.e., when its energy is fully depleted.

Fig. 7 Theoretical analysis vs. proposed approach for latency. (a) Latency of Proposed with theoretical analysis. (b) Delay in single hop vs. multi hop.

The term \((F{T}_{n}-S{T}_{n})\) denotes the operational duration of each node before it runs out of energy. By summing these values for all nodes and dividing by the total number of nodes (\(K\)), we obtain the average network lifetime.
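The averaging in Eq. 39 can be sketched as follows, taking each node's operational duration as the time between its start and its failure. The function name and the sample times in the usage note are illustrative assumptions.

```python
def average_network_lifetime(start_times, failure_times):
    """Average operational duration over all K nodes (Eq. 39).

    start_times[n] and failure_times[n] are ST_n and FT_n in seconds.
    """
    if not start_times or len(start_times) != len(failure_times):
        raise ValueError("need one start and one failure time per node")
    # Operational duration of each node: failure time minus start time.
    durations = [ft - st for st, ft in zip(start_times, failure_times)]
    return sum(durations) / len(durations)
```

For three hypothetical nodes starting at 0, 0 and 10 s and failing at 1800, 1900 and 1860 s, the durations are 1800, 1900 and 1850 s, giving an average lifetime of 1850 s.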

Likewise, we compare the network lifetime and PDR of the proposed methodology with those of the other methods, including GCORP, PER, MARL-MC, and MLAR. The PDR of MARL-MC and MLAR lies between 86 and 90%, with MLAR performing better at 90%. The PDR of GCORP lies between 87 and 88%, whereas that of PER lies between 80 and 82%. CI performs well, with a PDR above 90%, as shown in Fig. 8a. In terms of network lifetime, GCORP and PER stand between 1400 and 1550 s, and MARL-MC and MLAR last between 1670 and 1725 s before the network is depleted. The average lifetime of the proposed CI network is above 1850 s, and it outperforms every other method, as shown in Fig. 8b.

Attacks detection

The Jamming Attack demonstrates the highest detection accuracy at 99.2%, indicating that the system is highly effective in identifying and mitigating this type of attack. Furthermore, it has the lowest false positive rate (0.9%), meaning that the likelihood of mistakenly classifying normal network behavior as an attack is minimal. Another advantage of the detection system for this attack is its fast response time of 100 milliseconds, making it the most efficiently handled threat among those listed.

The Sybil Attack is also well-detected, with a high detection accuracy of 98.5% and a relatively low false positive rate of 1.2%. This suggests that the system is proficient at distinguishing Sybil Attacks from normal traffic. However, its response time is slightly higher at 120 milliseconds, meaning that while it is effectively identified, it takes longer to respond compared to Jamming Attacks.

Fig. 8 Theoretical analysis vs. proposed approach for PDR and Network Lifetime. (a) PDR of Proposed vs. theoretical (b) NL of Proposed vs. theoretical analysis.

Fig. 9 Attacks detection.

The Replay Attack shows good detection performance, achieving a detection accuracy of 97.8%, which is slightly lower than that of the Sybil and Jamming Attacks but still highly effective. The false positive rate for this attack is 1.5%, which is moderate but still within an acceptable range. The response time for the Replay Attack is 110 milliseconds, indicating that while the system can detect it quickly, it is not as fast as the Jamming Attack detection.

Finally, the Man-in-the-Middle (MITM) Attack presents the greatest challenge for detection. It has the lowest detection accuracy at 96.7%, suggesting that the system struggles more with identifying this type of attack. Additionally, it has the highest false positive rate of 1.8%, meaning there is a greater chance of normal network activity being misclassified as an attack. The response time for MITM attack detection is the slowest at 130 milliseconds, making it the most time-consuming threat to detect and mitigate. This highlights the need for further improvements in detecting MITM attacks to enhance overall security. The detection results for all four attacks are shown in Fig. 9.

CI sensor node actions

The cognitive intelligent CI node agent performs several intentions \(\:g=\{{\alpha\:}_{1},{\alpha\:}_{2},\dots\:,{\alpha\:}_{n}\}\) in order to achieve its desire d, where \(\:{g}_{i}\) is the set of actions performed by agent i in state s. The set of actions a node agent can perform are as follows:

  • \({g}_{i}(q({\alpha}_{i}))\): After matching the context to its beliefs, node agent i has an intention (related to the belief); therefore, the actions \(\alpha\) of agent i are performed (\(q\)) in order to achieve its desires.

  • \({b}_{i}\,\text{ask}(i,j,\varphi)\): Node agent i believes that it should ask agent j about the specific context \(\varphi\). Similarly, \({b}_{j}\,\text{Tell}(j,i,\varphi)\), where node agent j believes that it should tell agent i about the specific context.

  • \(\:{b}_{i}\neg\:\varphi\:\): Node agent i has a belief about the specific context that is not detected in the environment. Instead of being idle, agent i informs its corresponding agents about the current situation.

We utilize temporal logic to model the system because it effectively captures the potential behavior of the system. This approach not only verifies the system’s correctness but also monitors its execution. It aids in monitoring the progression from the current state to the next state, including specific actions. We apply these properties to our proposed system to guide the actions of node agents and to assess whether a BDI-based node agent can generate the correct output for all valid inputs.

Definition 1: acquisition of context

Whenever sensor nodes collect information, data acquisition is the primary objective. For instance, when node \(i\) wants to transmit data, it checks the predefined rules for \(dis\), \(d\), \(l\), \(En\) and \({L}_{A}\) in \({x}_{i}\left(t\right)\), where the rules are stored in the belief system \({\beta}^{i}\) of node \(i\) with respect to a context \({u}_{i}\left(t\right)\), as given in the equations below, which are modified into adaptive ones:

Fig. 10 Energy consumption using adaptive topology.

$$\:KB = \beta \:^{i} \in x_{i} \left( t \right)$$
(40)

where,

$$KB={\beta ^i}\, \in \,\left\{ {\begin{array}{*{20}{l}} {Nod{e_E}=\text{Node energy}} \\ {{M_{node}}=\text{Node mobility}} \\ {Nois{e_{channel}}=\text{Acoustic channel noise}} \\ {{I_{link}}=\text{Interference link}} \\ {di{s_{link}}=\text{Distance}} \\ {channe{l_{freq}}=\text{Channel frequency}} \\ {Nod{e_{Reg}}=\text{Node register}} \\ {tx=\text{Transmission power}} \\ {Topology=\text{Threshold distance}} \\ {H\left( n \right)=\text{Intermediate hops}} \\ {packe{t_{size}}=\text{Packet size}} \end{array}} \right.$$
(41)

Definition 2: interpret the data

Interpreting the data from \({x}_{i}\left(t\right)\) with regard to the external context \({u}_{i}\left(t\right)\) directly affects the latency rate and the other system parameters that secure the data and keep the network reliable, as shown in Figs. 10a–f.

$$\gamma=Nod{e}_{Reg}\propto\left\{\begin{array}{c}Hash\left({P}_{i}\right)\\ Sign({P}_{i},{k}_{i})\end{array}\right.$$
(42)

$$\gamma=Topology\propto\left\{\begin{array}{c}Direct\\ Multi\text{-}hop\end{array}\right.$$
(43)

$$\gamma=di{s}_{link}\propto\left\{\begin{array}{c}PowerAdjust\end{array}\right.$$
(44)

$$\gamma={I}_{link}\propto\left\{\begin{array}{c}IDLE\\ BUSY\\ RUNNING\end{array}\right.$$
(45)

$$\gamma=\left\{\begin{array}{c}{L}_{rate}\propto 1/Dat{a}_{transmission}\end{array}\right.$$
(46)
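The hashing and signing step of Eq. 42, \(Hash({P}_{i})\) and \(Sign({P}_{i},{k}_{i})\), can be illustrated with the Python standard library. This is only a sketch under stated assumptions: SHA-256 is chosen as the hash function, and an HMAC-SHA256 tag with a shared key stands in for the digital signature, since the standard library has no public-key signature primitive. The text does not specify either choice, and the function names are hypothetical.

```python
import hashlib
import hmac


def hash_packet(payload: bytes) -> str:
    """Hash(P_i): SHA-256 digest of the packet payload (assumed hash function)."""
    return hashlib.sha256(payload).hexdigest()


def sign_packet(payload: bytes, key: bytes) -> str:
    """Sign(P_i, k_i): HMAC-SHA256 tag, a stand-in for a digital signature."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_packet(payload: bytes, key: bytes, digest: str, tag: str) -> bool:
    """Recompute the digest and tag and compare them in constant time."""
    ok_hash = hmac.compare_digest(hash_packet(payload), digest)
    ok_tag = hmac.compare_digest(sign_packet(payload, key), tag)
    return ok_hash and ok_tag
```

A real deployment would replace the HMAC tag with an asymmetric signature scheme so that verifiers do not need the signer's secret key.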

Definition 3: optimal solution

Effectively addressing multi-objective non-linear optimization challenges hinges on efficient link scheduling management. Optimizing link scheduling can help to reduce energy consumption, enhance energy efficiency, minimize latency, and improve data transmission quality.

$$\gamma ={L_s}\propto \left\{ {\begin{array}{*{20}{c}} {{L_{rate}}\propto 1/Dat{a_{transmission}}} \\ {{P_{tx}}\propto 1/NLT} \\ {Nod{e_E}\propto Nod{e_{depl}}} \end{array}} \right.$$
(47)

Likewise, the data transmission affects the attacks on the system:

$$\gamma=Dat{a}_{transmission}\propto\left\{\begin{array}{c}Attacks\end{array}\right.$$
(48)

In cases where the distance between the source and destination falls below a certain threshold, the transmission power is decreased in order to efficiently transfer data to the destination node and optimize energy consumption. In such cases, not only does the sensor lifetime increase, but the network lifespan also improves.

$$\gamma=\left\{\begin{array}{c}PowerAdjust\propto NLT\end{array}\right.$$
(49)

For energy consumption, we must consider that \({E}_{cons}\left(Nod{e}_{i}\right)\) is governed by the equation below:

$$\:{E}_{res}\left({R}_{k}\right)={E}_{init}\left({R}_{k}\right)-{E}_{cons}\left({R}_{k}\right)$$
(50)

by which,

$${E_{cons}}\left( {nod{e_i}} \right)\propto \sum {\left\{ {\begin{array}{*{20}{l}} {Channe{l_{acoustic}}=\text{Acoustic channel noise}} \\ {di{s_{link}}=\text{Distance}} \\ {rat{e_{data}}=\text{Data rate}} \\ {tx=\text{Transmission power}} \\ {packe{t_{size}}=\text{Packet size}} \end{array}} \right.}$$
(51)

The E2E delay is affected by the transmission, reception, and propagation delays; in addition, \(di{s}_{link}\), \(rat{e}_{data}\), \(tx\), and \(packe{t}_{size}\) play a vital role in the average delay time. Therefore, we use situation awareness to adapt the tx power with respect to distance in order to reduce energy consumption, extend network lifetime, and improve E2E delay.

Discussion on result analysis

The findings from our analysis underscore the superior performance of the proposed system compared to established protocols such as GCORP, PER, MARL-MC, and MLAR. These improvements are particularly evident in terms of energy consumption, latency, PDR, and network lifetime, highlighting the robustness and efficiency of the proposed approach in underwater communication networks.

The energy consumption analysis demonstrates that the proposed system achieves a significant reduction in energy usage, maintaining consumption below 120 J per transmission. This is substantially lower than MARL-MC and MLAR, which consume between 150 and 180 J, and GCORP and PER, which exceed 200 J. The system’s reinforcement learning-based routing protocol optimizes energy utilization by dynamically adjusting to changing network conditions and identifying the most energy-efficient routes. This reduction in energy consumption not only extends the operational duration of individual nodes but also enhances the overall longevity of the network, making it particularly suitable for energy-constrained underwater environments.

Latency is another key metric on which the proposed system performs significantly better than existing methods. Utilizing a multi-hop communication strategy, it maintains end-to-end delays consistently below 2 s, far outperforming the delays observed in MARL-MC, GCORP, and PER, all of which are 6 s or more. The proposed system even outperforms MLAR, whose delay lies between 2 and 3 s. The decrease in latency is attributed to the queuing and processing mechanisms within the system, which are designed to reduce latency related to packet handling. Furthermore, the proposed system’s ability to balance low latency with reliability makes it a suitable solution for real-time underwater applications including environmental monitoring and disaster response.

The analysis of PDR further reinforces the advantages of the proposed methodology. With a PDR exceeding 90%, the system surpasses all benchmark protocols, including MLAR, which performs at 90%, and MARL-MC and GCORP, which fall below this threshold. The integration of the Value of Information (VoI) concept and robust routing algorithms ensures that high-priority packets are delivered reliably, even in challenging underwater environments. This high delivery ratio is critical for maintaining the integrity and reliability of underwater communication systems, where data loss can have significant consequences.

These results show the efficiency of the system in extending the network lifetime. The average network lifetime achieved is above 1850 s, outlasting the second best (MARL-MC) and third best (MLAR) at 1670–1725 s, and GCORP and PER at under 1550 s. This improvement stems from the energy awareness of the system, which minimizes wasted resources and improves resource utilization. The extended network lifetime supports the sustainability of the system, reducing the need for frequent maintenance or redeployment in underwater environments, which are often difficult to access. The statistical results are summarized in Table 2 below:

Table 2 Statistical analysis results.

The proposed algorithm is tested with 3–100 nodes to evaluate its performance in small to medium-sized networks. For larger-scale deployments, the algorithm is designed to scale efficiently by leveraging a multi-hop routing paradigm and distance-aware energy strategies. In larger networks, the multi-hop approach ensures that data can be relayed through intermediate nodes, reducing the energy burden on individual nodes. Additionally, the intelligent node agents dynamically adjust their power levels and routing decisions based on network density and energy availability, ensuring scalability.

The proposed system optimizes resource utilization through several key mechanisms. First, energy-aware routing ensures that nodes dynamically adjust their transmission power based on the distance to the destination and their current energy levels. This approach minimizes energy consumption by avoiding unnecessary high-power transmissions and ensures that nodes with sufficient energy are prioritized for routing tasks. Second, the system incorporates a lightweight blockchain framework, which employs a simplified consensus mechanism to reduce computational overhead. This is particularly important in resource-constrained underwater environments, where computational resources are limited. Third, proactive bandwidth management is implemented through intelligent agents that preemptively manage bandwidth allocation to avoid collisions and interference. These agents continuously monitor network conditions and adjust communication parameters in real-time, ensuring efficient use of available bandwidth. Together, these measures significantly enhance resource utilization, extending the network’s operational lifetime and reducing overall energy consumption.

The proposed work also relies on several threshold assumptions and methods to ensure optimal performance. One such threshold is the energy threshold (\({E}_{\text{thresh}}\)), which determines the minimum energy level required for a node to participate in routing decisions. Nodes with energy levels below this threshold are excluded from routing tasks to prevent premature energy depletion. Another critical threshold is the distance threshold (\({D}_{\text{thresh}}\)), which dictates whether direct or multi-hop communication is used. If the distance between two nodes exceeds this threshold, the system switches to a multi-hop routing strategy to conserve energy and maintain reliable communication. Additionally, security thresholds are employed for attack detection, such as monitoring signal strength deviations to identify jamming attacks. These thresholds are not static; they are dynamically adjusted based on real-time network conditions, ensuring that the system remains adaptive and responsive to changing environmental and operational demands. This dynamic adjustment of thresholds plays a vital role in maintaining the efficiency and reliability of the network under varying conditions.
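The energy and distance threshold rules described above can be sketched as simple predicates. This is a minimal illustration with fixed thresholds; the dynamic adjustment of thresholds is not modelled, and the function names and sample values are hypothetical.

```python
def eligible_for_routing(energy_j, e_thresh_j):
    """Nodes below the energy threshold are excluded from routing tasks."""
    return energy_j >= e_thresh_j


def routing_mode(distance_m, d_thresh_m):
    """Use multi-hop when the distance exceeds the threshold, else direct."""
    return "multi-hop" if distance_m > d_thresh_m else "direct"


def pick_relays(node_energy, e_thresh_j):
    """Return the ids of nodes that may act as relays, in sorted order.

    node_energy: dict of node id -> residual energy in joules
    """
    return sorted(
        node for node, energy in node_energy.items()
        if eligible_for_routing(energy, e_thresh_j)
    )
```

For instance, with a hypothetical energy threshold of 100 J, a node holding 80 J is left out of the relay set, while a 500 m link with a 300 m distance threshold is served by multi-hop routing.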

The system effectively addresses computational constraints by incorporating several optimization techniques tailored for resource-limited environments. One of the key strategies is the use of a lightweight blockchain, which employs a simplified consensus mechanism to significantly reduce computational overhead while maintaining security and data integrity. Additionally, the system integrates energy-aware routing, allowing nodes to dynamically adjust their power levels based on network conditions, thereby conserving energy and prolonging network lifespan. To further enhance efficiency, the protocol utilizes lightweight cryptographic operations, including optimized hashing techniques and digital signatures, which minimize computational requirements without compromising security. These combined approaches ensure that the system remains both secure and energy-efficient, making it highly suitable for constrained environments like UWSNs.

The system achieves an optimal trade-off by balancing energy efficiency, security, and real-time performance through a carefully designed framework. Adaptive power management plays a crucial role in prioritizing energy efficiency, allowing nodes to dynamically adjust their power levels based on network conditions to extend operational lifespan. At the same time, blockchain-based cryptographic mechanisms ensure robust security, safeguarding data integrity and communication against potential threats. Additionally, the system maintains real-time performance through intelligent routing and proactive bandwidth management, which optimize data transmission and reduce latency. By integrating these elements, the system effectively meets the demanding requirements of UWSNs, ensuring reliability and efficiency in dynamic and resource-constrained environments.

Agents are responsible for both routing and blockchain verification. They balance these tasks by prioritizing routing decisions and performing blockchain operations during idle periods. The packet verification process is optimized to minimize latency. Blockchain verification is performed in parallel with routing operations, and the system compensates for any delays by dynamically adjusting routing paths. The system employs a balanced approach, where energy efficiency and security are equally prioritized. For example, nodes dynamically adjust power levels while maintaining robust security protocols. The blockchain and routing protocol interact dynamically. If an attack is detected, the routing configuration is updated to isolate compromised nodes. The blockchain also influences routing decisions by providing real-time security updates. The reasoning rules are continuously updated using AI-driven learning mechanisms. Agents adapt their behavior based on real-time network conditions, ensuring optimal performance.

Conclusion and future work

This research introduced an intelligent routing strategy designed to address open issues in UWSNs and UANs. The proposed system integrates blockchain technology, an MAS, and acoustic communication to maximize data transfer, reduce energy consumption, and strengthen security. It addresses bandwidth restrictions, connection interference, security flaws, energy limits, and network lifetime. The routing protocols GCORP, PER, MARL-MC, and MLAR were evaluated against the proposed CI protocol, and the results show that CI performs well. In terms of energy efficiency, CI surpasses MARL-MC and MLAR, with usage consistently remaining below 120 J per transmission. In terms of latency, CI outperforms both MARL-MC and GCORP, averaging less than two seconds per transmission. CI performs well in both single-hop and multi-hop scenarios, with the latter greatly lowering latency. The evaluation also covers PDR and network longevity: with a PDR of more than 90% and a network lifespan of more than 1850 s, CI performs better than GCORP, PER, MARL-MC, and MLAR. These results highlight the usefulness of the CI protocol in improving overall network performance and energy economy in underwater communication systems.

The future scope of this research lies in aligning the proposed framework with emerging 6G paradigms, incorporating AI-driven autonomous communication, edge computing, and quantum-safe cryptography to enhance security and scalability. The system can be extended to integrate energy harvesting techniques, predictive energy management, and real-time analytics for adaptive and efficient underwater operations. Its cross-domain applicability includes hybrid networks combining underwater, aerial, and terrestrial systems for disaster management, climate monitoring, and marine biodiversity tracking. Advancements in MAS can enable complex decision-making and distributed intelligence, while collaboration with hardware developers can optimize acoustic modems and sensors for sustainable, long-term use. Additionally, addressing regulatory, ethical, and open data-sharing considerations will broaden its impact, paving the way for secure, energy-efficient, and intelligent underwater networks that contribute to scientific, environmental, and societal advancements.