Table 1 Algorithm 1. Reinforcement Learning Based Secure Routing Protocol (RLSRP).

From: An adaptive, energy-efficient and secure routing protocol for zone-related mobile Ad-hoc networks using reinforcement learning

Input: Nodes V, edges E, cluster heads CH, maximum hops \(k_{\text {max}}\), communication range R, transmission range TR, set of malicious nodes M, Q-table Q(s, a), exploration rate \(\epsilon\), weights \(w_1, w_2, w_3\).

Output: Secure routing path \(P_{\text {secure}}\)

Phase 1: Adaptive k-Hop Zone Formation

1. For each node \(v_i \in V\), compute the adaptive k-hop cluster using \(k_{\text {max}}\) and range R.

2. Assign nodes to clusters \(C_k^j = \{ v_i \mid d(v_i, CH_j) \le k(v_i), v_i \in Z_j \}\).

3. For each cluster \(C_k^j\), select a cluster head using:

\(CH_j = \arg \max _{v_i \in C_k^j} \left( \frac{E(v_i)}{d(v_i, \text {centroid})} \right)\)
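The cluster-head rule in step 3 can be illustrated with a minimal Python sketch. It assumes nodes have 2-D coordinates, that the centroid is the arithmetic mean of the member positions, and that a small constant guards against division by zero. None of these details are stated in the algorithm, and the names `select_cluster_head`, `energy` and `position` are hypothetical.

```python
import math

def select_cluster_head(cluster, energy, position):
    """Return the member maximising E(v) / d(v, centroid).

    cluster  -- list of node ids in one cluster
    energy   -- dict: node id -> residual energy E(v)
    position -- dict: node id -> (x, y) coordinates (assumed 2-D)
    """
    # Assumption: the centroid is the mean of the member coordinates.
    cx = sum(position[v][0] for v in cluster) / len(cluster)
    cy = sum(position[v][1] for v in cluster) / len(cluster)
    eps = 1e-9  # assumption: avoids dividing by zero for a node at the centroid

    def ratio(v):
        d = math.hypot(position[v][0] - cx, position[v][1] - cy)
        return energy[v] / (d + eps)

    return max(cluster, key=ratio)
```

With this ratio, a node that is both energy-rich and close to the centroid is preferred, which is the trade-off the argmax expresses.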

Phase 2: Q-Value Update using DQN

1. Initialise the Q-table Q(s, a) randomly.

2. For each episode and for each node \(v_i\):

\(\bullet\) Observe current state \(s_t\).

\(\bullet\) Select action \(a_t\) using \(\epsilon\)-greedy strategy.

\(\bullet\) Execute \(a_t\) and observe reward \(r_t\) and next state \(s_{t+1}\).

\(\bullet\) Update Q-values using:

\(Q(s_t, a_t) \leftarrow (1 - \alpha ) Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max _{a'} Q(s_{t+1}, a') \right]\)

\(\bullet\) If \(\epsilon> \epsilon _{\text {min}}\), decrease \(\epsilon\).
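The loop body of Phase 2 can be sketched as tabular Q-learning in Python. This is an illustration under stated assumptions, not the authors' implementation: the Q-table is a `defaultdict(float)` (so unseen entries start at 0 rather than at random values), \(\epsilon\) decays multiplicatively, and the default values of `alpha`, `gamma`, `epsilon_min` and `rate` are placeholders. The algorithm itself only says to "decrease \(\epsilon\)".

```python
import random
from collections import defaultdict

def epsilon_greedy(Q, state, actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_update(Q, s, a, r, s_next, next_actions, alpha=0.1, gamma=0.9):
    """Apply Q(s,a) <- (1-alpha) Q(s,a) + alpha [r + gamma max_a' Q(s',a')]."""
    # default=0.0 treats a state with no available actions as terminal.
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * (r + gamma * best_next)
    return Q[(s, a)]

def decay_epsilon(epsilon, epsilon_min=0.05, rate=0.995):
    """Assumed schedule: multiplicative decay, floored at epsilon_min."""
    if epsilon > epsilon_min:
        epsilon = max(epsilon_min, epsilon * rate)
    return epsilon

# Usage: one update on an otherwise empty table.
Q = defaultdict(float)
q_update(Q, "s0", "next_hop_A", r=1.0, s_next="s1", next_actions=["next_hop_B"])
```

Keying the table by `(state, action)` tuples keeps the sketch independent of how states and actions are encoded in the protocol.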

Phase 3: Wormhole Attack Detection

1. For each node \(v_i\):

\(\bullet\) If latency exceeds mean plus one standard deviation, i.e., \(\text {Latency}(v_i)> \mu + \sigma\), mark \(v_i\) as suspicious.

\(\bullet\) If \(v_i \in M\), mark \(v_i\) as malicious.
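The two checks of Phase 3 can be sketched in Python as follows. The sketch assumes that \(\mu\) and \(\sigma\) are computed over the latencies of all observed nodes and that \(\sigma\) is the population standard deviation. The algorithm does not say which estimator is used, and the function and argument names are hypothetical.

```python
import statistics

def classify_nodes(latency, known_malicious):
    """Return (suspicious, malicious) sets of node ids.

    latency         -- dict: node id -> measured latency
    known_malicious -- the set M of nodes already known to be malicious
    """
    values = list(latency.values())
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumption: population std. deviation

    # A node is suspicious if Latency(v) > mu + sigma.
    suspicious = {v for v, lat in latency.items() if lat > mu + sigma}
    # A node is malicious if it belongs to M.
    malicious = {v for v in latency if v in known_malicious}
    return suspicious, malicious
```

The two sets are kept separate because the algorithm treats them as independent labels: a node can be flagged as suspicious without being in M, and vice versa.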

Phase 4: Energy-Efficient and Secure Route Selection

1. For each packet transmission:

\(\bullet\) For all possible paths \(P \in \mathscr {P}\), compute:

\(\text {Score}_P = \sum _{i=1}^n \left( w_1 Q(s_i, a_i) - w_2 D(i) + w_3 E(i) \right)\)

\(\bullet\) Select the secure path:

\(P_{\text {secure}} = \arg \max _{P \in \mathscr {P}} \text {Score}_P\)

\(\bullet\) If \(D(i) < \text {Threshold}\) for all i on the path, forward the packet through \(P_{\text {secure}}\).

\(\bullet\) Otherwise, discard \(P_{\text {secure}}\).
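Phase 4 can be sketched in Python as a score-then-check routine. The sketch assumes each path is a sequence of hop identifiers and that the per-hop terms \(Q(s_i, a_i)\), \(D(i)\) and \(E(i)\) are available as lookup tables keyed by hop. This data layout and the function names are hypothetical, and \(D(i)\) is treated only as the penalised per-hop quantity that appears in the score, whatever it measures.

```python
def path_score(path, Q, D, E, w1, w2, w3):
    """Score_P = sum over hops of (w1*Q - w2*D + w3*E)."""
    return sum(w1 * Q[hop] - w2 * D[hop] + w3 * E[hop] for hop in path)

def select_secure_path(paths, Q, D, E, weights, threshold):
    """Return the highest-scoring path if every hop satisfies D < threshold.

    Returns None when there are no candidate paths or when the best path
    fails the threshold check (the "discard" branch).
    """
    if not paths:
        return None
    w1, w2, w3 = weights
    best = max(paths, key=lambda p: path_score(p, Q, D, E, w1, w2, w3))
    if all(D[hop] < threshold for hop in best):
        return best
    return None
```

As written in the algorithm, the threshold test is applied only to the argmax path. A caller that wants a fallback could remove the discarded path from `paths` and call the function again, but that retry step is not part of the listed procedure.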