Table 2. Simulation parameters.

From: A cognitive internet of things resource allocation method based on multi-agent reinforcement learning algorithm

| Parameter | Value |
| --- | --- |
| Bandwidth \(B\) | 10 MHz |
| Number of primary users \(M\) | 6 |
| Number of vehicle user pairs \(N\) | 12 |
| Vehicle speed \(v\) | 50 km/h |
| Transmit power of primary user \(P_p\) | 2 W |
| Maximum transmit power of vehicle user pair transmitter \(P_{\text{max}}\) | 1 W |
| Time slot \(T\) | 100 ms |
| Sensing phase duration \(\tau_1\) | 20 ms |
| Decision phase duration \(\tau_2\) | 20 ms |
| Transmission phase duration \(\tau_3\) | 60 ms |
| Noise power \(\delta^2\) | \(1 \times 10^{-8}\) W |
| Initial learning rate \(\alpha\) | 0.001 |
| Target network decay \(\epsilon\) | 0.01 |
| Discount factor \(\gamma\) | 0.99 |
| Number of training episodes \(E_{\text{max}}\) | 500 |
| Iterations per episode \(T_{\text{num}}\) | 1000 |
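
For readers who want to reuse these settings, the values above can be collected into a single configuration object. The sketch below is only an illustration: the class name, field names, and helper methods are hypothetical and do not come from the paper. It records the tabulated values and checks two things that follow from them arithmetically: the three phase durations add up to the time slot (20 + 20 + 60 = 100 ms), and the total number of training iterations is the product of episodes and iterations per episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationParams:
    """Values from Table 2; all field names are hypothetical."""
    bandwidth_hz: float = 10e6            # B = 10 MHz
    num_primary_users: int = 6            # M
    num_vehicle_pairs: int = 12           # N
    vehicle_speed_kmh: float = 50.0       # v
    primary_tx_power_w: float = 2.0       # P_p
    max_vehicle_tx_power_w: float = 1.0   # P_max
    slot_ms: int = 100                    # T
    sensing_ms: int = 20                  # tau_1
    decision_ms: int = 20                 # tau_2
    transmission_ms: int = 60             # tau_3
    noise_power_w: float = 1e-8           # delta^2
    learning_rate: float = 0.001          # alpha (initial value)
    target_network_decay: float = 0.01    # epsilon
    discount_factor: float = 0.99         # gamma
    num_episodes: int = 500               # E_max
    iterations_per_episode: int = 1000    # T_num

    def phases_fill_slot(self) -> bool:
        # True when sensing + decision + transmission equals the slot length.
        total = self.sensing_ms + self.decision_ms + self.transmission_ms
        return total == self.slot_ms

    def transmission_fraction(self) -> float:
        # Share of each time slot spent in the transmission phase.
        return self.transmission_ms / self.slot_ms

    def total_training_iterations(self) -> int:
        # Episodes multiplied by iterations per episode.
        return self.num_episodes * self.iterations_per_episode


params = SimulationParams()
print("phases fill slot:", params.phases_fill_slot())
print("transmission fraction:", params.transmission_fraction())
print("total training iterations:", params.total_training_iterations())
```

Keeping the dataclass frozen is a design choice made for this sketch so that a run cannot silently modify its parameters; a variant experiment would construct a new instance, for example `SimulationParams(num_vehicle_pairs=8)`.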