Table 2 Hyperparameter configurations.

From: Design of a dynamic trust management and defense decision system for shared vehicle data based on blockchain and deep reinforcement learning

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Discount factor (γ) | 0.99 | Batch size | 32 |
| Learning rate (α) | 0.001 | Training rounds | 150 |
| ε (initial) | 1.0 | State space dimension | 100 |
| ε_min | 0.01 | Action space dimension | 4 |
| ε_decay | 0.995 | Target network update frequency | 50 |
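
To make the values in Table 2 concrete, the following minimal sketch collects them in a configuration object and shows how the ε parameters could drive an ε-greedy exploration schedule. The names `AgentConfig`, `epsilon_at`, and `should_sync_target` are hypothetical and do not come from the paper. The sketch also assumes that ε is multiplied by ε_decay once per update and floored at ε_min, and that the target network update frequency counts the same kind of update. The table does not state either granularity (per step or per training round), so treat both as assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    """Hyperparameters from Table 2 (field names are illustrative)."""
    gamma: float = 0.99            # discount factor
    learning_rate: float = 0.001   # alpha
    epsilon_initial: float = 1.0
    epsilon_min: float = 0.01
    epsilon_decay: float = 0.995
    batch_size: int = 32
    training_rounds: int = 150
    state_dim: int = 100
    action_dim: int = 4
    target_update_freq: int = 50   # unit not given in the table


def epsilon_at(n_updates: int, cfg: AgentConfig) -> float:
    """Exploration rate after n_updates multiplicative decays.

    Assumes a geometric schedule floored at epsilon_min; the paper's
    exact schedule is not specified in the table.
    """
    decayed = cfg.epsilon_initial * cfg.epsilon_decay ** n_updates
    return max(cfg.epsilon_min, decayed)


def should_sync_target(n_updates: int, cfg: AgentConfig) -> bool:
    """True when the target network would be refreshed under the
    assumed 'every target_update_freq updates' rule."""
    return n_updates > 0 and n_updates % cfg.target_update_freq == 0


cfg = AgentConfig()
for n in (0, 50, 150):
    print(n, round(epsilon_at(n, cfg), 4), should_sync_target(n, cfg))
```

Under these assumptions, a decay factor of 0.995 takes roughly 900 updates to bring ε from 1.0 down to the 0.01 floor. If the decay were applied only once per training round, ε would therefore still be well above ε_min after 150 rounds, which is why the decay granularity matters when reproducing the setup.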