Table 5 Parameter settings.
| Parameter | Value/configuration |
|---|---|
| Learning rate (actor) | 3 × 10⁻⁴ |
| Learning rate (critic) | 1 × 10⁻³ |
| Learning rate (fuzzy layer) | 1 × 10⁻³ |
| Discount factor (γ) | 0.97 |
| Entropy coefficient (β) | 0.002 |
| Batch size | 64 |
| Max steps per episode | 1000 |
| Number of episodes | 150 |
| Optimizer | Adam |
| Hidden layers (actor/critic) | 2 × 256 neurons |
| Activation function | ReLU (hidden), softmax (output) |
| Reward weights (α₁, α₂, α₃, α₄) | (0.35, 0.25, 0.25, 0.15) |
| Fuzzy rules | 81 (adaptive Takagi–Sugeno) |
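
For reproducibility, the settings in Table 5 can be collected in a single configuration object. The following is a minimal sketch, not the authors' implementation: the class name `TrainingConfig`, its field names, and the helper `combined_reward` are hypothetical, and the weighted-sum form of the reward is an assumption inferred from the presence of the weights α₁ to α₄.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from Table 5; the field names are illustrative only."""
    lr_actor: float = 3e-4
    lr_critic: float = 1e-3
    lr_fuzzy: float = 1e-3
    gamma: float = 0.97            # discount factor
    entropy_coef: float = 0.002    # entropy coefficient (beta)
    batch_size: int = 64
    max_steps_per_episode: int = 1000
    num_episodes: int = 150
    optimizer: str = "Adam"
    hidden_sizes: Tuple[int, int] = (256, 256)  # 2 hidden layers of 256 units
    reward_weights: Tuple[float, float, float, float] = (0.35, 0.25, 0.25, 0.15)
    num_fuzzy_rules: int = 81

    def __post_init__(self) -> None:
        # Basic sanity checks on the tabulated values.
        if not 0.0 < self.gamma < 1.0:
            raise ValueError("gamma must lie strictly between 0 and 1")
        if abs(sum(self.reward_weights) - 1.0) > 1e-9:
            raise ValueError("reward weights are expected to sum to 1")


def combined_reward(components: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form: r = sum_i alpha_i * r_i (weighted sum of reward terms)."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per reward component")
    return sum(w * r for w, r in zip(weights, components))


config = TrainingConfig()
```

The validation step reflects two properties of the tabulated values: the four reward weights sum to 1, and γ = 0.97 corresponds to an effective horizon of roughly 1/(1 − γ) ≈ 33 steps.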