Table 5 Key experimental parameters used in SAC-based power management.
| Parameter | Purpose | Value |
|---|---|---|
| DRX timeout | Inactivity time before entering DRX mode | 2 s |
| PSM entry delay | Minimum idle duration before entering PSM | 10 s |
| Learning rate | Actor and critic optimizer learning rate | 0.0003 |
| Discount factor (\(\gamma\)) | Future reward decay | 0.99 |
| Batch size | Number of transitions per training update | 256 |
| Max episodes | Training duration | 1000 episodes |
| Sleep mode preference | Biasing probability toward eDRX and PSM | eDRX: 0.7, PSM: 0.3 |
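
As a rough illustration, the parameters in Table 5 could be gathered into a single configuration object. The sketch below is not the authors' implementation: the class name `PowerMgmtConfig`, its field names, and the helper `sample_sleep_mode` are hypothetical, and reading the sleep mode preference as sampling weights is an assumption about how the bias is applied.

```python
import random
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PowerMgmtConfig:
    """Values copied from Table 5; all names here are illustrative."""
    drx_timeout_s: float = 2.0        # inactivity time before DRX mode
    psm_entry_delay_s: float = 10.0   # minimum idle duration before PSM
    learning_rate: float = 3e-4       # actor and critic optimizers
    gamma: float = 0.99               # discount factor
    batch_size: int = 256             # transitions per training update
    max_episodes: int = 1000          # training duration
    # Assumption: the preference values act as sampling weights.
    sleep_mode_preference: dict = field(
        default_factory=lambda: {"eDRX": 0.7, "PSM": 0.3}
    )


def sample_sleep_mode(cfg: PowerMgmtConfig, rng: random.Random) -> str:
    """Draw a sleep mode according to the configured preference weights."""
    modes = list(cfg.sleep_mode_preference)
    weights = list(cfg.sleep_mode_preference.values())
    return rng.choices(modes, weights=weights, k=1)[0]


if __name__ == "__main__":
    cfg = PowerMgmtConfig()
    rng = random.Random(0)  # fixed seed for repeatable draws
    draws = [sample_sleep_mode(cfg, rng) for _ in range(10_000)]
    print("eDRX share:", draws.count("eDRX") / len(draws))
```

Keeping the values in one frozen object makes it harder for the training loop and the environment to drift apart on settings such as the discount factor or the sleep thresholds.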