Scientific Reports

Table 1 Hyperparameter configuration of the M2ACD.

From: Deep reinforcement learning trajectory planning for robotic manipulator based on simulation-efficient training

Parameter	Value
Experience replay buffer	1000
Batch	16
Learning rate	\(1\times 10^{-2}\)
Network update rate	100
Exploration rate	\(1\times 10^{-2}\)
Epsilon decay	10
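
As a minimal sketch, the values in Table 1 could be collected in a single immutable configuration object for use in a training script. The class name `M2ACDConfig` and all field names are hypothetical and do not come from the article. The table does not state the units of the network update rate or the epsilon decay (for example steps or episodes), so the sketch stores the raw values without interpreting them.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class M2ACDConfig:
    """Hyperparameters copied from Table 1 (field names are assumptions)."""

    replay_buffer_size: int = 1000    # "Experience replay buffer"
    batch_size: int = 16              # "Batch"
    learning_rate: float = 1e-2       # "Learning rate"
    network_update_rate: int = 100    # "Network update rate" (units not stated)
    exploration_rate: float = 1e-2    # "Exploration rate"
    epsilon_decay: int = 10           # "Epsilon decay" (units not stated)


config = M2ACDConfig()

# Print each parameter on its own line, tab-separated like the table.
for name, value in asdict(config).items():
    print(f"{name}\t{value}")
```

Freezing the dataclass keeps the reported values from being changed by accident during a run; a variant for an ablation can be created with `dataclasses.replace(config, batch_size=32)`.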
