Fig. 9: Analysis of exploration strategies in model-based RL for the 5-dimensional Ackley function.

a Impact of different epsilon values (ε = 0.05–0.9) during the design stage; a higher ε means more random exploration. b Effect of batch size (1, 2, 4, 8, 16) on optimization performance, illustrating the trade-off between parallel experimentation and learning efficiency. In both panels, the y-axis shows the best-so-far objective value (lower is better) against the number of experiments conducted, and the dashed line marks the global optimum.
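To make the plotted quantities concrete, the following is a minimal sketch rather than the implementation behind the figure. It assumes the standard Ackley parameters (a = 20, b = 0.2, c = 2π), for which the global optimum is 0 at the origin, and it assumes a simple ε-greedy rule over a finite candidate set for the design stage. The names `ackley`, `best_so_far`, and `select_batch` are hypothetical and used only for illustration.

```python
import math
import random
from itertools import accumulate


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Standard Ackley function (assumed parameters); minimum 0 at the origin."""
    d = len(x)
    sum_sq = sum(xi * xi for xi in x)
    sum_cos = sum(math.cos(c * xi) for xi in x)
    return (-a * math.exp(-b * math.sqrt(sum_sq / d))
            - math.exp(sum_cos / d) + a + math.e)


def best_so_far(values):
    """Running minimum of observed values: the y-axis quantity in both panels."""
    return list(accumulate(values, min))


def select_batch(predicted, epsilon, batch_size, rng):
    """Assumed epsilon-greedy design step over candidate indices.

    With probability epsilon a random remaining candidate is chosen (explore);
    otherwise the candidate with the lowest predicted value is chosen (exploit).
    """
    remaining = list(range(len(predicted)))
    batch = []
    for _ in range(min(batch_size, len(remaining))):
        if rng.random() < epsilon:
            choice = rng.choice(remaining)
        else:
            choice = min(remaining, key=lambda i: predicted[i])
        remaining.remove(choice)
        batch.append(choice)
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    # Random 5-dimensional designs stand in for the experiments here.
    designs = [[rng.uniform(-5.0, 5.0) for _ in range(5)] for _ in range(20)]
    observed = [ackley(x) for x in designs]
    print(best_so_far(observed))
```

In this sketch, a larger `epsilon` makes `select_batch` ignore the model's predictions more often, while a larger `batch_size` commits more experiments per round before any new observation can update those predictions.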