Table 3 The setup of hyper-parameters in gplearn for GPSR.

From: Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts

| Parameter | Value |
| --- | --- |
| population size | 5000 |
| generations | 20 |
| stopping criteria | 0.01 (eV) |
| pc | 0.5, 0.95 (step = 0.025) |
| ps | (1-pc)/3, (0.92-pc)/3 (step = 0.01) |
| ph | ps |
| pp | 1-pc-ps-ph |
| function set | add, sub, mul, div, sqrt |
| parsimony coefficient | 0.0005, 0.0015 (step = 0.0005) |
| tournament size | 20 |
| metric | mean absolute error (MAE) |
| constant range | (−1, 1) |

  1. The hyperparameters are defined as follows. population size: the number of mathematical formulas in each generation; generations: the maximum number of generations; stopping criteria: the MAE value at which the program stops; pc: crossover probability; ps: subtree mutation probability; ph: hoist mutation probability; pp: point mutation probability; function set: the basic building blocks, consisting of mathematical operators; parsimony coefficient: a constant that penalizes large individuals by adjusting their MAE to make them less favorable for selection; tournament size: the number of individuals in each tournament; metric: the measure of how well an individual fits; constant range: the range of constants that may appear in a mathematical formula.
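
The settings in Table 3 can be sketched as a grid of keyword arguments for gplearn's `SymbolicRegressor`. This is a minimal illustration, not the authors' script. It assumes that "0.5, 0.95 (step = 0.025)" and "0.0005, 0.0015 (step = 0.0005)" denote inclusive ranges, and for brevity it fixes ps at one of the two listed endpoints, (1-pc)/3, instead of scanning ps. The helpers `grid` and `mutation_probs` are hypothetical names introduced here; the dictionary keys are the corresponding `SymbolicRegressor` parameter names.

```python
from itertools import product


def grid(start, stop, step):
    """Inclusive, evenly spaced grid (assumed reading of 'a, b (step = s)')."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n + 1)]


def mutation_probs(pc, ps):
    """Derive ph and pp from pc and ps using the relations in Table 3."""
    ph = ps                    # ph = ps
    pp = 1.0 - pc - ps - ph    # pp = 1 - pc - ps - ph
    return {
        "p_crossover": pc,
        "p_subtree_mutation": ps,
        "p_hoist_mutation": ph,
        "p_point_mutation": pp,
    }


# Settings that Table 3 holds fixed across all runs.
FIXED = {
    "population_size": 5000,
    "generations": 20,
    "stopping_criteria": 0.01,
    "tournament_size": 20,
    "function_set": ("add", "sub", "mul", "div", "sqrt"),
    "metric": "mean absolute error",
    "const_range": (-1.0, 1.0),
}

pc_values = grid(0.5, 0.95, 0.025)
parsimony_values = grid(0.0005, 0.0015, 0.0005)

configs = []
for pc, parsimony in product(pc_values, parsimony_values):
    ps = (1.0 - pc) / 3.0  # assumption: only this endpoint of the ps range
    cfg = dict(FIXED, parsimony_coefficient=parsimony)
    cfg.update(mutation_probs(pc, ps))
    configs.append(cfg)
```

With gplearn installed, each entry could then be passed on as `SymbolicRegressor(**cfg)` and fitted to the training data. Deriving ph and pp inside one helper keeps the four operator probabilities tied to the relations in the table, so they sum to 1 up to floating-point rounding.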