Fig. 6: Ablation study of the proposed PSE method.
From: Discovering physical laws with parallel symbolic enumeration

a, Comparison of token generators: random, MCTS and GP. b, Ablation of the range of token constants. c, Ablation of the DR mask on two different PSE configurations: (c.1), a 4-input, 3-layer PSRN with the \({{\mathscr{O}}}_{{\rm{Koza}}}\) library (that is, {+, ×, −, ÷, identity, sin, cos, exp, log}); and (c.2), a 5-input, 3-layer PSRN with the \({{\mathscr{O}}}_{{\rm{SemiKoza}}}\) library (that is, {+, ×, SemiSub, SemiDiv, identity, neg, inv, sin, cos, exp, log}). d–f, SRBench (ref. 46) results with GP as the token generator. Our method achieves the highest symbolic recovery rate (d) across all noise levels (that is, 0, 0.1%, 1% and 10% of the data’s standard deviation) while maintaining competitive model complexity (e) and training time (f). g, Expression search efficiency of the PSRN module. h, Space complexity of PSRN with three symbol layers, as a function of the number of input slots and the memory footprint. Four operator sets are compared: \({{\mathscr{O}}}_{{\rm{Koza}}}\), \({{\mathscr{O}}}_{{\rm{SemiKoza}}}\), \({{\mathscr{O}}}_{{\rm{Arithmetic}}}\) (that is, {+, ×, −, ÷, identity}) and \({{\mathscr{O}}}_{{\rm{BasicKoza}}}\) (that is, {+, ×, identity, neg, inv, sin, cos, exp, log}). As the available memory grows, the model scales by evaluating more candidate expressions in a single forward pass, which improves performance. Runtime budgets are kept strictly consistent across methods; the remaining variation in runtime arises because the algorithms trigger their early-stopping conditions at different times (for example, reaching an MSE below a small threshold or discovering a symbolically equivalent expression). Error bars represent the 95% confidence interval of each metric. Results are averaged across 20 independent runs for each SR problem.
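To make the space-complexity trend in panel h concrete, the sketch below is a minimal back-of-the-envelope count, not the paper’s implementation: it assumes each PSRN symbol layer applies every unary operator to every input slot and every binary operator to every ordered pair of slots, so the slot count (and hence memory footprint) grows roughly quadratically per layer. The unary/binary operator counts are read off the four libraries listed above; the enumeration scheme itself is an assumption.

```python
# Hypothetical sketch (not from the paper): back-of-the-envelope count of the
# candidate subexpressions a layered enumeration network such as PSRN holds
# after each symbol layer. Assumes every unary operator is applied to every
# slot and every binary operator to every ordered pair of slots; symmetric
# operators (for example, SemiSub and SemiDiv) could instead use unordered
# pairs, n * (n + 1) // 2, roughly halving the count.

# (unary ops including identity, binary ops), read off the libraries above.
OPERATOR_SETS = {
    "Arithmetic": (1, 4),  # {identity} and {+, x, -, /}
    "Koza": (5, 4),        # {identity, sin, cos, exp, log} and {+, x, -, /}
    "SemiKoza": (7, 4),    # {identity, neg, inv, sin, cos, exp, log} and {+, x, SemiSub, SemiDiv}
    "BasicKoza": (7, 2),   # {identity, neg, inv, sin, cos, exp, log} and {+, x}
}


def slots_after_layer(n: int, n_unary: int, n_binary: int) -> int:
    """Slots produced by one layer: unary ops applied to each slot, plus
    binary ops applied to each ordered pair of slots."""
    return n_unary * n + n_binary * n * n


def total_slots(n_inputs: int, n_layers: int, op_set: str) -> int:
    """Slot count after stacking n_layers symbol layers on n_inputs slots."""
    n_unary, n_binary = OPERATOR_SETS[op_set]
    n = n_inputs
    for _ in range(n_layers):
        n = slots_after_layer(n, n_unary, n_binary)
    return n


if __name__ == "__main__":
    # Growth for a 4-input, 3-layer network, mirroring configuration (c.1).
    for name in OPERATOR_SETS:
        print(f"{name:>10}: {total_slots(4, 3, name):,} candidate slots")
```

Under these assumptions the slot count compounds quadratically across layers, which is consistent with the caption’s point that a larger memory footprint directly buys more candidate expressions per forward pass, and that richer operator sets (more unary or binary operators) reach a given memory budget at fewer input slots.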