Fig. 2: Error evaluation for the presented models.

a Comparison of the predicted \(\kappa^{\rm SISSO}(300\,{\rm K})\) against the measured \(\kappa_{\rm L}(300\,{\rm K})\) for the model trained on all data. The gray shaded region corresponds to the 95% confidence interval. b Violin plots of the mean prediction error of all samples for the SISSO, KRR, and GPR models using all features (red, left) and a reduced set including only \(\sigma_{\rm A}\), \(\Theta_{{\rm D},\infty}\), and \(V_{\rm m}\) (blue, right), as well as for the Slack model. Gray lines are the medians and white circles are the means of the distributions; the boxes represent the quartiles, and the whiskers mark the minimum and the 95th-percentile absolute error. The red stars and blue hexagons are the outliers of the box plots. For all calculations, the parameterization depth and dimension are determined by cross-validation on each training set. c A map of the two-dimensional SISSO model, where the x- and y-axes correspond to the two features selected by SISSO. The labeled points are those on the convex hull of the scatter plot, along with related points.