Fig. 2
From: Crowd-sourcing materials-science challenges with the NOMAD 2018 Kaggle competition

A comparison of the distributions of the absolute errors for the training set (blue) and test set (red) of the formation energy (Ef, left) and bandgap energy (Eg, right) from the three winning representations of the competition (4-gram, c/BOP, and SOAP) combined with the KRR/GPR, NN, and LGBM regression models. The mean absolute errors (MAE) of the test set (orange cross) and training set (orange filled circle) are provided. Boxplots are included for each training- and test-set distribution to indicate the 25th, 50th, and 75th percentiles of the absolute errors. The box and violin plots extend only to the 95th percentile. For the training-set predictions, the maximum absolute error in the formation (bandgap) energy for 4-gram + KRR, c/BOP + LGBM, and SOAP + NN is 103 meV/cation (1047 meV), 185 meV/cation (606 meV), and 376 meV/cation (497 meV), respectively. The corresponding maximum absolute test errors are 282 meV/cation (1112 meV), 276 meV/cation (1680 meV), and 286 meV/cation (1198 meV) for the 4-gram + KRR, c/BOP + LGBM, and SOAP + NN models, respectively.
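
The summary statistics named in this caption (MAE, the 25th/50th/75th percentiles, the 95th-percentile cutoff, and the maximum absolute error) can be illustrated with a minimal Python sketch. The function names `percentile` and `abs_error_summary`, the linear-interpolation percentile convention, and the input values are assumptions made for illustration only; they are not the authors' code or data.

```python
from statistics import mean


def percentile(sorted_vals, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a sorted list.

    Assumption: linear interpolation between neighbouring ranks; the
    convention used for the published figure is not stated in the caption.
    """
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def abs_error_summary(y_true, y_pred):
    """Summarize absolute errors between reference and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    errors = sorted(abs(t - p) for t, p in zip(y_true, y_pred))
    return {
        "mae": mean(errors),              # mean absolute error
        "p25": percentile(errors, 25),    # lower edge of the box
        "p50": percentile(errors, 50),    # median
        "p75": percentile(errors, 75),    # upper edge of the box
        "p95": percentile(errors, 95),    # where the plots are cut off
        "max": errors[-1],                # maximum absolute error
    }


# Hypothetical reference and predicted energies in meV (made-up numbers).
y_true = [100, 200, 300, 400, 500]
y_pred = [120, 180, 330, 400, 600]

summary = abs_error_summary(y_true, y_pred)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this toy input a single large error pulls the maximum well above the 95th percentile, which is why the caption reports the maximum absolute errors separately from the plotted distributions.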