Fig. 4: Permeability predictions at various train-test splits.

a Coefficient of determination (R²). b Order of magnitude error (OME). R² evaluates the predictive performance of a model, whereas OME measures prediction error in orders of magnitude, expressed as the logarithm of the mean absolute error. The ST and MT models are compared at varying sizes of the unseen test set, given as a percentage of the full dataset; larger test sets illustrate the impact of reducing the training data. At 80%, for example, the model is trained on only 20% of the dataset and tested on the remaining 80%, reflecting a data-scarce regime with limited chemical coverage. The MT models show significant improvement over the ST model, particularly at larger test-set percentages.
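
For readers who want to reproduce the two metrics in this caption, the following is a minimal sketch rather than the authors' implementation. It assumes that OME is the base-10 logarithm of the mean absolute error, as the caption's wording suggests (the base is not stated in the source), and that splits are drawn at random. The function names `r2_score`, `order_of_magnitude_error`, and `split_indices` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def order_of_magnitude_error(y_true, y_pred):
    """OME as described in the caption: logarithm of the mean absolute error.

    Assumption: base-10 logarithm, since the metric is read in orders of
    magnitude. The source does not state the base explicitly.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = np.mean(np.abs(y_true - y_pred))
    return np.log10(mae)


def split_indices(n_samples, test_fraction, seed=0):
    """Random train/test index split; test_fraction=0.8 trains on only 20%.

    Assumption: a simple random split. The source does not say how the
    splits were drawn.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    return perm[n_test:], perm[:n_test]  # (train indices, test indices)
```

With these helpers, sweeping `test_fraction` over several values and evaluating both metrics on each held-out set would give one point per split for each panel.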