Fig. 4: Using the model quadruples the hit rate.
From: Predictive design of crystallographic chiral separation

A Enrichment factor as a function of the number of selected top predictions, estimated using a 5-fold cross-validation experiment where each fold corresponds to unseen chiral salts. The blue line corresponds to encoding all of the participating molecules as random numbers, the orange line corresponds to encoding the molecular structures with Morgan fingerprints, the red line corresponds to another deep learning approach showing state-of-the-art results for chiral chromatography (ref. 51), and the green line corresponds to this work. The shaded area indicates the standard deviation estimated from an ensemble of five models. Note that the 8% base rate is the proportion of hits in the lower-noise subset of the data. B The model trained with the two-step approach outperforms the models directly trained on all training data, either as a classifier or a regressor. C Model performance, measured by average precision, systematically improves as more training data are added. Each point represents the mean performance of an ensemble of 10 models, and the error bars correspond to the standard deviation of the ensemble performance.
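The enrichment factor plotted in panel A can be illustrated with a minimal sketch. It assumes the common definition (hit rate among the k top-scored candidates divided by the overall base rate), which the caption does not spell out; the function name `enrichment_factor` and the toy scores and labels are hypothetical and are not taken from the paper's data or code.

```python
def enrichment_factor(scores, labels, k):
    """Hit rate among the k highest-scored candidates divided by the base rate.

    Assumed definition, not confirmed by the caption.
    scores: predicted scores, higher means more likely to be a hit.
    labels: 1 for a hit, 0 otherwise, aligned with scores.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of candidates")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("enrichment is undefined when there are no hits")
    # Rank candidates from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_k_hit_rate = sum(label for _, label in ranked[:k]) / k
    return top_k_hit_rate / base_rate


# Hypothetical toy data: 25 candidates with 2 hits, i.e. an 8% base rate.
toy_scores = [1.0 - 0.03 * i for i in range(25)]
toy_labels = [1, 0, 0, 1, 0] + [0] * 20

for k in (1, 5, 25):
    print(k, round(enrichment_factor(toy_scores, toy_labels, k), 2))
```

Under this definition, selecting every candidate always gives an enrichment factor of 1, and a random ranking hovers around 1 on average, which is the natural reference for comparing the curves in panel A.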