Fig. 4: Systematic evaluation of drug discovery (AstraZeneca, PubChem) and quantum mechanics (QMugs) datasets in the transductive setting.

The test set performance of high-fidelity models with augmentations based on sum and neural readout-based low-fidelity (‘LF’) models, including fine-tuning (denoted by ‘Tune’; see the “Methods” section), is presented. ‘Emb.’ corresponds to the incorporation of low-fidelity embeddings, ‘Pred. lbl.’ corresponds to the predicted labels (outputs) of LF models, ‘Hyb. lbl.’ signifies training with raw labels and evaluating on LF-predicted labels, and ‘readout’ denotes the graph readout function. The AstraZeneca datasets are named after their high-fidelity (DR, dose-response) and low-fidelity (SD, single dose) datasets. The abbreviations are: AZ AstraZeneca, AID assay identifier, VGAE variational graph autoencoder, MAE mean absolute error, DFT density-functional theory. Results for the remaining datasets are available in Supplementary Figs. 2 and 5 to 7, with random forest and support vector machine results in Supplementary Figs. 3, 4 and 8 to 11. Source data are provided as a Source Data file.