Fig. 3: PDGrapher efficiently predicts chemical perturbagens to shift cells from diseased to treated states in held-out folds containing new samples.
From: Combinatorial prediction of therapeutic perturbations using causally inspired neural networks

a,b, PDGrapher shows improved performance across nine chemical perturbation datasets with various diseases, yielding up to 13.37% more accurately predicted samples in the testing sets compared with the second-best model (for example, for chemical-PPI-breast-MDAMB231, 20.43% versus 7.05% (a)) and up to 0.13 higher nDCG than the second-best model (for chemical-PPI-breast-MDAMB231, 0.31 versus 0.18 (b)). In a and b, the bars show the average performance across five cross-validation test splits for each of the nine chemical datasets. The overlaid points represent performance values from individual data splits (n = 5 per cell line). Each data split contains 20% of samples in the dataset, with each sample corresponding to a perturbation-response instance. Where replicates exist for a given drug, they are treated as independent inputs during training and evaluation. c, PDGrapher recovers ground-truth therapeutic targets at higher rates (evaluated by recall 1–100) compared with competing methods for chemical-PPI datasets. d, Box plots show the distribution of average model rankings across nine cell lines (n = 9); each dot corresponds to the ranking value for a distinct cell line, aggregated across cross-validation splits and across all metrics. A higher value indicates better performance. The central line inside the box represents the median, while the bottom and top edges correspond to the first and third quartiles, respectively. The whiskers extend to the smallest and largest values within 1.5× the interquartile range from the quartiles. P values from the statistical tests are provided in the Source data. e, Difference in shortest-path distance to ground-truth therapeutic genes between genes predicted by PDGrapher and a random reference, across nine cell lines.
Predominantly negative values indicate that PDGrapher predicts sets of therapeutic genes that are closer in the network to ground-truth therapeutic genes compared with what would be expected by chance (average shortest-path distances across cell lines for PDGrapher versus reference = 2.77 versus 3.11).
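
The comparison in e can be illustrated with a minimal sketch. It is not the authors' implementation: the averaging over all reachable (ground-truth, candidate) gene pairs, the uniform sampling of random reference genes, the toy path graph and the function names (`bfs_distances`, `mean_pairwise_distance`, `distance_difference`) are all assumptions made for illustration. A negative return value means the predicted genes lie closer to the ground-truth genes than randomly drawn genes do.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def mean_pairwise_distance(adj, truth, candidates):
    """Mean distance over all reachable (truth gene, candidate gene) pairs.

    Assumption: unreachable pairs are skipped; the paper may handle them differently.
    """
    total, count = 0, 0
    for t in truth:
        dist = bfs_distances(adj, t)
        for c in candidates:
            if c in dist:
                total += dist[c]
                count += 1
    return total / count if count else float("nan")


def distance_difference(adj, truth, predicted, n_random=100, seed=0):
    """Observed mean distance minus the mean over random gene sets of equal size."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = mean_pairwise_distance(adj, truth, predicted)
    reference = [
        mean_pairwise_distance(adj, truth, rng.sample(nodes, len(predicted)))
        for _ in range(n_random)
    ]
    return observed - sum(reference) / len(reference)


# Hypothetical toy network: a path graph A-B-C-D-E-F standing in for a PPI network.
toy_adj = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
toy_difference = distance_difference(toy_adj, truth=["A"], predicted=["B"])
```

In the toy graph the predicted gene is a direct neighbour of the ground-truth gene, so the difference is negative, mirroring the direction of the reported averages (2.77 for PDGrapher versus 3.11 for the reference).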