Fig. 2: Comprehensive performance evaluation results of G2D-Diff.
From: A genotype-to-drug diffusion model for generation of tailored anti-cancer small molecules

a 2D PCA of condition and drug encodings in the shared space. Each colored point represents a condition (a cell line with a desired drug response level), while black points denote drugs. PC1 separates the conditions by response class: sensitive, moderate, or resistant. PC2, in contrast, separates the conditions by genotype. Blue stars mark breast cancer cell lines with different genotypes in the moderate response class. Results for the other response classes are shown in Supplementary Fig. 2. b 3D PCA of condition and drug encodings in the shared space. PC3 provides further stratification within the sensitive and resistant categories, separating very sensitive from sensitive conditions and very resistant from resistant conditions. c Natural logarithm of the odds ratio for precision at 5 (circles) and precision at 10 (triangles). Points represent the mean, and error bands represent the standard deviation. d–f Distributions of the predicted AUCs for conditionally generated compounds across all conditions in evaluation set 1, evaluation set 2, and evaluation set 3, respectively. A two-tailed Mann–Whitney U test was conducted. g Distributions of the predicted AUCs for conditionally generated sensitive compounds from PaccMannRL (blue) and G2D-Diff (dark green), with the predicted AUCs of ground truth sensitive compounds (green) used as a positive control. h Densities of LogP, QED, and SAS for compounds generated by PaccMannRL (blue) and G2D-Diff (dark green), and for compounds randomly sampled from ChEMBL (yellow) and NCI60 (orange). Box plots show the median (center line), interquartile range (25th and 75th percentiles, box limits), whiskers (e.g., 1.5 times the interquartile range from box limits), and outliers (points outside whiskers). PC principal component, Vsen very sensitive, Sen sensitive, Mod moderate, Res resistant, Vres very resistant, Pre precision, Eval evaluation.
Source data are provided as a Source Data file.
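
Panel c reports the natural logarithm of an odds ratio for precision at 5 and at 10. The caption does not define the reference odds, so the sketch below is only one plausible reading: it assumes the odds of a hit among the top k ranked items are compared with the odds given by the overall hit rate of the ranked list. The function names, the clipping constant `eps`, and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def precision_at_k(ranked_labels, k):
    """Fraction of the top-k ranked items that are hits (1) rather than misses (0)."""
    if k <= 0 or k > len(ranked_labels):
        raise ValueError("k must be between 1 and the number of ranked items")
    return sum(ranked_labels[:k]) / k


def log_odds_ratio(precision, baseline, eps=1e-6):
    """Natural log of the odds of `precision` divided by the odds of `baseline`.

    Assumption: both rates are clipped to [eps, 1 - eps] so that a precision
    of exactly 0 or 1 still gives a finite value.
    """
    p = min(max(precision, eps), 1 - eps)
    q = min(max(baseline, eps), 1 - eps)
    return math.log((p / (1 - p)) / (q / (1 - q)))


# Toy ranking (invented data): 1 = hit, 0 = miss, best-ranked item first.
ranked = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

# Assumed baseline: the hit rate over the whole ranked list.
baseline = sum(ranked) / len(ranked)

log_or_at_5 = log_odds_ratio(precision_at_k(ranked, 5), baseline)
log_or_at_10 = log_odds_ratio(precision_at_k(ranked, 10), baseline)
```

Under this reading, a value above zero means the top k items are enriched for hits relative to the baseline, and a value of zero means no enrichment. With the toy list, precision at 10 equals the baseline by construction, so its log odds ratio is zero.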