Fig. 5: Model performance of drug response prediction.

a Pearson’s correlation coefficients within a group of sensitive compounds and between sensitive and resistant compounds, based on TranSiGen-derived representations. Box-and-whisker plots show the median (center line) and the 25th and 75th percentiles (lower and upper box boundaries), with whiskers indicating 1.5 × the inter-quartile range and outliers shown as individual data points. A one-sided Mann–Whitney test was used to compare the groups. The exact p values and sample sizes are in the source data. b Tanimoto similarity within a group of sensitive compounds and between sensitive and resistant compounds, based on the ECFP4 molecular fingerprint. Box plot elements and the statistical test are as in a; the exact p values and sample sizes are in the source data. c Performance of drug response prediction using various types of representations. All models were run five times with different random seeds. Black dots indicate the individual data points, and error bars represent the mean ± standard deviation. A two-sided t-test was applied between the models, and the exact p values are in the source data. d Ranking of compounds by AUCspred for models based on various types of representations. Source data are provided as a Source Data file. (**** p < 0.0001; *** 0.0001 < p ≤ 0.001; ** 0.001 < p ≤ 0.01; * 0.01 < p ≤ 0.05; ns, 0.05 < p ≤ 1.0).
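
The within-group versus between-group comparisons in panels a and b, and the significance labels at the end of the caption, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy fingerprints (sets of on-bit indices standing in for ECFP4 bits), and the handling of p exactly equal to 0.0001 (which the caption's thresholds leave unassigned) are all assumptions made for illustration.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def within_and_between(sensitive, resistant, similarity):
    """Pairwise similarities within the sensitive group, and between
    sensitive and resistant compounds (the two distributions per panel)."""
    within = [similarity(u, v) for u, v in combinations(sensitive, 2)]
    between = [similarity(u, v) for u, v in product(sensitive, resistant)]
    return within, between


def significance_label(p):
    """Map a p value to the caption's star notation.
    Assumption: p == 0.0001 is labeled '***', since the caption does not assign it."""
    if p < 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


# Hypothetical toy fingerprints, not real compounds.
sensitive = [{1, 2, 3, 4}, {1, 2, 3, 5}, {2, 3, 4, 5}]
resistant = [{7, 8, 9}, {1, 8, 9}]
within, between = within_and_between(sensitive, resistant, tanimoto)
```

Passing `pearson` and lists of representation vectors instead of `tanimoto` and bit sets gives the panel a analogue. The sketch stops before the statistical test; the two lists would typically be handed to a library routine such as `scipy.stats.mannwhitneyu` with `alternative="greater"` for a one-sided comparison.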