Extended Data Fig. 2: The distribution of correlations between the predicted and actual gene expression values across the cohort samples.

The violin plots depict the correlations between the predicted and measured expression values across the cohort samples obtained by HE2RNA (light pink) and DeepPT (light blue) for all genes (a) and for the 1,000 (b), 2,000 (c) and 3,000 (d) genes with the highest correlations. The results presented in this figure are means over 5 folds, as reported in ref. 44. All other results presented in this study were computed over the entire set of test samples, consistent with the approach used in ref. 49. P-values were calculated using the one-sided Mann-Whitney U test. The central mark in each violin plot indicates the median. The number of patients in each cohort is shown in parentheses.
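
The quantities summarised in the violins can be illustrated with a minimal sketch. It assumes Pearson correlation computed per gene across samples, synthetic data in place of the cohort data, and a one-sided test direction ("greater") chosen only for illustration; the function names `per_gene_pearson` and `top_k` are hypothetical, and the averaging over 5 folds is omitted for brevity.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def per_gene_pearson(pred, actual):
    """Pearson r for each gene (column), computed across samples (rows)."""
    pc = pred - pred.mean(axis=0)
    ac = actual - actual.mean(axis=0)
    denom = np.sqrt((pc ** 2).sum(axis=0) * (ac ** 2).sum(axis=0))
    return (pc * ac).sum(axis=0) / denom


def top_k(corrs, k):
    """The k largest correlations, in descending order."""
    return np.sort(corrs)[::-1][:k]


# Synthetic stand-in data (assumption): 100 samples, 500 genes.
rng = np.random.default_rng(0)
actual = rng.normal(size=(100, 500))
pred_a = actual + rng.normal(scale=1.0, size=actual.shape)  # less noisy model
pred_b = actual + rng.normal(scale=3.0, size=actual.shape)  # noisier model

r_a = per_gene_pearson(pred_a, actual)
r_b = per_gene_pearson(pred_b, actual)

# One-sided Mann-Whitney U test on the top-100 genes of each model.
# The direction (model A greater than model B) is an assumption.
stat, p_value = mannwhitneyu(top_k(r_a, 100), top_k(r_b, 100),
                             alternative="greater")
print(f"median r (A) = {np.median(r_a):.3f}, "
      f"median r (B) = {np.median(r_b):.3f}, p = {p_value:.3g}")
```

Each model's top-k set is selected from its own correlations here, so the two sets may contain different genes; whether the figure uses the same convention is not stated in the caption.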