Fig. 1: Comparison of the distribution of the cell count feature (“Cells_Number_Object_Number”) across three benchmark datasets.
From: Counting cells can accurately predict small-molecule bioactivity benchmarks

a 270 assays in the Moshkov et al.13 dataset (p < 2.2e-16, t = −8.572), b 209 assays in the Hofmarcher et al.12 dataset (p < 2.2e-16, t = −11.66; also analyzed in Sanchez-Fernandez et al.14), and c 201 assays in the Ha et al.15 dataset (p < 2.2e-16, t = −11.56), split by active and inactive compounds and balanced to the same number of active and inactive compounds per assay. P values were obtained using a two-sided independent-samples t-test; values beyond machine precision are reported as p < 2.2e-16. Number of assays with balanced accuracy >0.70 when assay outcomes are predicted by thresholding the normalized cell count feature at various cutoffs, in the d Moshkov dataset, e Hofmarcher dataset, and f Ha dataset. g Relationship between compound promiscuity and cell count deviation. Compounds were grouped by the number of assays in which they were active, using data from both the Moshkov and Hofmarcher datasets. Boxes represent the interquartile range (IQR; 25th–75th percentile), the horizontal line indicates the median, and whiskers extend to 1.5× IQR. Individual points represent single compounds. Red dots indicate the mean cell count. Error bars represent mean ± SD.
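
The thresholding analysis in panels d–f can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the direction of the rule (a compound is called active when its normalized cell count falls below the cutoff) are assumptions made here for illustration only. Balanced accuracy is computed in its standard form, as the mean of sensitivity and specificity.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both active and inactive compounds are required")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def predict_by_cell_count(cell_counts, cutoff):
    """Assumed rule: call a compound active if its normalized
    cell count is strictly below the cutoff."""
    return [1 if c < cutoff else 0 for c in cell_counts]


def count_assays_above(assays, cutoff, min_bal_acc=0.70):
    """Count assays whose balanced accuracy exceeds min_bal_acc
    when predicted from the cell count feature alone."""
    n = 0
    for labels, counts in assays.values():
        preds = predict_by_cell_count(counts, cutoff)
        if balanced_accuracy(labels, preds) > min_bal_acc:
            n += 1
    return n


# Hypothetical toy data: assay name -> (activity labels, normalized cell counts).
toy_assays = {
    "assay_A": ([1, 1, 0, 0], [-2.0, -1.5, 0.1, 0.3]),
    "assay_B": ([1, 0, 1, 0], [0.2, -1.2, 0.1, -1.0]),
}

# Sweep a few cutoffs, as in the "various cutoffs" of panels d-f.
for cutoff in (-2.0, -1.0, 0.0):
    print(cutoff, count_assays_above(toy_assays, cutoff))
```

In this toy example only assay_A is separable by cell count, so it is the only assay that can clear the 0.70 bar at any cutoff.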