Fig. 4: Cheminformatic analyses of the diverse PNP collection. | Nature Chemistry


From: A divergent intermediate strategy yields biologically diverse pseudo-natural products


a, A PMI plot for the shape of the PNPs (black circles). The corners of the triangle within the plot indicate a rod-like shape (top left), disk-like shape (bottom middle) and sphere-like shape (top right). The contour lines represent a Gaussian kernel density estimation with ten steps. For a PMI plot with individual PNP subclasses, see Supplementary Fig. 3. b, NP-likeness scores of the PNPs (black curve) compared with the DrugBank compound collection (orange curve), ChEMBL NPs (green curve) and Enamine building blocks (blue curve). c, QED of the PNPs (black curve) compared with the DrugBank compound collection (orange curve), ChEMBL NPs (green curve) and Enamine building blocks (blue curve). d, Box plot of intra- and interclass Tanimoto similarity calculations of Morgan fingerprints (ECFC4) following Tukey’s definitions with outliers (ref. 83). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; and points, outliers. The dashed line indicates the 95th percentile median (0.23) of random reference compound subsets. For full cross-similarity values, see Supplementary Figs. 9 and 10. e, Box plot of intra- and interclass Tanimoto similarity calculations of Morgan fingerprints (ECFP6) following Tukey’s definitions with outliers (ref. 83). Centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; and points, outliers. The dashed line indicates the 95th percentile median (0.17) of random reference compound subsets. For full cross-similarity values, see Supplementary Figs. 11 and 12. The reference sets contain 527,411 compounds for Enamine (50,000 random compounds were selected for PMI analysis), 4,866 for DrugBank and 45,679 for ChEMBL NPs.
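The similarity and box-plot conventions named in panels d and e can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`tanimoto_binary`, `tanimoto_counts`, `tukey_box`), the toy fingerprints and the toy similarity values are all hypothetical, and the min/max form used for count fingerprints is assumed here as one common generalization of the Tanimoto coefficient. The quartile interpolation method is likewise an assumption, since the caption does not specify one.

```python
from statistics import quantiles


def tanimoto_binary(a: set, b: set) -> float:
    """Tanimoto similarity for bit-style fingerprints (e.g. ECFP6),
    with each fingerprint given as the set of its 'on' feature IDs."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def tanimoto_counts(a: dict, b: dict) -> float:
    """Count-based Tanimoto (assumed min/max form) for count-style
    fingerprints (e.g. ECFC4), given as {feature ID: count} dicts."""
    keys = a.keys() | b.keys()
    numerator = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    denominator = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return numerator / denominator if denominator else 0.0


def tukey_box(values: list) -> dict:
    """Box-plot statistics following Tukey's convention: whiskers reach the
    most extreme data points within 1.5x IQR of the box; the rest are outliers.
    Needs at least two values; quartiles use linear interpolation (assumption)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Toy data only; these are not values from the figure.
toy_similarities = [0.10, 0.12, 0.14, 0.15, 0.16, 0.18, 0.60]
box = tukey_box(toy_similarities)
print(tanimoto_binary({1, 2, 3}, {2, 3, 4}))
print(tanimoto_counts({"a": 2, "b": 1}, {"a": 1, "c": 1}))
print(box)
```

In this reading, each box in panels d and e would summarize the pairwise similarities within one compound class or between two classes, with the dashed reference line drawn separately from the random reference subsets.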
