Extended Data Fig. 3: QSAR modeling framework to predict binding lead compounds. | Nature Chemical Biology

From: Designing small molecules targeting a cryptic RNA binding site through base displacement

a, Written-out multiple linear regression (MLR) equations for the baseline, R²-focused, and Q²-focused models. The physical meaning of each physicochemical descriptor is listed in Supplementary Table 2. b, Measured ln [KD] values plotted against the values predicted by the baseline model, which did not include a data split. c, Locations of the training and test sets from the R²-focused modeling in two-dimensional chemical space constructed from PC1 and PC2 of the whole data set. d, Same as in b but for the R²-focused model. e, Average ln [KD] values predicted by our three models for each compound in the alkyne library. The magenta shaded box represents the 184 compounds that have a predicted affinity tighter than binding lead 29 (ln [KD] < −18.0). f, Rank-ordered plot of the average KD values predicted by our binding models for the 184 potential lead compounds that were identified from the alkyne library, separated by type (phenyl-X and amide-X). These compounds are classified as “out of stock”, “already have”, “counterevidence”, “difficult synthesis”, or “will test”. Here, “out of stock” refers to alkynes that were not purchasable from Enamine. “Counterevidence” refers to compounds that are chemically similar to derivatives in our library that are known to have weak affinity. “Difficult synthesis” refers to compounds that contain amino groups or protonated nitrogen atoms, which were not compatible with the reduction-free synthesis (refs. 25, 26; Supplementary Fig. 2). The vast majority of “out of stock” compounds also contained amino groups or protonated nitrogen atoms. Thus, even if they became purchasable from Enamine, they would remain synthetically inaccessible. Finally, “will test” refers to the four purchasable alkynes that were predicted to yield binding lead Cbls 45–48.
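The workflow summarized in panels a, b, d, and e (fit MLR models, score them, average their predicted ln [KD] values, and keep compounds below the −18.0 cutoff) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the synthetic descriptors, and the reading of Q² as a leave-one-out cross-validated R² are all assumptions, since the legend does not define them.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares MLR fit; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict ln KD from descriptors using fitted coefficients."""
    return coef[0] + X @ coef[1:]

def r_squared(y, y_pred):
    """Coefficient of determination between measured and predicted values."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(X, y):
    """Leave-one-out cross-validated R^2 (ASSUMED definition of Q^2)."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # hold out compound i
        coef = fit_mlr(X[mask], y[mask])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return r_squared(y, preds)

def select_leads(model_predictions, threshold=-18.0):
    """Average ln KD over models (rows) and return the indices of compounds
    whose mean is below the threshold, tightest predicted binder first."""
    mean_ln_kd = np.mean(model_predictions, axis=0)
    hits = np.flatnonzero(mean_ln_kd < threshold)
    return hits[np.argsort(mean_ln_kd[hits])], mean_ln_kd

# Synthetic stand-in data (hypothetical descriptors, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = -15.0 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=30)
coef = fit_mlr(X, y)
print("R2:", r_squared(y, predict_mlr(coef, X)))
print("Q2 (leave-one-out):", q_squared_loo(X, y))
```

In practice, `model_predictions` would hold one row per model (baseline, R²-focused, Q²-focused) and one column per library compound, so that `select_leads` reproduces the kind of cutoff drawn as the shaded box in panel e.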
