Fig. 6: Validating and interpreting AI-Bind predictions.
From: Improving the generalizability of protein-ligand binding predictions with AI-Bind

a Distribution of binding affinities for the top and bottom 100 predictions made by AI-Bind’s VecNet over viral and human proteins associated with COVID-19. We ran docking on the top 84 and the bottom 44 predictions. The top binding predictions (blue) of AI-Bind show lower binding energies (stronger binding) than the bottom predictions (orange). With a binding threshold of −1.75 kcal mol⁻¹, 88% of the top pairs predicted by AI-Bind are in line with the docking simulations. b Confusion matrix for the top and bottom predictions from AI-Bind. We obtain the true labels by applying the threshold of −1.75 kcal mol⁻¹ (gray dashed line) to the binding affinities from docking. The AI-Bind predictions yield a high F1-score and are significantly better than random selection. c Binding probability profile for the human protein Trim59. Multiple valleys in the profile map directly to the amino acid residues to which the ligands bind and indicate the active binding sites on the amino acid sequence. We identify the valleys in the binding probability profiles for three ligands, Pipecuronium, Buprenorphine and Voclosporin, which bind at different pockets on Trim59. The valleys for these pockets are mapped back to the amino acid sequence (valleys 1A, 1B, 1C, 1D, and 1E for pocket 1, valleys 2A and 2B for pocket 2, and valleys 3A and 3B for pocket 3). We also highlight the secondary structure of Trim59 obtained from the amino acid sequence. Valleys containing β-pleated sheets and coils are more prone to binding than those containing α-helices (refs. 52,53,54,55). Combining the binding probability profile with the secondary protein structure allows us to identify active binding sites, which guides the design of an optimal search grid for docking simulations. Source data are provided as a Source Data file.
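The labeling step in panel b can be illustrated with a minimal sketch. It assumes that a docking energy at or below the −1.75 kcal mol⁻¹ threshold counts as binding, that top predictions are treated as predicted binders and bottom predictions as predicted non-binders, and it uses made-up energies rather than the values from the Source Data file. The function names are hypothetical and are not taken from the AI-Bind code base.

```python
# Illustrative sketch only: energies below are invented, not the paper's data.
THRESHOLD = -1.75  # kcal/mol, binding threshold stated in the caption


def confusion_from_docking(top_energies, bottom_energies, threshold=THRESHOLD):
    """Build confusion-matrix counts from docking energies.

    Assumption: energy <= threshold means docking supports binding.
    Top predictions are predicted positives, bottom ones predicted negatives.
    """
    tp = sum(e <= threshold for e in top_energies)     # predicted binder, docking agrees
    fp = len(top_energies) - tp                        # predicted binder, docking disagrees
    fn = sum(e <= threshold for e in bottom_energies)  # predicted non-binder, docking says binder
    tn = len(bottom_energies) - fn                     # predicted non-binder, docking agrees
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical docking energies (kcal/mol) for a handful of pairs.
top = [-3.2, -2.5, -1.9, -1.0]
bottom = [-0.5, -1.2, -2.0]

tp, fp, fn, tn = confusion_from_docking(top, bottom)
f1 = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} F1={f1:.2f}")
```

With these toy values, three of the four top pairs fall below the threshold, which is the same kind of agreement rate the caption reports as 88% for the real data.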