Fig. 3: Summarizing the error distributions in the experimental survey and FEP benchmark.
From: The maximal and current accuracy of rigorous protein-ligand binding free energy calculations

a Boxplots comparing the root-mean-square error (RMSE) between relative binding free energies from different experimental assays (purple) and the RMSE of the FEP+ predictions against experimental data (green). The top and bottom of each box represent the 75th and 25th percentiles, respectively, and the dark line represents the median. The whiskers extend to a maximum of 1.5 times the interquartile range. Circles are the RMSEs from comparing two experimental assays (purple) or the RMSEs of a FEP+ perturbation graph (green). The size of each circle is proportional to the number of ligands in the series, in either an assay comparison or a perturbation graph. The two largest data points in the experimental survey are from the COVID Moonshot project (ref. 68) and project A from Supplementary Table 4. The median RMSE in the experimental survey is 0.85 kcal mol⁻¹ and the median in the FEP+ benchmark is 1.08 kcal mol⁻¹. b All pairwise relative binding free energy differences from the experimental survey and all pairwise FEP+ errors. The histograms were symmetrized about the x = 0 line in the sense that all N × (N − 1) ordered pairs of compounds were used, so each pair of compounds contributes both a difference and its negative. The error distributions are bell-shaped and can be approximated by t-distributions.
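The symmetrization described for panel b can be illustrated with a minimal sketch. It assumes that each series supplies binding free energies for the same N compounds from two sources (two assays, or prediction and experiment), and that the pairwise error is the difference between the two relative binding free energies. The function names and the numerical values are hypothetical and are not taken from the paper or from FEP+.

```python
from itertools import permutations
from math import sqrt


def pairwise_errors(dg_a, dg_b):
    """Errors in relative binding free energies over all N*(N-1) ordered pairs.

    dg_a and dg_b hold binding free energies (kcal/mol) for the same N
    compounds from two sources. Using ordered pairs (i, j) with i != j means
    every error appears together with its negative, so the resulting
    distribution is symmetric about zero.
    """
    if len(dg_a) != len(dg_b):
        raise ValueError("both sources must cover the same compounds")
    errors = []
    for i, j in permutations(range(len(dg_a)), 2):
        ddg_a = dg_a[j] - dg_a[i]  # relative free energy from source A
        ddg_b = dg_b[j] - dg_b[i]  # relative free energy from source B
        errors.append(ddg_a - ddg_b)
    return errors


def rmse(errors):
    """Root-mean-square of a non-empty list of errors."""
    return sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical binding free energies (kcal/mol) for four compounds.
assay_1 = [-9.1, -8.4, -10.2, -7.6]
assay_2 = [-8.8, -8.9, -9.7, -7.9]

errors = pairwise_errors(assay_1, assay_2)
series_rmse = rmse(errors)
```

For four compounds this yields 4 × 3 = 12 errors whose sum is zero up to rounding. Whether the per-series RMSEs in panel a were computed over ordered pairs in exactly this way is an assumption of the sketch.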