Table 2 Comparison of the optimized structures in the gas phase for different datasets

From: Accelerating reliable multiscale quantum refinement of protein–drug systems enabled by machine learning

 

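| | QR50ᵃ | QR50ᵃ | QR50ᵃ | PB20-QM-3kᵃ | PB20-QM-3kᵃ | PB20-QM-3kᵃ | PB20-QM-8kᵃ | PB20-QM-8kᵃ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | AIQM1ᵇ,ᶜ | ANI-2xᶜ | xTB | AIQM1ᵇ,ᶜ | ANI-2xᶜ | xTB | ANI-2x | xTB |
| All systems | (50)ᵈ | | | (3156)ᵈ | | | (8776)ᵈ | |
| Bond (Å) | 0.005 | 0.006 | 0.008 | 0.004 | 0.003 | 0.006 | 0.003 | 0.006 |
| Angle (°) | 0.6 | 0.9 | 0.8 | 0.4 | 0.5 | 0.6 | 0.6 | 0.7 |
| Dihedral (°) | 11.6 | 16.1 | 11.2 | 26.7 | 32.0 | 28.0 | 32.6 | 29.0 |
| Neutral group(s) | (24)ᵈ | | | (3061)ᵈ | | | (7260)ᵈ | |
| Bond (Å) | 0.004 | 0.003 | 0.007 | 0.004 | 0.003 | 0.006 | 0.003 | 0.006 |
| Angle (°) | 0.5 | 0.6 | 0.7 | 0.4 | 0.4 | 0.6 | 0.5 | 0.7 |
| Dihedral (°) | 10.6 | 12 | 10.6 | 27.2 | 32.5 | 28.6 | 35.2 | 31.6 |
| Charged group(s) | (26)ᵈ | | | (95)ᵈ | | | (1516)ᵈ | |
| Bond (Å) | 0.006 | 0.009 | 0.010 | 0.005 | 0.004 | 0.005 | 0.004 | 0.006 |
| Angle (°) | 0.7 | 1.3 | 0.9 | 0.6 | 0.8 | 0.8 | 0.8 | 0.8 |
| Dihedral (°) | 12.7 | 21 | 11.9 | 18.2 | 25.0 | 18.5 | 25.9 | 21.5 |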

  1. ᵃ QR50: our selected 50 drugs/inhibitors (Fig. 1b and Supplementary Figs. 14); PB20-QM-3k: the smallest subset of PB20-QM, containing 3156 drugs/inhibitors (3125 molecules containing C, H, O and/or N elements, 15 molecules containing F, Cl and/or S elements, and 16 molecules containing B, P, Se, Br and/or I elements); PB20-QM-8k: a smaller subset of PB20-QM, containing 8776 drugs/inhibitors composed of C, H, O, N, F, Cl and/or S elements.
  2. ᵇ The ONIOM(MLP:ANI-2x) method was used for molecules containing F, Cl and/or S elements when AIQM1 was used.
  3. ᶜ The ONIOM(MLP:SE) method was used for molecules containing B, P, Se, Br and/or I elements when AIQM1 or ANI-2x was used.
  4. ᵈ Number of drugs/inhibitors.
  5. Median absolute deviation (MAD) of the optimized bond distances, angles, and rotatable dihedrals in the gas phase using the AIQM1, ANI-2x, and GFN2-xTB (xTB) methods, compared to the density functional theory (DFT, ωB97X-D/6-31G(d)) method, for the QR50, PB20-QM-3k, and PB20-QM-8k datasets.
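
As an illustration of the metric reported in this table, the sketch below computes a median absolute deviation between a method's optimized geometric parameters and a DFT reference. It assumes that MAD here means the median of the paired absolute differences, and that dihedral differences are wrapped to account for 360° periodicity; neither detail is stated in the table notes. The function names and the sample bond lengths are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_abs_deviation_vs_reference(values, reference):
    """Median of |value - reference| over paired parameters (assumed definition)."""
    if len(values) != len(reference):
        raise ValueError("values and reference must have the same length")
    if not values:
        raise ValueError("need at least one pair of values")
    return median(abs(v - r) for v, r in zip(values, reference))


def dihedral_difference(a, b):
    """Smallest absolute difference between two angles in degrees (periodic in 360)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def median_dihedral_deviation(values, reference):
    """Median of the wrapped dihedral differences (assumed treatment of periodicity)."""
    if len(values) != len(reference):
        raise ValueError("values and reference must have the same length")
    if not values:
        raise ValueError("need at least one pair of values")
    return median(dihedral_difference(v, r) for v, r in zip(values, reference))


# Made-up bond lengths in Å, for illustration only (not data from the paper).
dft_bonds = [1.530, 1.090, 1.340, 1.230]
mlp_bonds = [1.534, 1.092, 1.333, 1.231]
bond_mad = median_abs_deviation_vs_reference(mlp_bonds, dft_bonds)

# Made-up dihedrals in degrees; 179 and -179 differ by 2, not 358.
dft_dihedrals = [179.0, 60.0, -65.0]
mlp_dihedrals = [-179.0, 72.0, -60.0]
dihedral_mad = median_dihedral_deviation(mlp_dihedrals, dft_dihedrals)
```

Wrapping matters mainly for the dihedral rows: without it, a rotatable bond near ±180° could contribute a spurious deviation close to 360°.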