Table 2 Summary of the impact of removing feature clusters (selected by their respective Ward's linkage distances) on the predictive performance of the LGBM model

From: Machine learning models to accelerate the design of polymeric long-acting injectables

| No. of features | Mean absolute error (n = 10) | Standard deviation (n = 10) | Ward's linkage distance |
|---|---|---|---|
| 17 | 0.116 | 0.018 | 0.00 |
| 15 | 0.116 | 0.017 | 0.06 |
| 13 | 0.142 | 0.017 | 0.12 |
| 12 | 0.143 | 0.017 | 0.24 |
| 11 | 0.143 | 0.018 | 0.29 |
| 10 | 0.143 | 0.019 | 0.35 |
| 9 | 0.139 | 0.022 | 0.53 |
| 8 | 0.150 | 0.021 | 0.76 |
| 5 | 0.296 | 0.023 | 0.82 |
| 4 | 0.296 | 0.024 | 0.88 |

  1. The performance of the LGBM model with various numbers of input features was assessed by comparing the mean and standard deviation of the absolute error (AE) values obtained from a series of trials (n = 10), each of which randomly assigned 20% of the drug–polymer combinations to a holdout test set.
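
The evaluation protocol described in the footnote can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The function names (`repeated_group_holdout_mae`, `mean_baseline`), the fixed seed, the use of the sample standard deviation, and the synthetic demo data are all assumptions made for this example. A trivial mean predictor stands in for the LGBM model so that the sketch needs only the Python standard library; any callable with the same signature could be substituted.

```python
import random
import statistics


def repeated_group_holdout_mae(X, y, groups, fit_predict,
                               n_trials=10, test_fraction=0.2, seed=0):
    """Return (mean, std) of the per-trial mean absolute error.

    In each trial, a random `test_fraction` of the unique group labels
    (here: drug-polymer combinations) is held out as the test set, so all
    samples of a held-out combination are excluded from training.
    """
    rng = random.Random(seed)              # assumption: fixed seed for repeatability
    unique_groups = sorted(set(groups))    # group labels must be sortable
    n_test = max(1, round(test_fraction * len(unique_groups)))

    trial_maes = []
    for _ in range(n_trials):
        test_groups = set(rng.sample(unique_groups, n_test))
        train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
        test_idx = [i for i, g in enumerate(groups) if g in test_groups]

        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        abs_errors = [abs(p - y[i]) for p, i in zip(preds, test_idx)]
        trial_maes.append(statistics.fmean(abs_errors))

    # Assumption: sample standard deviation across trials; the source does
    # not state which estimator was used.
    return statistics.fmean(trial_maes), statistics.stdev(trial_maes)


def mean_baseline(X_train, y_train, X_test):
    """Hypothetical stand-in model: predicts the training-set mean."""
    mu = statistics.fmean(y_train)
    return [mu] * len(X_test)


if __name__ == "__main__":
    # Synthetic data for illustration only: 10 combinations, 3 samples each.
    demo_rng = random.Random(1)
    groups = [f"combo{i // 3}" for i in range(30)]
    X = [[demo_rng.random()] for _ in range(30)]
    y = [row[0] + 0.1 * demo_rng.random() for row in X]
    mae_mean, mae_std = repeated_group_holdout_mae(X, y, groups, mean_baseline)
    print(f"MAE = {mae_mean:.3f} +/- {mae_std:.3f} over 10 trials")
```

Splitting by combination label rather than by individual sample keeps every measurement of a held-out drug–polymer pair out of the training data. Repeating the same procedure for each reduced feature set would give one row of mean and standard deviation values like those in the table.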