Table 3 Stability of feature selection under resampling, measured with the Information Gain Ratio (IGR). The table reports the mean IGR and the selection frequency of each feature over 100 bootstrap resampling iterations. Higher selection frequencies indicate more stable features that carry more weight in stroke-risk prediction on the CHS dataset.

From: HMLA: A hybrid machine learning approach for enhancing stroke prediction models with missing data imputation techniques

| Attributes | Description | Mean IGR | Selection Frequency (%) |
|---|---|---|---|
| AGE | AGE IN 2-YEAR CATEGORIES | 0.69451 | 98% |
| GENDER | GENDER | 0.00215 | 63% |
| MARIT | MARITAL STATUS | 0.09215 | 92% |
| HEALTH | GENERAL HEALTH | 0.01342 | 74% |
| HIBP | HIGH BLOOD PRESSURE | 0.5492 | 82% |
| DIABETES | CALC. DIAB STATUS | 0.06493 | 85% |
| SMOKE | SMOKING STATUS | 0.01145 | 71% |
| STEPS | HAVE DIFFICULTY WALKING | 0.00514 | 65% |
| EXINTEN | EXERCISE INTENSITY | 0.02648 | 75% |
| OVRWT | OBESITY | 0.05941 | 83% |
| HRATE | HEART RATE | 0.03841 | 78% |
| ALCOH | WEEKLY ALCOHOL CONSUMPTION | 0.03248 | 78% |
| Hibsug | HIGH BLOOD SUGAR STATUS | 0.06187 | 83% |
| BMI | BODY MASS INDEX | 1.2457 | 94% |
| HYPER | CALC. HTN STATUS | 0.02518 | 74% |
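
The two reported quantities (mean IGR and selection frequency over bootstrap resamples) can be illustrated with a minimal sketch. It assumes discrete feature columns and uses the textbook gain ratio (information gain divided by the feature's split information). The source does not state how a feature counts as "selected" in a given resample, so the `threshold` rule below is a hypothetical stand-in, and the function names and toy data are illustrative only, not the authors' implementation or the CHS data.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, target):
    """Information gain of target given feature, divided by the split information."""
    n = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0.0:  # constant feature: no split, define the ratio as 0
        return 0.0
    return (entropy(target) - conditional) / split_info


def bootstrap_igr(columns, target, n_iter=100, threshold=0.01, seed=0):
    """Return {name: (mean IGR, selection frequency in %)} over bootstrap resamples.

    ASSUMPTION: a feature is "selected" in a resample when its IGR >= threshold.
    """
    rng = random.Random(seed)
    n = len(target)
    totals = {name: 0.0 for name in columns}
    hits = {name: 0 for name in columns}
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        y = [target[i] for i in idx]
        for name, col in columns.items():
            score = gain_ratio([col[i] for i in idx], y)
            totals[name] += score
            hits[name] += int(score >= threshold)
    return {
        name: (totals[name] / n_iter, 100.0 * hits[name] / n_iter)
        for name in columns
    }


# Toy data (NOT the CHS dataset): "risk" copies the label, "noise" is random.
toy_rng = random.Random(1)
label = [toy_rng.randrange(2) for _ in range(200)]
toy = {
    "risk": list(label),
    "noise": [toy_rng.randrange(2) for _ in range(200)],
}
result = bootstrap_igr(toy, label)
for name, (mean_igr, freq) in result.items():
    print(f"{name}: mean IGR = {mean_igr:.5f}, selected in {freq:.0f}% of resamples")
```

Fixing the seed keeps the resamples reproducible. Continuous attributes such as heart rate would need to be discretized before this gain-ratio computation applies.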