Scientific Reports

Table 7 Comparison of machine learning model performance with and without missing data handling.

From: HMLA: A hybrid machine learning approach for enhancing stroke prediction models with missing data imputation techniques

Aspect	With missing data handling	Without missing data handling
Data Size and Completeness	Preserves the dataset size by imputing missing values, so all available information is used for training	Rows containing missing values are often dropped, reducing the sample size and the information available
Bias and Variance	Lower bias, since imputed data helps keep the model parameters stable and undistorted	Excluding rows or columns may introduce significant bias and produce an unrepresentative sample
Impact on Feature Relationships	Imputation preserves inter-feature relationships, resulting in more robust and consistent models	Correlations are distorted when important features contain missing values, resulting in unreliable predictions
Algorithm Compatibility	Most machine learning methods can be applied directly to imputed input	Some approaches (e.g., linear models, neural networks) cannot handle missing values directly
Computational Efficiency	Imputation methods such as KNN and MICE can be computationally expensive, which limits scalability	Models may train faster but show less stable performance
Practical Application	Appropriate for sensitive domains (e.g., healthcare) where data integrity is essential for safety	Unsuitable for sensitive applications; biased models may produce significant errors
Model Interpretability	Models remain interpretable when accurate imputation preserves the structure of the data	Interpretability is compromised by missing context and distorted correlations
Overall Model Performance	Generally better precision, reliability and stability	Unreliable and inconsistent performance resulting from insufficient learning and bias
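
The "Data Size and Completeness" row can be illustrated with a minimal sketch that contrasts dropping incomplete rows with a simplified KNN-style imputation. This is not the article's pipeline: the toy records, the three-feature layout, the helper names `drop_incomplete` and `knn_impute`, and the choice of k = 2 are all assumptions made for illustration, and the distance is computed on unscaled features for brevity.

```python
import math

# Hypothetical toy records with three numeric features; None marks a missing value.
rows = [
    (67.0, 228.7, 36.6),
    (61.0, 202.2, None),
    (80.0, 105.9, 32.5),
    (49.0, 171.2, 34.4),
    (79.0, None, 24.0),
    (81.0, 186.2, 29.0),
]


def drop_incomplete(data):
    """Listwise deletion: keep only rows with no missing values."""
    return [r for r in data if all(v is not None for v in r)]


def knn_impute(data, k=2):
    """Simplified KNN-style imputation (illustrative, not a library API).

    Each missing value is replaced by the mean of that feature over the k
    nearest rows in which the feature is observed. Distance is the root mean
    squared difference over the features observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(data):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is None:
                # Candidate donors: every other row where feature j is observed.
                donors = sorted(
                    (distance(row, other), other[j])
                    for m, other in enumerate(data)
                    if m != i and other[j] is not None
                )[:k]
                new_row[j] = sum(v for _, v in donors) / len(donors)
        filled.append(tuple(new_row))
    return filled


complete_case = drop_incomplete(rows)
imputed = knn_impute(rows, k=2)

print("original rows:     ", len(rows))
print("after row deletion:", len(complete_case))
print("after imputation:  ", len(imputed))
```

On this toy input, deletion discards two of the six records while imputation keeps all six. The nested neighbour search also hints at the cost noted under "Computational Efficiency", since every missing value triggers a scan over the other rows.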
