Table 1 Comparison of statistics before and after data cleaning.
From: Predictive model on employee stock ownership impacting corporate performance
Variable name | Mean (raw) | Std. dev. (raw) | Mean (cleaned) | Std. dev. (cleaned) | Processing method |
---|---|---|---|---|---|
ESOP_RATIO | 0.048 | 0.250 | 0.044 | 0.230 | Winsorize (99% quantile cutoff) |
Social_Sentiment | 0.65 | 0.502 | 0.62 | 0.480 | Exclusion of extreme values (\(\lvert \text{Score} \rvert > 2\)) |
ROA | 0.024 | 0.084 | 0.023 | 0.081 | First-order difference detrending |
Debt Ratio | 0.434 | 0.203 | 0.430 | 0.198 | Deletion of samples with \(>30\%\) missing rate |
Lock Period | 24.5 | 12.3 | 24.1 | 11.8 | Linear interpolation of missing values |
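
The five processing methods listed in the table can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the winsorization is assumed to cap only the upper tail at the 99% quantile, the 30% missing-rate rule is assumed to apply per sample (row), and the first-order difference is assumed to run over consecutive observations of a single series.

```python
import math


def winsorize_upper(values, q=0.99):
    """Cap values above the q-th quantile (assumption: upper tail only)."""
    ordered = sorted(values)
    # Quantile by linear interpolation between the two nearest order statistics.
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    cutoff = ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
    return [min(v, cutoff) for v in values]


def drop_extreme(values, limit=2.0):
    """Exclude observations whose absolute value exceeds the limit (|Score| > 2)."""
    return [v for v in values if abs(v) <= limit]


def first_difference(values):
    """First-order difference: x[t] - x[t-1] for consecutive observations."""
    return [b - a for a, b in zip(values, values[1:])]


def drop_sparse_rows(rows, max_missing=0.30):
    """Delete samples (dicts) whose share of missing (None) fields exceeds max_missing."""
    return [
        r for r in rows
        if sum(v is None for v in r.values()) / len(r) <= max_missing
    ]


def interpolate_linear(values):
    """Fill interior None gaps linearly; leading or trailing gaps stay None."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


if __name__ == "__main__":
    # Illustrative toy values only; these are not data from the study.
    print(drop_extreme([0.5, -2.5, 2.0, 3.1]))
    print(interpolate_linear([12.0, None, None, 36.0]))
```

Each step removes or compresses extreme or incomplete observations, which is consistent with the table's cleaned standard deviations being slightly smaller than the raw ones.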