Table 1 Comparison of statistics before and after data cleaning.

From: Predictive model on employee stock ownership impacting corporate performance

| Variable name | Mean (raw) | Std. dev. (raw) | Mean (cleaned) | Std. dev. (cleaned) | Processing method |
|---|---|---|---|---|---|
| ESOP_RATIO | 0.048 | 0.250 | 0.044 | 0.230 | Winsorize (99% quantile cutoff) |
| Social_Sentiment | 0.65 | 0.502 | 0.62 | 0.480 | Excluding extreme values (\(\lvert\text{Score}\rvert > 2\)) |
| ROA | 0.024 | 0.084 | 0.023 | 0.081 | First-order difference detrending |
| Debt Ratio | 0.434 | 0.203 | 0.430 | 0.198 | Deletion of samples with \(>30\%\) missing rate |
| Lock Period | 24.5 | 12.3 | 24.1 | 11.8 | Linear interpolation of missing values |
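
The five processing methods named in the table can be sketched in plain Python as follows. This is a minimal illustration, not the authors' pipeline: all function names are hypothetical, the winsorization is assumed to cap only the upper tail at the 99% quantile, the extreme-value filter is applied to whatever score series is passed in (the table does not say whether "Score" is raw or standardized), and missing values are assumed to be encoded as `None`.

```python
import math


def winsorize_upper(values, q=0.99):
    """Cap values above the q-th quantile (assumption: upper tail only)."""
    ordered = sorted(values)
    # Quantile by linear interpolation between neighbouring order statistics.
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    cutoff = ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
    return [min(v, cutoff) for v in values]


def drop_extreme(scores, threshold=2.0):
    """Exclude observations whose absolute score exceeds the threshold."""
    return [s for s in scores if abs(s) <= threshold]


def first_difference(series):
    """First-order difference: x[t] - x[t-1]; the result is one element shorter."""
    return [b - a for a, b in zip(series, series[1:])]


def drop_sparse_samples(rows, max_missing=0.30):
    """Delete samples (dicts) whose share of missing fields exceeds max_missing."""
    kept = []
    for row in rows:
        missing_rate = sum(v is None for v in row.values()) / len(row)
        if missing_rate <= max_missing:
            kept.append(row)
    return kept


def interpolate_linear(series):
    """Fill interior gaps linearly; leading and trailing gaps stay None."""
    out = list(series)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out
```

For example, `interpolate_linear([24.0, None, None, 36.0])` fills the two interior gaps with 28.0 and 32.0. Each step removes or pulls in extreme or missing observations, which is consistent with the small shifts in mean and standard deviation between the raw and cleaned columns.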