Table 4 Impact of synthetic data generation (top models with selected sample sizes and SOC ranges) on the skewness and kurtosis of the training data.
From: Enhancing soil organic carbon estimation with generative AI and Nix color sensor
| Dataset | Skewness | Kurtosis |
|---|---|---|
| Training data | 2.80 | 8.85 |
| Training + GMM (5000 samples, 3–7%) | −0.61 | 0.75 |
| Training + KNN (4000 samples, 3–9%) | −0.38 | −0.40 |
| Training + bootstrap (1000 samples, 3–4%) | 0.76 | 6.59 |
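
For readers who want to compute comparable statistics on their own data, the following is a minimal sketch of moment-based skewness and kurtosis. The table does not state which estimator was used. The negative kurtosis value suggests excess kurtosis (normal distribution = 0), since ordinary kurtosis cannot fall below 1, so that convention is assumed here. The function names and the simulated SOC values are hypothetical and are not taken from the study.

```python
import random
from statistics import fmean


def _central_moment(values, order):
    """Population central moment of the given order."""
    mean = fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Biased (population) skewness: m3 / m2^1.5."""
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Biased (population) excess kurtosis: m4 / m2^2 - 3."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3.0


# Hypothetical data: a right-skewed "training" SOC distribution (in %),
# plus synthetic samples confined to a chosen SOC range (here 3-7%).
rng = random.Random(0)
training = [rng.lognormvariate(0.3, 0.6) for _ in range(500)]
synthetic = [rng.uniform(3.0, 7.0) for _ in range(500)]
augmented = training + synthetic

for name, data in [("Training", training), ("Training + synthetic", augmented)]:
    print(f"{name}: skewness={skewness(data):.2f}, "
          f"excess kurtosis={excess_kurtosis(data):.2f}")
```

Adding samples in the sparsely populated upper SOC range fills in the right tail, which is why the augmented set in this sketch is less skewed than the simulated training set. Bias-corrected estimators would give slightly different values for small samples.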