Fig. 2

Overview of the database management and preprocessing workflow. (a) Heatmap showing the presence and distribution of zero values across the dataset, used to assess data sparsity and potential quality issues. (b) Bar plots of soil variable values before data cleaning, revealing outliers and irregularities. (c) Boxplot of the cleaned dataset, showing improved consistency and reduced variability after preprocessing. (d) Histogram of reconstruction errors, used to assess the accuracy and reliability of the data imputation and reconstruction step.