Fig. 4: Interpreting the ML models.

a Pearson correlation coefficients among the ten features that survived feature reduction. Color encodes the coefficient value, and circle radius encodes its absolute value. Vis, ais, and dis denote the volume, area, and distance interstices, respectively, and the symbol preceding the brackets (e.g., Std, Mean, or Min) denotes the statistic of these interstices taken over the nearest-neighbor (SRO) environment. MROstat in a feature name indicates that the SRO feature has been coarse-grained over neighbors by taking the statistic given by the subscript of MRO. The subscript “dist” in dis-dist indicates that neighbors are determined by a cutoff distance rather than by the default Voronoi tessellation. b Feature importance of the ML models trained in the H-Eact and L-Eact tasks. The feature importance is averaged over the models obtained from three rounds of data undersampling, with five-fold cross-validation on each undersampled data set.