Fig. 3: Heterogeneity metric between datasets: M\T, M ∩ T and T\M. | npj Computational Materials

Fig. 3: Heterogeneity metric between datasets: M\T, M ∩ T and T\M.

From: Machine learning on multiple topological materials datasets


The heterogeneity from dataset A (y-axis) to dataset B (x-axis) was computed as the average distance from each point in A to its five nearest neighbors in B, in a feature space defined by the top 47 Matminer features (44 continuous, 3 discrete). These features were selected based on their importance in training XGBoost models. Because the distances are measured from A to B, the metric is directional: the value from A to B need not equal the value from B to A. Larger distances indicate higher dissimilarity, revealing the compositional differences between the datasets.
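
The metric described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: the caption does not state the distance measure or any feature scaling, so Euclidean distance on raw feature vectors is assumed here, and the function name `heterogeneity` and the toy 2-D points are hypothetical stand-ins for the 47-dimensional Matminer feature vectors.

```python
import math


def heterogeneity(a_points, b_points, k=5):
    """Average distance from each point in A to its k nearest neighbors in B.

    Assumption: Euclidean distance on unscaled features (not confirmed by the source).
    """
    if not a_points:
        raise ValueError("A must contain at least one point")
    if len(b_points) < k:
        raise ValueError("B must contain at least k points")
    total = 0.0
    for p in a_points:
        # Distances from p to every point in B, smallest first.
        dists = sorted(math.dist(p, q) for q in b_points)
        # Mean distance to the k closest points in B.
        total += sum(dists[:k]) / k
    # Average the per-point values over all of A.
    return total / len(a_points)


# Toy 2-D data standing in for real feature vectors (hypothetical values).
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 1.0), (0.0, 2.0), (5.0, 5.0)]

print("A -> B:", heterogeneity(A, B, k=2))
print("B -> A:", heterogeneity(B, A, k=2))
```

In this toy case the two directions give different values, which is why the figure distinguishes the y-axis dataset (source of the points) from the x-axis dataset (source of the neighbors). The brute-force search is quadratic in dataset size; for real datasets a spatial index or a nearest-neighbor library would likely be preferable.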
