Fig. 4: Information theoretical quantities correlate with error and chemical trends in the TM23 dataset. | Nature Communications

From: Model-free estimation of completeness, uncertainties, and outliers in atomistic machine learning using information theory

a Information entropy of the full TM23 training set for each element (ref. 47). b Force errors (in %) for NequIP models (ref. 21) trained on the full training set, obtained from Owen et al. (ref. 47). c These two quantities are strongly correlated, with a Pearson correlation coefficient of ρ = 0.79 for transition metals with an incomplete d-shell. d The difference between the final and initial force errors (in %), denoted ΔError, is explained by the dataset overlap, obtained by computing the differential entropies \(\delta \mathcal{H}(0.75\,T_m \mid 0.25\,T_m)\) or \(\delta \mathcal{H}(1.25\,T_m \mid 0.25\,T_m)\), as shown by the Pearson correlation coefficient of ρ = −0.85. Red and blue dots indicate error differences for models trained on the “cold” subset of TM23 (sampled at 0.25 \(T_m\), where \(T_m\) is the melting temperature) and tested on the “warm” (0.75 \(T_m\)) and “melt” (1.25 \(T_m\)) subsets, respectively.
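The Pearson correlation coefficient ρ quoted in panels c and d can be computed as in the minimal sketch below. The function name `pearson_r` and the two input arrays are assumptions made for illustration only: the numbers are placeholders, not values from TM23 or from Owen et al., and this is not the authors' analysis code.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length 1D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must be 1D arrays of equal length >= 2")
    # Center both samples on their means.
    dx = x - x.mean()
    dy = y - y.mean()
    # Covariance divided by the product of standard deviations;
    # the 1/n normalization factors cancel, so plain sums suffice.
    denom = np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return float((dx * dy).sum() / denom)


# Placeholder per-element values (hypothetical, NOT data from the paper):
# one entropy and one force error (in %) per element.
entropy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
force_error = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

rho = pearson_r(entropy, force_error)
print(f"rho = {rho:.2f}")
```

With real data, the inputs would hold one entry per element (panel c) or one entry per train/test pair (panel d). A value of ρ near +1 or −1 indicates a strong linear relationship, and the sign gives its direction.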
