Table 1 Benchmarking six different embedding models on six regression property prediction tasks with MAEs and R² scores
From: Leveraging language representation for materials exploration and discovery
| Property | Metric | Composition: Mat2Vec (Baseline) | Composition: MatSciBERT | Composition: MatBERT | Structure: Fingerprint (Baseline) | Structure: MatSciBERT | Structure: MatBERT |
|---|---|---|---|---|---|---|---|
| E/atom | MAE | 0.47 ± 0.02 | 0.42 ± 0.01 | 0.37 ± 0.01 | 1.13 ± 0.02 | 0.32 ± 0.02 | 0.29 ± 0.03 |
|  | R² | 0.81 ± 0.02 | 0.86 ± 0.01 | 0.88 ± 0.01 | 0.283 ± 0.02 | 0.95 ± 0.01 | 0.96 ± 0.01 |
| Eg | MAE | 0.15 ± 0.01 | 0.20 ± 0.02 | 0.19 ± 0.01 | 0.54 ± 0.03 | 0.25 ± 0.01 | 0.23 ± 0.01 |
|  | R² | 0.92 ± 0.02 | 0.88 ± 0.02 | 0.88 ± 0.01 | 0.45 ± 0.04 | 0.88 ± 0.01 | 0.89 ± 0.01 |
| log_K | MAE | 0.18 ± 0.01 | 0.18 ± 0.01 | 0.17 ± 0.01 | 0.45 ± 0.01 | 0.16 ± 0.01 | 0.15 ± 0.01 |
|  | R² | 0.83 ± 0.01 | 0.83 ± 0.03 | 0.85 ± 0.02 | 0.26 ± 0.02 | 0.90 ± 0.01 | 0.93 ± 0.01 |
| log_G | MAE | 0.20 ± 0.01 | 0.23 ± 0.01 | 0.22 ± 0.01 | 0.48 ± 0.01 | 0.24 ± 0.01 | 0.23 ± 0.01 |
|  | R² | 0.82 ± 0.01 | 0.80 ± 0.01 | 0.81 ± 0.02 | 0.29 ± 0.03 | 0.83 ± 0.01 | 0.84 ± 0.01 |
| log10_θ | MAE | 0.06 ± 0.01 | 0.07 ± 0.01 | 0.06 ± 0.01 | 0.13 ± 0.01 | 0.07 ± 0.01 | 0.06 ± 0.01 |
|  | R² | 0.81 ± 0.02 | 0.82 ± 0.03 | 0.84 ± 0.02 | 0.34 ± 0.05 | 0.85 ± 0.03 | 0.88 ± 0.02 |
| log10_α | MAE | 0.07 ± 0.01 | 0.07 ± 0.01 | 0.07 ± 0.01 | 0.15 ± 0.01 | 0.07 ± 0.01 | 0.06 ± 0.01 |
|  | R² | 0.78 ± 0.03 | 0.81 ± 0.02 | 0.81 ± 0.02 | 0.19 ± 0.02 | 0.87 ± 0.03 | 0.90 ± 0.01 |
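
For readers unfamiliar with the two metrics reported in Table 1, the following minimal sketch shows how a "mean ± spread" entry for MAE and the coefficient of determination could be computed from repeated runs. It is an illustration only: the helper names (`mae`, `r2_score`, `summarize`), the use of the sample standard deviation for the ± term, and the example numbers are all assumptions, not details taken from the source.

```python
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error: the average of |y - y_hat| over all samples."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Format per-run scores as 'mean ± std' (sample standard deviation, an assumption)."""
    return f"{mean(scores):.2f} ± {stdev(scores):.2f}"


# Hypothetical (targets, predictions) pairs from three repeated runs; made-up numbers.
runs = [
    ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]),
    ([1.0, 2.0, 3.0, 4.0], [0.8, 2.1, 3.1, 4.3]),
    ([1.0, 2.0, 3.0, 4.0], [1.2, 2.2, 2.7, 4.1]),
]

mae_scores = [mae(t, p) for t, p in runs]
r2_scores = [r2_score(t, p) for t, p in runs]

print("MAE:", summarize(mae_scores))
print("R2: ", summarize(r2_scores))
```

Lower MAE and higher R² both indicate a better fit. A model that always predicts the mean of the targets scores R² = 0, and a perfect model scores R² = 1.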