Table 1 Benchmarking six embedding models on six regression property-prediction tasks, reported as MAE and R² scores

From: Leveraging language representation for materials exploration and discovery

| Property | Metric | Composition embedding: Mat2Vec (Baseline) | Composition embedding: MatSciBERT | Composition embedding: MatBERT | Structure embedding: Fingerprint (Baseline) | Structure embedding: MatSciBERT | Structure embedding: MatBERT |
| --- | --- | --- | --- | --- | --- | --- | --- |
| E/atom | MAE | 0.47 ± 0.02 | 0.42 ± 0.01 | 0.37 ± 0.01 | 1.13 ± 0.02 | 0.32 ± 0.02 | 0.29 ± 0.03 |
| | R² | 0.81 ± 0.02 | 0.86 ± 0.01 | 0.88 ± 0.01 | 0.283 ± 0.02 | 0.95 ± 0.01 | 0.96 ± 0.01 |
| Eg | MAE | 0.15 ± 0.01 | 0.20 ± 0.02 | 0.19 ± 0.01 | 0.54 ± 0.03 | 0.25 ± 0.01 | 0.23 ± 0.01 |
| | R² | 0.92 ± 0.02 | 0.88 ± 0.02 | 0.88 ± 0.01 | 0.45 ± 0.04 | 0.88 ± 0.01 | 0.89 ± 0.01 |
| log_K | MAE | 0.18 ± 0.01 | 0.18 ± 0.01 | 0.17 ± 0.01 | 0.45 ± 0.01 | 0.16 ± 0.01 | 0.15 ± 0.01 |
| | R² | 0.83 ± 0.01 | 0.83 ± 0.03 | 0.85 ± 0.02 | 0.26 ± 0.02 | 0.90 ± 0.01 | 0.93 ± 0.01 |
| log_G | MAE | 0.20 ± 0.01 | 0.23 ± 0.01 | 0.22 ± 0.01 | 0.48 ± 0.01 | 0.24 ± 0.01 | 0.23 ± 0.01 |
| | R² | 0.82 ± 0.01 | 0.80 ± 0.01 | 0.81 ± 0.02 | 0.29 ± 0.03 | 0.83 ± 0.01 | 0.84 ± 0.01 |
| log10_θ | MAE | 0.06 ± 0.01 | 0.07 ± 0.01 | 0.06 ± 0.01 | 0.13 ± 0.01 | 0.07 ± 0.01 | 0.06 ± 0.01 |
| | R² | 0.81 ± 0.02 | 0.82 ± 0.03 | 0.84 ± 0.02 | 0.34 ± 0.05 | 0.85 ± 0.03 | 0.88 ± 0.02 |
| log10_α | MAE | 0.07 ± 0.01 | 0.07 ± 0.01 | 0.07 ± 0.01 | 0.15 ± 0.01 | 0.07 ± 0.01 | 0.06 ± 0.01 |
| | R² | 0.78 ± 0.03 | 0.81 ± 0.02 | 0.81 ± 0.02 | 0.19 ± 0.02 | 0.87 ± 0.03 | 0.90 ± 0.01 |

E/atom: energy per atom (eV); Eg: band gap (eV); K: bulk modulus (GPa); G: shear modulus (GPa); θ: Debye temperature (K); α: coefficient of thermal expansion (K⁻¹).
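
For readers who want to reproduce entries of this kind, the following is a minimal sketch of how MAE and R² are conventionally computed and how a "mean ± spread" cell can be formatted. The table does not state how the ± values were obtained; the sketch assumes they are sample standard deviations over repeated runs or folds. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Format per-run scores as 'mean ± std' (assumed convention, see lead-in)."""
    return f"{mean(scores):.2f} ± {stdev(scores):.2f}"


# Toy data, purely illustrative (not from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

example_mae = mae(y_true, y_pred)
example_r2 = r2_score(y_true, y_pred)

# Hypothetical MAE values from three repeated runs of one model.
example_cell = summarize([0.30, 0.32, 0.34])

print(example_mae, example_r2, example_cell)
```

Lower MAE and higher R² both indicate a better fit, which is why the two rows for each property can rank the models slightly differently.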