Table 1 Twelve case studies.

From: AlphaMat: a material informatics hub connecting data, features, models and applications

| Material property | Task | Data scale | AlphaMat performance | Ref |
| --- | --- | --- | --- | --- |
| Ef (eV/atom) | R1 | 3483 Cal. points | Pc = 0.877, MAE = 0.221, RMSE = 0.341, RMSE/S.D. = 0.482 | 16 |
| Eg (eV) | R2 | 3895 Exp. points | Pc = 0.933, MAE = 0.347, RMSE = 0.522, RMSE/S.D. = 0.360 | 68 |
| BASR (cm−1) | R3 | 1515 Cal. points | Pc = 0.878, MAE = 2.278, RMSE = 4.381, RMSE/S.D. = 0.482 | 69 |
| εpoly | R4 | 1028 Cal. points | Pc = 0.675, MAE = 3.265, RMSE = 4.745, RMSE/S.D. = 0.765 | 70 |
| K (GPa) | R5 | 373 Cal. points | Pc = 0.895, MAE = 15.853, RMSE = 21.248, RMSE/S.D. = 0.433 | 16 |
| Ea (eV) | R6 | 1109 Cal. points | Pc = 0.803, MAE = 0.402, RMSE = 0.604, RMSE/S.D. = 0.596 | 71 |
| κ (W m−1 K−1) | R7 | 128 Exp. points | Pc = 0.829, MAE = 0.293, RMSE = 0.465, RMSE/S.D. = 0.578 | 72 |
| SHG responses (pm V−1) | R8 | 291 Cal. points | Pc = 0.854, MAE = 0.963, RMSE = 1.524, RMSE/S.D. = 0.550 | 73 |
| Metal/Semiconductor | C1 | 6353 Exp. points | Precision = 0.93, recall = 0.93, F1-score = 0.93, AUC = 0.93 | 68 |
| Ferro/Non-ferro | C2 | 1028 Cal. points | Precision = 0.87, recall = 0.87, F1-score = 0.87, AUC = 0.87 | 70 |
| Strong/Weak ΔE | C3 | 65 Cal. points | Precision = 0.85, recall = 0.85, F1-score = 0.85, AUC = 0.85 | 66 |
| FM/AFM | C4 | 220 Cal. points | Precision = 0.82, recall = 0.80, F1-score = 0.80, AUC = 0.80 | 18 |

  1. The material features used as ML inputs were calculated using component descriptors, and both the classification and regression models were XGBoost models. AlphaMat performance is reported on the testing data. These case studies are meant to give users high-level guidance on making full use of the available materials database.
  2. R: Regression tasks.
  3. C: Classification tasks.
  4. Cal: Calculated data.
  5. Exp: Experimental data.
  6. Pc: Pearson correlation coefficient.
  7. MAE: Mean absolute error.
  8. RMSE: Root mean squared error.
  9. S.D.: Standard deviation of the true values in the testing set.
  10. AUC: Area under the receiver operating characteristic (ROC) curve.
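
The regression metrics defined in the notes above can be illustrated with a short sketch. It is not taken from AlphaMat: the function name `regression_metrics` and the toy arrays are hypothetical, and the use of the population standard deviation (`ddof=0`) for S.D. is an assumption, since the table does not say which convention was used.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return Pc, MAE, RMSE and RMSE/S.D. for a set of test predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))               # mean absolute error
    rmse = float(np.sqrt(np.mean(err ** 2)))        # root mean squared error
    pc = float(np.corrcoef(y_true, y_pred)[0, 1])   # Pearson correlation coefficient
    sd = float(np.std(y_true))                      # S.D. of true test values (ddof=0 is an assumption)
    return {"Pc": pc, "MAE": mae, "RMSE": rmse, "RMSE/S.D.": rmse / sd}

# Toy data for illustration only; these are not values from the case studies.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

An RMSE/S.D. ratio well below 1 indicates that the model's error is small relative to the spread of the target values, which makes the ratio comparable across properties with different units.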