Fig. 4: Evaluation of Curie temperature prediction models on original and stratified-balanced datasets.
From: The northeast materials database for magnetic materials

a–c Predicted versus actual Curie temperatures for the test set of the original dataset using three models: Random Forest (a), Ensemble Neural Network (b), and eXtreme Gradient Boosting (XGBoost) (c). d–f Corresponding predictions on a balanced dataset created by stratified undersampling. To construct this dataset, the Curie temperature range was divided into bins, and samples from the overrepresented low-temperature region were undersampled to give a more uniform temperature distribution. Each plot reports the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE), and includes confidence intervals based on the ensemble standard deviation. g–i Absolute error distributions and fitted exponential curves for the balanced dataset for the three models: Random Forest (g), Ensemble Neural Network (h), and XGBoost (i). j–l Feature importance plots showing the 20 most influential features for models trained on the balanced dataset: Random Forest (j), Ensemble Neural Network (k), and XGBoost (l). All models were trained on features derived from the chemical composition of materials in NEMAD. The figure provides a comparative assessment of model performance, uncertainty, and feature relevance under both original and balanced training conditions.
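
The binning-and-undersampling step and the three reported metrics can be illustrated with a minimal sketch. This is not the authors' code: the function names `stratified_undersample` and `regression_metrics`, the use of equal-width bins, the default of 10 bins, and the rule of capping each bin at the median non-empty bin size are all assumptions made for illustration, since the caption does not specify them.

```python
import numpy as np

def stratified_undersample(temps, n_bins=10, cap=None, seed=0):
    """Return sorted indices of a subsample that is more uniform over temperature bins.

    Hypothetical helper: bin count and cap rule are illustrative assumptions.
    """
    temps = np.asarray(temps, dtype=float)
    rng = np.random.default_rng(seed)
    # Equal-width bins spanning the observed Curie temperature range.
    edges = np.linspace(temps.min(), temps.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin ids 0 .. n_bins - 1.
    bin_ids = np.digitize(temps, edges[1:-1])
    counts = np.bincount(bin_ids, minlength=n_bins)
    if cap is None:
        # Assumed rule: cap every bin at the median size of the non-empty bins.
        cap = int(np.median(counts[counts > 0]))
    keep = []
    for b in range(n_bins):
        idx = np.flatnonzero(bin_ids == b)
        if idx.size > cap:
            # Randomly drop samples from overrepresented bins only.
            idx = rng.choice(idx, size=cap, replace=False)
        keep.append(idx)
    return np.sort(np.concatenate(keep))

def regression_metrics(y_true, y_pred):
    """Compute R^2, MAE, and RMSE from their standard definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": np.mean(np.abs(err)),
        "rmse": np.sqrt(np.mean(err ** 2)),
    }
```

Bins that already hold fewer samples than the cap are left untouched, so only the overrepresented region shrinks. The resulting indices would be applied to both the feature matrix and the target vector before training.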