Introduction

In recent years, density functional theory (DFT) and molecular dynamics (MD) simulations have been studied and applied extensively in materials multiscale modeling1. For example, these simulations have enabled the calculation of energies and forces of materials across different scales2,3,4. Currently widely used simulation methods, including Kohn-Sham density functional theory (KSDFT)5,6 and MD simulations with classical interatomic potentials7,8,9,10,11,12,13,14,15,16, have demonstrated high performance in predicting the formation energy and elastic modulus of materials. However, both methods have their own limitations. KSDFT is computationally demanding and typically restricted to systems containing only a few hundred atoms, while MD can be applied to larger systems but is limited in accuracy due to the empirical nature of interatomic potentials.

To address the limitations of KSDFT and MD, machine learning (ML) models17,18 such as the neural network potential (NNP)19,20, Gaussian approximation potential21, spectral neighbor analysis potential22,23, and moment tensor potential24 have been proposed to accurately predict the energies and forces of crystals and molecules. These models use atomic species and nuclear coordinates to build descriptors (also called “fingerprints”) that are invariant under permutations of atoms of the same element and under rigid rotations, and these descriptors serve as features to be fitted by a chosen regression model19,25. However, such descriptors must be designed meticulously to satisfy these invariance constraints, and the complex transformations involved make the models difficult to interpret26,27.

To obtain more general descriptors, graph networks, which represent atoms and bonds as nodes and edges, respectively, combined with convolutional neural networks have received significant attention, since convolutional neural networks can automatically discover the important features, in contrast to descriptor-based models28. Several graph convolutional neural networks, such as the generalized crystal graph convolutional neural network (CGCNN)26, SchNet29, MEGNet30, and the atomistic line graph neural network (ALIGNN)31, have been proposed. They are straightforward to adopt and suitable for both crystals and molecules. However, these models have complex architectures comprising a series of operators and hidden layers, and their fitting process is time-consuming because of the large amount of training data required and the large number of parameters to be fitted in the neural network32.

Compared to graph-network-based potentials, symbolic regression is a faster method to build interatomic potentials; it uses genetic programming to find a function that accurately expresses the interatomic potential from a set of variables and mathematical operators33,34,35,36. However, symbolic regression also has limitations: the expressions in the hypothesis space must be simple and have a significant effect on the potential energy, and the model cannot learn complex terms that involve bond angles.

Moreover, the transferability of general ML potentials, which describes the ability of a model to correctly predict the properties of atomic configurations lying outside the training dataset, is limited. Consequently, physically informed neural networks (PINN) have been proposed to improve transferability to unknown structures37,38,39 by combining a general physics-based interatomic potential with neural-network regression. PINN achieves this by optimizing a set of physically meaningful parameters of a physics-based interatomic potential via a trained neural network and feeding them back to improve the accuracy of the original physics-based potential. However, this method encounters an obstacle similar to that of the graph networks mentioned above: a time-consuming fitting process resulting from the large amount of data and the numerous parameters of the neural network.

As molecular-dynamics simulation databases gradually improve, such as the force-field database of NIST JARVIS, which contains properties like formation energy and elastic constants calculated with different classical potentials, these databases become potential inputs for machine-learning models40,41. In this study, we present a regression-trees-based ensemble learning approach that efficiently predicts the formation energy and elastic constants of carbon allotropes from a small dataset calculated with classical potentials. We use carbon allotropes as an example to evaluate the performance of our model because carbon is one of the fundamental elements on Earth42, and carbon allotropes exhibit a variety of physical properties and are widely applied in cutting and polishing tools43, superlubricity44, solar thermal energy storage45, etc. Therefore, understanding the physical properties of carbon allotropes plays a significant role in both scientific research and engineering applications. We begin by extracting the structures of carbon allotropes from the Materials Project (MP)46 and compute their formation energy and elastic constants using MD simulations with nine different classical interatomic potentials via the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)47. We then use these computed properties as features and the corresponding DFT references as targets to train and test four different ensemble learning models48,49,50,51 consisting of regression trees52. In general, the ensemble learning models perform better than the nine classical potentials, and, based on feature importance, ensemble learning can identify the most accurate features and use them to improve the precision of the predictions.

Results

Ensemble learning framework for properties prediction of carbon materials

Figure 1 illustrates the schematic of the ensemble learning framework. First, carbon structures are extracted from the MP database. Then, the formation energy and elastic constants of each structure are calculated by MD with nine classical interatomic potentials, including the analytic bond-order potential (ABOP)53, the adaptive intermolecular reactive empirical bond order potential (AIREBO)9, the standard Lennard-Jones potential (LJ)13, the AIREBO-M potential10, which replaces the LJ term in AIREBO with a Morse potential54, the environment-dependent interaction potential (EDIP)53, the long-range carbon bond order potential (LCBOP)12, the modified embedded atom method (MEAM)14, the reactive force field potential (ReaxFF)15, and the Tersoff potential55. The training dataset is composed of these properties and the corresponding DFT references collected from the MP database, encoded into feature vectors xi and target vectors yi, respectively. For the formation energy, 58 carbon structures and their DFT references are extracted from the MP database; for the elastic constants, DFT references for 20 of the 58 carbon structures are used, owing to the absence of DFT references for the others and the removal of unstable or erroneous calculations56. Next, the regression-trees-based ensemble models are trained on these vectors. We select regression-trees-based ensemble learning models for the following reasons. First, compared to neural networks, regression trees are white-box models, which makes the models and their outputs easy to understand and interpret. Second, as non-linear models, regression trees perform better than classical linear regression and neural network methods when dealing with small datasets and highly non-linear features. Third, to mitigate the locally optimal decisions of regression trees, ensemble learning, which combines the predictions of several regression trees, improves robustness over a single regression tree. Last, for multi-target problems such as the prediction of elastic constants, ensemble learning can learn the correlations between elastic constants and output multiple targets at once. Here, different regression-trees-based ensemble-learning methods implemented in the Scikit-Learn package57, including bootstrap aggregation (bagging)58 and boosting49, are used to build simple, fast, and interpretable models. Details of the architectures and methods of regression trees and ensemble learning are given in the “Methodology” section. For new carbon structures, we calculate the same properties by MD with the nine potentials and feed these calculated properties into the trained model with the smallest mean absolute error (MAE) during testing, defined in Eq. 1, to predict the properties of the new structures.

$${MAE}=\frac{{\sum }_{i=1}^{n}|{y}_{i}^{{pre}}-{y}_{i}|}{n}$$
(1)

where \({y}_{i}^{{pre}}\) is the prediction of the model, \({y}_{i}\) is the reference value, and n is the number of samples.
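As a minimal sketch of this selection step, the snippet below trains several scikit-learn tree ensembles and keeps the one with the smallest test MAE; the arrays, the train/test split, and the model settings are illustrative placeholders, not the paper's actual data or protocol.

```python
import numpy as np
from sklearn.ensemble import (RandomForestRegressor, AdaBoostRegressor,
                              GradientBoostingRegressor)
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(58, 9))                              # placeholder: 58 structures x 9 potentials
y = X @ rng.normal(size=9) + 0.1 * rng.normal(size=58)    # placeholder DFT references

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

candidates = {
    "RF": RandomForestRegressor(random_state=0),
    "AB": AdaBoostRegressor(random_state=0),
    "GB": GradientBoostingRegressor(random_state=0),
}
maes = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    maes[name] = mean_absolute_error(y_te, model.predict(X_te))   # Eq. 1

best_model = candidates[min(maes, key=maes.get)]   # model used to predict new structures
```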

Fig. 1: Ensemble learning framework for properties prediction of carbon materials.
figure 1

The properties of carbon structures are calculated by MD, and these calculated values together with the DFT references are used as input to train the ensemble learning model. To obtain the same properties of new carbon structures, the properties of the new structures are first calculated by MD, and these values are then fed into the best trained model to obtain the final predictions.

For the formation energy of carbon materials, we employ four different ensemble-learning methods, namely RandomForest (RF), AdaBoost (AB), GradientBoosting (GB), and XGBoost (XGB), and evaluate their performance. Grid search combined with 10-fold cross-validation is applied to optimize the hyperparameters. After tuning, we run 10-fold cross-validation twenty times for each method with the optimized hyperparameters and calculate the MAEs relative to the DFT references. Furthermore, the median absolute deviation (MAD) of each method is calculated. MAD is defined as the median of the absolute deviations of the residuals from their median, as shown in Eq. 2 and Eq. 3.

$${MAD}={median}(\left|{r}_{i}-\widetilde{r}\right|)$$
(2)

where \({r}_{i}\) is the residual between the ith prediction and its corresponding target, and \(\widetilde{r}\) is the median of the residuals.

$$\widetilde{r}={median}(r)$$
(3)

MAD characterizes the dispersion of the residuals. It is more robust than MAE because it is insensitive to outliers. The MAE and MAD for each method are depicted in Fig. 2. Here, a voting regressor (VR) that combines the RF, AB, and GB models is utilized to mitigate the overall error by averaging their predictions. Besides, a Gaussian process (GP)59, a generic supervised learning method for regression problems, is also evaluated. Overall, the ensemble-learning models perform better than the classical interatomic potentials and the GP model. Notably, all their MAEs are lower than that of the most accurate classical potential, LCBOP. Since the formation energy values calculated by the different classical potentials have highly non-linear and complex relationships, regression trees perform better than classical regression methods such as GP under these conditions. In the inset of Fig. 2, the formation energies of various structures predicted by RF and LCBOP, together with the DFT references, are illustrated. It can be observed that RF outperforms LCBOP in terms of overall error. However, for the structures with the highest formation energy, RF's predictions are less accurate than those of LCBOP, possibly because of its inherently weak extrapolation, which causes the underestimated formation energy of mp-998866. It is worth noting that RF performs weakly for mp-1008395, mp-570002, mp-624889, and mp-1018088. The reason is the deviation of the features from DFT for each structure. For mp-1018088, all features except the one calculated by the LJ potential are smaller than the reference, leading to an underestimated prediction. The values of mp-624889 have a distribution similar to that of mp-1018088, but the deviation is smaller (−0.33 eV/atom for mp-624889 versus −0.55 eV/atom for mp-1018088), which makes the prediction more accurate. Conversely, since the values of mp-1008395 and mp-570002 calculated by most classical potentials are larger than the reference, both predictions are overestimated. Nevertheless, RF provides more accurate predictions in general.
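The tuning and repeated cross-validation protocol described earlier in this subsection, together with the MAE and MAD metrics, can be sketched as follows; the hyperparameter grid, the RF-only example, and the placeholder arrays are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(58, 9)); y = X @ rng.normal(size=9)   # placeholders for features/references

# Hyperparameter tuning via grid search with 10-fold cross-validation (grid is illustrative)
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 5, 10]},
    scoring="neg_mean_absolute_error",
    cv=10,
)
search.fit(X, y)

# Twenty repetitions of 10-fold cross-validation with the tuned model
run_maes = []
for seed in range(20):
    cv = KFold(n_splits=10, shuffle=True, random_state=seed)
    scores = cross_val_score(search.best_estimator_, X, y,
                             scoring="neg_mean_absolute_error", cv=cv)
    run_maes.append(-scores.mean())

# MAD of the cross-validated residuals, Eqs. (2)-(3)
res = cross_val_predict(search.best_estimator_, X, y, cv=10) - y
mad = np.median(np.abs(res - np.median(res)))
print(np.mean(run_maes), mad)
```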

Fig. 2: MAEs and MADs of the formation energy relative to DFT reference under different methods.
figure 2

Overall, the ensemble-learning models perform better than the classical interatomic potentials and the GP model, and RF performs best among them. The inset shows the formation energy of carbon structures predicted by LCBOP and RF; the black circles represent the DFT references. RF performs better than LCBOP overall, but for the structures with high formation energy, RF is less accurate than LCBOP because of its inherently weak extrapolation.

For the elastic constants of carbon materials, we also train and test the RF, AB, GB, and XGB models using elastic constants calculated by the same nine classical potentials. Grid search combined with 5-fold cross-validation is applied to tune the hyperparameters of each model, and 10-fold cross-validation is then conducted on the models with optimized hyperparameters to evaluate their performance. The prediction of elastic constants is a multi-target problem, but AB, GB, and XGB do not support multi-target regression. To overcome this limitation, we use a multi-target regressor57 combined with the four ensemble methods to predict the elastic constants. In brief, the multi-target regressor fits one regressor per elastic constant; it is a simple strategy to extend regressors that do not support multi-target problems. Here, we use the Tersoff potential as a benchmark for comparison with the different ensemble methods, since the MAE of the elastic constants of the Tersoff potential is at least an order of magnitude smaller than those of the other classical potentials. Fig. 3a illustrates the MAEs of the total elastic constants of the four ensemble methods obtained from twenty repetitions of 10-fold cross-validation. The MAEs of AB, RF, XGB, and GB are much smaller than that of Tersoff. It is worth noting that the MAE of Tersoff is significantly increased by one structure (mp-1095534) that includes both sp and sp3 hybridization of carbon, which makes it more complex than structures such as diamond or graphite that contain only a single hybridization. This complexity makes it difficult for Tersoff to produce accurate values. If the error associated with mp-1095534 is removed, the MAE of Tersoff drops to 63 GPa, which is still larger than those of RF, AB, GB, and XGB. Notably, different potentials behave differently for different carbon structures. Hence, to obtain the minimal MAE achievable with the classical potentials, we extract, for each structure, the smallest error among the nine classical potentials with respect to the DFT reference and then compute the total MAE of these smallest errors. As shown in Fig. 3a, Min represents this best achievable performance of the classical potentials. We can see that AB has a smaller MAE than Min, and XGB performs similarly to Min. Fig. 3b shows the elastic constants calculated by Tersoff and predicted by AB and RF using 10-fold cross-validation. The black dashed line is the ideal fit (1:1). To fit the plot, fifteen Tersoff points with excessively large errors are removed. Both AB and RF have lower MAEs than Tersoff, as shown in Fig. 3a. AB has a lower MAE than RF, possibly because the elastic constants of similar structures are correlated with each other and a sequential process like AB can reduce the bias. The points in the green circles in Fig. 3b have large errors; all of them come from C11, C22, or C33 of complicated structures not represented in the training sets. These structures have smaller values of C11, C22, or C33 than most of the training data, so the models do not have adequately fitted regressors for them, resulting in inaccurate predictions. To further assess the performance of the ensemble methods compared to Tersoff, the MAEs and MADs of partial elastic constants for AB and Tersoff are listed in Table 1. We exclude the MAEs and MADs of the remaining elastic constants because their errors are negligible. In general, all nine smallest MAEs and MADs are obtained from AB.
For Tersoff with mp-1095534, the MAEs and MADs are higher than the others. Although Tersoff without mp-1095534 yields smaller MAEs and MADs than with mp-1095534, the MAEs and MADs of AB are still smaller, some even more than 50% lower than those of Tersoff without mp-1095534. These results demonstrate that ensemble learning performs better than Tersoff.
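A minimal sketch of the multi-target strategy is shown below, assuming the features and targets have already been flattened into arrays of shape (n_structures, 9 × 21) and (n_structures, 21); the shapes, data, and names are illustrative placeholders.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X_ec = rng.normal(size=(20, 9 * 21))   # placeholder: 21 elastic constants from each of 9 potentials
Y_ec = rng.normal(size=(20, 21))       # placeholder: 21 DFT elastic constants per structure

# AB (like GB and XGB) is single-output, so one AdaBoost regressor is fitted per elastic constant
multi_ab = MultiOutputRegressor(AdaBoostRegressor(random_state=0)).fit(X_ec, Y_ec)
C_pred = multi_ab.predict(X_ec[:2])    # (2, 21) array of predicted elastic constants
```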

Fig. 3: Performance of different classical potentials and ML methods in elastic constants prediction.
figure 3

a MAEs of the total elastic constants relative to the DFT references under different methods; Min represents the best performance achievable for all structures using the nine classical potentials. AB and XGB perform better than or similarly to Min, and all ensemble models perform better than Tersoff. b Predicted elastic constants under Tersoff, RF, and AB versus the DFT references; both the RF and AB models have lower MAEs than Tersoff. The points in the green circles have large errors; all of them come from complicated structures that lie outside the training sets of both ensemble models.

Table 1 MAEs and MADs of the partial elastic constants relative to DFT reference under AB and Tersoff

Besides, we combine the formation energy and elastic constant data to train and test the same four ensemble methods. The MAEs for both properties are shown in Fig. 4. All models perform worse than before, with some MAEs even three times larger than when only one property is predicted. Even so, the MAEs of the formation energy from the RF, AB, and GB models are lower than those calculated by over half of the classical potentials, and the MAE of the formation energy from XGB is similar to that of AIREBO-M. Additionally, the MAEs of the elastic constants from all models are lower than those calculated by all classical potentials, including Tersoff, although the MAE of Tersoff is lower if the residuals of mp-1095534 are removed. The main reasons for the increase in errors are that the larger feature dimension and the more complex correlations between features and targets make it harder for the regression trees to learn the relationships correctly, and the limited number of samples also prevents the models from learning the features of complex structures well. In our dataset, most structures are either graphitic or diamond-like. Graphite-like structures typically have anisotropic C11, C22, and C33 elastic constants, two of which are usually close to 900 GPa, and their formation energy, around 0.05 eV/atom, is lower than that of diamond-like structures. On the other hand, C11, C22, and C33 are isotropic in diamond-like structures, all around 1100 GPa, which is higher than those of graphitic structures, and their formation energy is higher, around 0.15 eV/atom. Given these connections between formation energy and elastic constants, we can use only one type of property as a feature to reduce the feature dimension and still predict both properties. For instance, when only the formation-energy dataset is applied as features to train and predict both the formation energy and the elastic constants, we find that the MAEs are similar to those shown in Fig. 4.

Fig. 4: MAEs of the formation energy and total elastic constants relative to corresponding DFT reference under different ensemble methods.
figure 4

Due to the complexity of the relationship between the features and targets, all models perform worse than when only one property is predicted. Even so, the MAEs of the formation energy from the RF, AB, and GB models are lower than those of over half of the classical potentials, and the MAEs of the elastic constants from all models are lower than those calculated by all classical potentials.

Interpretability

To reveal the correlations and other useful information behind these features, principal component analysis (PCA) is used to decompose the high-dimensional dataset into a set of orthogonal components and project the dataset onto the components of maximum variance. Figure 5a shows the projection of the 9-dimensional formation-energy features onto a 2D plane. Along the first principal component, the graphite-like structures are grouped on the left of the plot, the diamond-like structures followed by the fullerene-like structures are clustered to their right, and the remaining structures are generally more scattered and located further to the right. This distribution is consistent with the formation energies of these structures: the graphite-like structures have the smallest formation energy, followed by the diamond-like and fullerene-like structures, while the complex structures have relatively higher formation energies. When the second principal component is also considered, similar structures lie close to each other, and some of them are far from their clusters because of their higher formation energies. All of this indicates that the feature space carries physical meaning consistent with the target property. Fig. 5b shows the PCA of the representations after the first hidden layer of CGCNN for the same structures. Likewise, the graphite-like, diamond-like, and fullerene-like structures are clustered and more compact overall, and the points that are far from similar structures in Fig. 5a are also far from their clusters in Fig. 5b. For the high-energy structures, three similar structures are closer to each other than the remaining one. In particular, one of the middle-energy structures is far from the others in both figures, even though other structures have similar energies, indicating that the features of structures with similar energies may differ because of their different structures. Since each feature vector is composed of the calculations of the classical potentials, the correlation among these features and within each feature for similar structures leads to similar features for similar structures, while the features of dissimilar structures differ depending on the values of each feature, even if the energies of these structures are similar. Therefore, unlike representations that use structural information directly, these energy-based features indirectly reflect the correlations between structures. To further demonstrate that this kind of feature vector can distinguish different structures with similar energies, Fig. 5c shows the feature values of different structures. It can be clearly seen that, among the graphite-like, diamond-like, and fullerene-like structures, the features of structures of the same type are similar, while the features of different types of structures are different. For the outliers within these three types, each outlier has features different from its own cluster, indicating the differences in their structures. Similarly, for the high-energy structures, the features of the first three are similar, while the others are different. Among the middle-energy structures, the overall features of the second-to-last structure differ from the other structures because of its high EDIP and MEAM values and low Tersoff value, which places it far from the other points.
The second structure among the middle-energy structures is similar to the fullerene-like structures in terms of the relative relationships between its features, which places it close to the fullerene-like cluster in Fig. 5a; the same can be seen in Fig. 5b. This shows that these features capture the correlations between structures to a certain extent. However, the correlation between similar structures may be influenced by individual changed feature values; for example, some graphite-like structures are separated from their cluster in Fig. 5a because of the underestimated LJ calculations, which could be improved by removing the unstable features.
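The 2D projection in Fig. 5a can be reproduced in outline with scikit-learn's PCA, as sketched below; whether the features are standardized beforehand is an assumption here, and the feature matrix is a placeholder.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(58, 9))       # placeholder for the 58 x 9 formation-energy features

# Project the nine-dimensional features onto the two directions of maximum variance
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
# X_2d[:, 0]: first principal component, along which graphite-, diamond-, and
# fullerene-like structures separate in order of increasing formation energy
```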

Fig. 5: Interpretability of feature importance with different methods.
figure 5

a Visualization of the features. The original 9D vectors are reduced to 2D with PCA. Similar structures are clustered together, while the others are scattered because their structures differ from each other. The distribution of the features along the first principal component is also similar to the distribution of the formation energy of the same structures. b PCA of the CGCNN representations of the same structures has a distribution similar to (a). c The feature values corresponding to each structure; similar structures have similar feature values. d PCC between the features and the reference. As with the feature importance, ReaxFF has the largest PCC with the reference. LCBOP, however, has a larger PCC than AIREBO-M, which suggests that other factors may also play a role in determining feature importance.

Besides, the criterion for node splitting in regression trees is mainly based on a loss function such as the mean squared error (MSE), which describes the distribution of the targets under different features, and the regression tree identifies the feature and threshold with the minimal MSE as the split point. Since the features and targets in this study are correlated, ideally a feature and the corresponding targets have a linear relationship. The accuracy of the features varies for different structures, leading to different levels of linearity. Regression trees evaluate the linear relationship of each feature at each split point and capture the most important feature, i.e., the one with the minimal MSE. If a feature performs relatively weakly, the relationship between its values and the targets is nonlinear; the targets corresponding to two adjacent sorted values of this feature will then lie far apart, which makes the MSE larger than for a more accurate feature. Table 2 shows the average feature importance of the regression tree fitted to the formation energy, where permutation importance averaged over 20 repetitions is employed for feature evaluation48. Permutation feature importance measures the difference in error before and after permuting the values of a feature. In Table 2, ReaxFF has the largest impact on the accuracy of the model, followed by AIREBO-M in three of the models. The LJ and MEAM potentials have the smallest impact because of their large deviations. It should be noted that LCBOP has a smaller MAE than ReaxFF and AIREBO-M in Fig. 2; this is because node splitting depends on the linear relationship between the features and targets rather than on the difference between feature and target. Therefore, the Pearson correlation coefficient (PCC), which measures the linear correlation between two sets of data, is also used to assess feature importance. The equation of PCC is as follows,

$$r=\frac{\sum ({x}_{i}-\bar{x})({y}_{i}-\bar{y})}{\sqrt{\sum {({x}_{i}-\bar{x})}^{2}\sum {({y}_{i}-\bar{y})}^{2}}}$$
(4)

where \({x}_{i}\) and \({y}_{i}\) are the values of the x and y variables, respectively, and \(\bar{x}\) and \(\bar{y}\) are their means. Fig. 5d shows the PCC between the features and the reference. From the last column of the PCC matrix, we can see that ReaxFF has the largest positive linear correlation with the reference; the regression tree captures this linear correlation and uses ReaxFF to split nodes, which indicates that ReaxFF is the most important feature. This strong positive linear correlation also explains why ReaxFF is more important than LCBOP even though LCBOP has a smaller MAE in Fig. 2. However, LCBOP's correlation with the reference is higher than that of AIREBO-M, yet its importance is lower than that of AIREBO-M in Fig. 6b. This suggests that other factors may also play a role in determining feature importance.
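The permutation importance of Table 2 and the per-feature PCC of Fig. 5d can be computed as in the sketch below; the fitted model, data, and feature layout are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(58, 9)); y = X @ rng.normal(size=9)   # placeholder features/references
rf = RandomForestRegressor(random_state=0).fit(X, y)

# Permutation importance: error increase after shuffling each feature, averaged over 20 repeats
imp = permutation_importance(rf, X, y, n_repeats=20, random_state=0).importances_mean

# Pearson correlation coefficient (Eq. 4) between each feature and the reference
pcc = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
```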

Table 2 Average feature importance calculated by permuting features
Fig. 6: Frequency of local optimization occurring in each feature.
figure 6

a Flowchart for calculating the local minimal level of feature j. At the beginning, the local minimal level of each feature is initialized to 0, and each feature vector and the target vector are split into small parts; each part of the target consists of adjacent sorted target values (e.g., A-C represent three similar formation energies in the figure), and each part of the feature contains the MD-calculated values of the structures corresponding to the target vector (e.g., a-c). Each feature part is then compared with each target part to check whether both the minimum and maximum values of the feature part also exist in the target part; if so, a local minimal error exists and 1 is added to this feature's local minimal level. b Local minimal level of the different features. ReaxFF has the largest local minimal level, followed by AIREBO-M, which is consistent with the feature importance.

Apart from the factors mentioned above, the loss function indicates that local minimal errors also influence the choice of splitting features. The regression tree algorithm is inherently greedy and aims to find the feature with the local minimal error as the splitting feature when the sorted target values of the samples are close to each other. To quantify the level of local minimal error of each feature, we propose a way to describe the frequency of its occurrence. Figure 6a illustrates the process of computing the local minimal level of each feature. At the beginning, the local minimal level of each feature is initialized to 0, and each feature vector and the target vector are split into small parts; each part of the target consists of adjacent sorted target values, and each part of the feature consists of the corresponding target values sorted according to the MD-calculated values. For each part of the feature, we find its minimum and maximum values and compare them to each part of the target. If both the minimum and maximum values exist in a target part, this indicates a local minimal error and 1 is added to the feature's local minimal level. Figure 6b shows the local minimal level of each feature; ReaxFF has the largest local minimal level, followed by AIREBO-M. Thus, the local minimal level may explain to a certain degree the feature importance of ReaxFF and AIREBO-M in Table 2. In addition, the PCC, MAE, feature importance, and local minimum frequency of the RF trained on the formation energies of the carbon materials are analyzed to show that, for different structures, ensemble learning tends to use the more accurate potential as the criterion for its output. In Table 3, the highest PCC, lowest MAE, largest importance, and largest frequency are shown in bold. It can be seen that the accuracy of each potential differs across the energy intervals, and the PCC, MAE, feature importance, and local minimum frequency of each feature in the ensemble models are generally positively correlated with each other. The ensemble learning splits nodes based on these indicators and normally uses the properties of the more accurate potentials as the criterion for its output. Although ReaxFF has the largest PCC and importance overall, ensemble learning generally utilizes the more accurate feature as the criterion for prediction under the corresponding structures. For the structures with high formation energy, LCBOP is not the most important feature, possibly because of the lack of training data (only 5 samples), but its importance is still the second largest. This characteristic stems from the local-minimum-based algorithm, which captures relatively accurate features for splitting the tree's nodes.
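A minimal sketch of the counting procedure of Fig. 6a is given below, under the assumption that both the sorted targets and the feature-ordered targets are split into consecutive chunks of a fixed, illustrative size.

```python
import numpy as np

def local_minimum_level(feature, target, part_size=3):
    """Count how often a feature chunk's extreme values fall inside a single target chunk."""
    n_parts = max(1, len(target) // part_size)
    # target values re-ordered by the MD-calculated feature values, then chunked
    feature_parts = np.array_split(target[np.argsort(feature)], n_parts)
    # target values sorted by themselves, then chunked into adjacent groups
    target_parts = np.array_split(np.sort(target), n_parts)
    level = 0
    for fp in feature_parts:
        for tp in target_parts:
            if fp.min() in tp and fp.max() in tp:   # a local minimal error for this feature
                level += 1
                break
    return level
```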

Table 3 The PCC, MAE, feature importance, and local minimum frequency of RF trained by carbon materials under different formation energy intervals

In summary, the MAE, PCC, feature importance, and local minimum frequency of each feature in the ensemble models are generally positively correlated with each other. The feature with the highest importance generally also has the smallest MAE, the highest PCC, and the highest local minimum frequency, indicating that the relationship between feature and target and the algorithm of the decision tree are intertwined and together decide the feature importance of the decision trees. So, based on the linear correlation between features and targets and the locally greedy characteristic of the algorithm, ensemble learning can capture relatively accurate features, calculated by the nine classical potentials for the corresponding structures, for splitting nodes. More accurate features can thus be used to improve the performance of the ensemble model systematically.

Formation energy prediction of new carbon structures

To further evaluate the performance, the trained formation-energy RF model is employed to predict the formation energy of new carbon structures. We extract all silicon carbide and silicon structures from the MP database and compare them with the carbon structures using the similarity method60, which evaluates the dissimilarity of any two structures by calculating the statistical difference in the local coordination environments of all sites in both structures. Out of 76 silicon carbide and silicon structures, 10 are selected as new structures based on the similarity method. We replace the silicon element with carbon in these 10 structures to obtain new carbon structures and calculate their formation energies with the nine classical potentials as features. These features are then input into the pre-trained RF model to predict the formation energy. Fig. 7 illustrates the formation energy of each new structure calculated by RF, CGCNN, ALIGNN, DFT, and the three most accurate classical potentials. The MAEs of CGCNN, ALIGNN, and RF are 0.376 eV/atom, 0.446 eV/atom, and 0.850 eV/atom, respectively, while the minimum MAE among the classical potentials is 1.402 eV/atom for AIREBO. The CGCNN and ALIGNN models use interatomic structural information as input, which gives them a certain transferability to structures outside the training set. In contrast, the RF model is based on the feature values rather than the atomic structures; although RF performs well in interpolation, its extrapolation ability is limited for new structures with energies higher than those in the training set. This can be seen from the fact that the high-energy predictions approach a constant value in Fig. 7. Since only 4 carbon structures in the training dataset have formation energies larger than 2 eV/atom, and the highest energy is 2.7 eV/atom, the maximum prediction is around 2.7 eV/atom, and the model cannot make reasonable predictions for structures above 2.7 eV/atom. In addition, for new structures whose energies fall within the range of the training set, RF's predictions depend on the accuracy of the features. In other words, for the lowest-energy structure in the figure, since all features are higher than the DFT values, RF over-predicts just like the classical potentials. Therefore, the performance of RF depends on the diversity of the training set and the transferability of the features. To further inspect the relationship between the features and the model, the MAE of the features with respect to DFT and the feature importance of RF are given in Table 4. Interestingly, the feature importance changes with the accuracy of the features for the new structures. AIREBO-M has the smallest MAE and the largest importance, whereas ReaxFF has a large MAE and low importance. This may indicate that RF can filter out features with large errors according to the trained feature values, so as to split the trees using features within a reasonable range.

Fig. 7: Performance of different methods for prediction of formation energy of some carbon structures.
figure 7

The x-axis represents the IDs of the silicon carbide and silicon structures in the MP database corresponding to the carbon structures.

Table 4 The MAE of features corresponding to DFT, and feature importance of RF

Discussion

There are some limitations and opportunities for improvement. First, the limited size of the training dataset may restrict the performance of the models. This constraint is apparent in Figs. 3b and 7, and it could be mitigated by including more training samples to extend the learning space or to make the interpolated predictions smoother. Because of the limited number of carbon structures in MP, Si-O binary systems are used to test the performance of ensemble learning on a larger dataset and to evaluate the influence of the training-data size. Here, 335 Si-O structures are extracted from MP, and their formation energies are calculated with three classical potentials: COMB61, Tersoff62, and Vashishta63. Figure 8 shows the performance of the different models under different k-fold cross-validation settings; as k increases, the training set grows and the error decreases. Eventually, the error stabilizes once a certain amount of training data is reached, indicating that under-fitting leads to prediction errors when the training dataset is insufficient. When the training set reaches a certain size, adding very similar training data does not help the model improve, since the model has already learned enough feature information from the previous training set to predict new structures. Although extrapolation to new structures is limited, as in Fig. 7, the overall errors of all ensemble models for 10-fold cross-validation (0.132 eV/atom, 0.143 eV/atom, 0.140 eV/atom, and 0.141 eV/atom for RF, AB, GB, and XGB, respectively) are smaller than those of the three potentials (0.240 eV/atom, 0.156 eV/atom, and 0.147 eV/atom for COMB, Tersoff, and Vashishta, respectively). Figure 9 illustrates the formation energy of each Si-O structure calculated by RF, CGCNN, ALIGNN, DFT, and the three classical potentials on a logarithmic scale; negative values from CGCNN and ALIGNN are plotted as absolute values. For the low-energy structures (the first 250 structures), compared with the DFT calculations, CGCNN, ALIGNN, COMB, and Tersoff generally show larger deviations than Vashishta and RF. COMB and ALIGNN predict higher values than DFT, whereas Tersoff and CGCNN predict lower values. For the high-energy structures (the remaining 85 structures), however, Vashishta's predictions are generally lower than the DFT values, and the other two classical potentials are more accurate than Vashishta. RF and ALIGNN have smaller deviations than CGCNN. The overall MAEs of RF, ALIGNN, CGCNN, Vashishta, Tersoff, and COMB are 0.132 eV/atom, 0.106 eV/atom, 0.146 eV/atom, 0.147 eV/atom, 0.156 eV/atom, and 0.240 eV/atom, respectively. Briefly, the ML-based models have lower MAEs than the classical potentials, and RF has the lowest overall error apart from ALIGNN. Even though ALIGNN shows a larger deviation than RF in the low-energy region on a logarithmic scale, the energy difference between DFT and ALIGNN is small on a linear scale, and RF has some points with relatively large deviations in both the low-energy and high-energy regions; ALIGNN, moreover, has the lowest MAE for the high-energy structures. To interpret the RF model, the PCC, MAE, local minimum frequency, and feature importance of the different potentials in the low-energy and high-energy regions are also calculated.
As shown in Table 5, Vashishta has the smallest MAE, the largest PCC, the largest feature importance, and the largest local minimum frequency in the low-energy region, while Tersoff does in the high-energy region. These results are generally consistent with those in Table 3, again indicating that ensemble learning can find the more accurate potential-energy calculations for the corresponding structures as features for prediction.
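The training-size study of Fig. 8 can be sketched as a simple loop over k in k-fold cross-validation; the Si-O feature and reference arrays below are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_sio = rng.normal(size=(335, 3)); y_sio = X_sio @ rng.normal(size=3)   # placeholders

for k in (2, 3, 5, 10):                       # larger k -> larger training folds
    scores = cross_val_score(RandomForestRegressor(random_state=0), X_sio, y_sio,
                             scoring="neg_mean_absolute_error", cv=k)
    print(k, round(-scores.mean(), 3))        # MAE is expected to decrease and then plateau
```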

Fig. 8
figure 8

Performance of the ensemble learning models for formation energy prediction using different training sizes. The error decreases and stabilizes as k increases, and RF has the smallest overall MAE.

Fig. 9
figure 9

The formation energy of each Si-O structure calculated by RF, CGCNN, ALIGNN, DFT, COMB, Tersoff, and Vashishta. Among the classical potentials, Vashishta and Tersoff are the most accurate for the low-energy and high-energy structures, respectively, and ALIGNN has the best overall performance among all methods.

Table 5 The MAE, PCC, feature importance and local minimum frequency of different features

Second, the performance of the regression trees depends on the accuracy of the features and on the linear correlation between features and targets, so more accurate classical interatomic potentials could be used as features to improve performance. In addition, more features make the ensemble methods more complex and the feature importance harder to interpret, and complex correlations between features and targets may also make the regression trees unstable, which affects the performance of ensemble learning; for example, the overall performance of the two-property prediction model (Fig. 4) is worse than that of the single-property prediction models (Figs. 2 and 3a). Therefore, to obtain better performance and interpretability, single-property prediction models and an appropriate feature size need to be considered. Table 6 shows the performance of RF for predicting the formation energy of carbon materials with different feature sizes. Compared with the MAE of an MD simulation with a single classical potential (Fig. 2), the RF characterized by the single feature calculated with that potential performs better, but not as well as the RF characterized by all potentials. Besides, except for the RF with low-precision features (ABOP, LJ, MEAM, Tersoff, and EDIP), the RF with high-precision features (AIREBO, AIREBO-M, LCBOP, and ReaxFF) and the RF using only the accurate LCBOP and ReaxFF as features perform better than the RF with only a single feature. In particular, the RF performs best when only the highest-precision potentials, LCBOP and ReaxFF, are used as features. These results also show that, when the number of features increases, especially when inaccurate feature values are added, the accuracy of the model decreases because of the more complex feature relationships; conversely, when only the accurate features are used, the correlation between features and targets is more linear, which makes it easier for the regression tree to find the intrinsic correlation between feature and target.

Table 6 The MAE of RF trained with different feature sizes

Beyond the discussions above, for a given feature size, the input features composed of physical properties calculated by different classical interatomic potentials are not convenient to obtain, since the physical properties of each new structure must be calculated with all of these potentials. Inspired by the imputation of missing input values, this dilemma can be alleviated by using imputation methods to infer the missing values from the known part of the data. Here, we use the k-Nearest Neighbors (KNN) approach64 to impute the missing features in the input. KNN uses a Euclidean distance metric to learn the correlations between features and to find, among the samples that have values for the missing features, the nearest neighbors of the sample with missing values; the missing values are then imputed as distance-weighted averages over these nearest neighbors. Figure 10 shows, under 10-fold cross-validation on the formation-energy dataset, the performance of the RF model combined with 2-nearest-neighbor imputation when only one or two features are obtained from MD calculations. It can be clearly seen that when the more accurate features are calculated and the other features are imputed, the accuracy of the model is higher. This is because the more accurate features are more important in the model, so calculating these features instead of imputing them reduces their deviations and thereby improves the prediction stability of the model. Fig. 10 also shows that the accuracy of the model increases when more feature values are calculated as input; for example, when ReaxFF and AIREBO-M are used as input, the MAE is smaller than in the other cases, and the accuracy of the model is similar to that of the full-input GB model (Fig. 2). So, it is feasible to reduce the workload of obtaining the input through the imputation of missing data, although this increases the error to a certain extent. Finally, it is worth mentioning that some questions are not discussed in this paper, such as the feasibility of the ensemble learning method in MD simulations and structure-optimization problems; further research is needed to determine whether ensemble learning can perform these calculations.
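A minimal sketch of the 2-nearest-neighbor imputation step is shown below, assuming a complete training feature matrix and a new sample in which only two columns (standing in for ReaxFF and AIREBO-M) are computed by MD; the arrays, column indices, and values are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
X_train = rng.normal(size=(58, 9)); y_train = X_train @ rng.normal(size=9)   # placeholders
rf = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Learn feature correlations from the complete training rows
imputer = KNNImputer(n_neighbors=2, weights="distance").fit(X_train)

x_new = np.full((1, 9), np.nan)
x_new[0, 7] = 0.12        # illustrative MD-computed feature (column index assumed for ReaxFF)
x_new[0, 3] = 0.10        # illustrative MD-computed feature (column index assumed for AIREBO-M)
x_filled = imputer.transform(x_new)          # remaining features imputed from 2 nearest neighbors
prediction = rf.predict(x_filled)
```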

Fig. 10: The MAEs of formation energy of RF under 2-nearest neighbors’ imputation.
figure 10

The x-axis represents different conditions; the computed features are obtained from MD simulations, and all other features in each condition come from 2-nearest-neighbor imputation. The RF model performs better when more features, or more accurate features, are calculated as input instead of imputed.

In summary, we explore the possibility of predicting the physical properties of a small set of carbon allotropes based on ensemble learning. The formation energy and elastic constants of carbon structures, taken as examples, can be predicted with this kind of method. In general, the ensemble methods perform better than the classical interatomic potentials used in this work, although at some points the predictions are inaccurate because a lack of training data, the high dimensionality of the features, and the locally greedy characteristic of the algorithm make it difficult for the model to learn the relationship between features and targets correctly. The PCA shows that the input, which consists of the values calculated by the different classical interatomic potentials, has a distribution similar to that of the corresponding target property. Furthermore, the Pearson correlation coefficient illustrates the linear correlation between input and output, and the regression trees can capture the relatively accurate features as the criteria for splitting nodes, as evidenced by the feature importance.

Methodology

Regression trees of ensemble learning

Regression trees, a type of decision tree, are used to predict outputs consisting of numerical values instead of categorical targets. They are also the base estimators in ensemble learning (the tree structures in Fig. 12). Figure 11 illustrates a regression tree with seven nodes in total. The tree starts from the top node; each node contains sorted samples and is split into two subsets based on a criterion (threshold) on the features until a terminal condition is reached. The blue nodes are parent nodes, each with two subsets called children. The green nodes are end nodes, representing numerical outputs determined by the targets. In scikit-learn, an optimized version of Classification and Regression Trees (CART)52 is used. This algorithm determines how to divide the sorted samples by trying different thresholds and calculating the MSE at each step. In this study, the feature vectors \({x}_{i}\in {{\mathbb{R}}}^{n}\) and target vector \(y\in {{\mathbb{R}}}^{k}\) are the properties calculated by the classical interatomic potentials and the corresponding DFT references, respectively, where the subscript i indexes the different materials, the superscript n is the number of input variables (the number of classical interatomic potentials), and the superscript k is the total number of materials. We denote by \({Q}_{m}\) the dataset at node m with \({N}_{m}\) samples, by \({Q}_{m}^{{left}}\) and \({Q}_{m}^{{right}}\) the children of \({Q}_{m}\), and by \({N}_{m}^{{left}}\) and \({N}_{m}^{{right}}\) the numbers of samples of these children. The children split \({Q}_{m}\) into two parts using a threshold. The quality of the split of node m is assessed by minimizing the weighted average of the impurity.

$$G\left({Q}_{m}\right)=\frac{{N}_{m}^{{left}}}{{N}_{m}}H\left({Q}_{m}^{{left}}\right)+\frac{{N}_{m}^{{right}}}{{N}_{m}}H\left({Q}_{m}^{{right}}\right)$$
(5)

where H is the loss function (such as the MSE). For example, at node m, the MSE of its left child \({Q}_{m}^{{left}}\) is given by:

$${\bar{y}}_{m}=\frac{1}{{N}_{m}^{{left}}}{\sum }_{y\in {Q}_{m}^{{left}}}y$$
(6)
$$H\left({Q}_{m}^{{left}}\right)=\frac{1}{{N}_{m}^{{left}}}{\sum }_{y\in {Q}_{m}^{{left}}}{(y-{\bar{y}}_{m})}^{2}$$
(7)

Here, \({\bar{y}}_{m}\) is the average value of the targets at node \({Q}_{m}^{{left}}\). By recursing over \({Q}_{m}^{{left}}\) and \({Q}_{m}^{{right}}\), the weighted average of the impurity changes as well, and the threshold that minimizes the impurity \(G\) is selected for node m. The same steps are repeated for each node until the terminal condition is reached, and finally a trained regression tree is obtained.
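The split search implied by Eqs. (5)-(7) can be written out for a single feature as in the sketch below; this is a didactic sketch, not the optimized CART implementation used by scikit-learn.

```python
import numpy as np

def best_split(feature, target):
    """Scan candidate thresholds on one feature; return (impurity, threshold) minimizing Eq. (5)."""
    order = np.argsort(feature)
    f, t = feature[order], target[order]
    best_g, best_thr = np.inf, None
    for i in range(1, len(f)):                           # split between consecutive sorted samples
        left, right = t[:i], t[i:]
        h_left = np.mean((left - left.mean()) ** 2)      # Eq. (7) for the left child
        h_right = np.mean((right - right.mean()) ** 2)   # and analogously for the right child
        g = (len(left) * h_left + len(right) * h_right) / len(t)   # weighted impurity, Eq. (5)
        if g < best_g:
            best_g, best_thr = g, (f[i - 1] + f[i]) / 2
    return best_g, best_thr
```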

Fig. 11: The schematic of a regression tree illustrates that the blue nodes (parent nodes) are split into two subsets (children) based on thresholds on the features until the terminal condition is reached.
figure 11

When the tree is used for prediction, the trained regression tree follows the inputs and the thresholds to select the children until one green node (output) is reached.

Bagging and boosting methods

The bagging and boosting methods shown in Fig. 12 are used to achieve better performance than a single regression tree in this work. In bagging methods, several regression trees are trained independently on their own subsets, in which the data can be chosen more than once, and the final prediction is obtained by averaging the predictions48 of all individual regression trees. In contrast, the regression trees in boosting methods are generated sequentially; each regression tree has limited depth and is related to the previous one. Instead of averaging the outputs of all regression trees, the final prediction is obtained as the weighted median49 of the predictions of all regression trees or as the sum of the predictions of all regression trees50.
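The two strategies map directly onto scikit-learn estimators, as in this illustrative sketch; the hyperparameters shown are defaults or arbitrary choices, not the tuned values used in this work.

```python
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

# Bagging (Fig. 12a): independent trees on bootstrap subsets, predictions averaged
bagging = BaggingRegressor(DecisionTreeRegressor(), n_estimators=100, bootstrap=True)

# Boosting (Fig. 12b): shallow trees fitted sequentially, predictions summed
boosting = GradientBoostingRegressor(n_estimators=100, max_depth=3)
```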

Fig. 12
figure 12

Configurations of two regression-tree-based ensemble learning models: bagging (a) and boosting (b). a In bagging methods, several regression trees are trained independently on their own subsets, in which the data can be chosen more than once, and the final prediction is obtained by averaging the predictions of all regression trees. b In boosting methods, the trees are generated sequentially, and each regression tree is related to the previous one; the final prediction is the weighted median of all the trees' predictions or the sum of all the trees' predictions.

Data collections

For the formation energy, 58 carbon structures and 335 Si-O structures and their DFT references are extracted from the MP database. The nine classical potentials available for carbon in LAMMPS are used to perform energy minimization of each structure and obtain its energy per atom. The energy above hull in the Materials Project database is then used: the structure with a value of 0 is taken as the reference, and the values calculated by the nine potentials for this structure are used as their respective references to obtain the energy above hull of all structures; these values are used as the input features. For the elastic constants, the DFT references of 20 of the 58 carbon structures are used, owing to the absence of DFT references for the others and the removal of unstable or erroneous calculations65. For the features of each structure, the same nine potentials are used to calculate the elastic constants at 0 K with LAMMPS; the 21 elastic constants calculated by each potential are used as the input features.
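A sketch of how the per-atom energy features described above could be assembled is given below; `md_energy(structure, potential)` is a hypothetical helper wrapping a LAMMPS energy minimization, not an existing LAMMPS or pymatgen API, and the referencing scheme is a simplified reading of the procedure.

```python
import numpy as np

def build_energy_features(structures, n_atoms, potentials, ref_index):
    """Per-atom energies from each potential, referenced to the structure with zero energy above hull."""
    feats = np.zeros((len(structures), len(potentials)))
    for j, pot in enumerate(potentials):
        # md_energy is a hypothetical helper returning the minimized total energy from LAMMPS
        e_per_atom = np.array([md_energy(s, pot) / n for s, n in zip(structures, n_atoms)])
        feats[:, j] = e_per_atom - e_per_atom[ref_index]   # energy relative to the reference structure
    return feats
```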