Fig. 3: Results from statistical modeling. | npj Computational Materials


From: Identification of high-dielectric constant compounds from statistical design


a ANN model validation on a test set of 373 materials held out from the training data. We used an ensemble of ANNs to predict a distribution of values for each material. This model-fit plot is taken from a single ANN that was part of the ensemble in design cycle 2; that ANN never saw the 373 plotted materials at any stage of training. The predictions are shown only to illustrate the learning capability of this particular ANN and are not part of the design workflow. In the design workflow, each ANN in the ensemble is trained on a unique subset of the full MP training data, excluding 373 randomly chosen materials, and the trained ANN is then used to predict the dielectric values of only the search-space materials from OQMD, not the 373 unseen materials from the MP dataset. The model was trained to predict \({\log }_{2}(\epsilon )\) because the ϵ values in the training data were highly non-uniform: most were below 25, which made the few very large values outliers. The log-scale transformation reduced the numerical difference between the largest ϵ value and the median, making the former less of an outlier in ANN modeling. The model fit shown here has an R² score of 70% and a Spearman's rank correlation of 85%.

b Predicted ϵ distributions and the corresponding E(I) values for the same test set of 373 materials held out from the training data. The error bars represent the standard deviation of the ANN-ensemble predictions, which is taken as the uncertainty of the ANN modeling. For clarity, both the radius and the color of each circle encode the same quantity, the expected improvement E(I) calculated using the EGO algorithm. A point without an outer circle represents a material with a negligible E(I) value (<10⁻³). In this figure, only 25 materials have an E(I) value greater than 10⁻³.
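
The caption does not give the formula used for E(I), so the following is only a minimal sketch of how an expected-improvement value could be computed from ensemble predictions. It assumes the standard closed-form EGO expression for a maximization problem. The function names, the use of the sample standard deviation, the choice to work in \({\log }_{2}(\epsilon )\) space, and all numerical values are hypothetical and are not taken from the article.

```python
import math
from statistics import mean, stdev


def expected_improvement(mu, sigma, best):
    """Closed-form expected improvement for maximization (standard EGO form).

    mu, sigma: mean and standard deviation of the predicted distribution.
    best: best value observed so far (the incumbent).
    """
    if sigma <= 0.0:
        # With no predictive uncertainty, the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return (mu - best) * cdf + sigma * pdf


def ensemble_ei(predictions, best):
    """Summarize one material's ensemble predictions and compute its E(I)."""
    mu = mean(predictions)
    # Assumption: sample standard deviation across ensemble members.
    sigma = stdev(predictions) if len(predictions) > 1 else 0.0
    return mu, sigma, expected_improvement(mu, sigma, best)


# Hypothetical log2(epsilon) predictions from five ensemble members.
ensemble_log2_eps = [3.9, 4.4, 4.1, 4.8, 4.3]
# Hypothetical incumbent: a best-known epsilon of 25, expressed in log2 space.
best_log2_eps = math.log2(25.0)

mu, sigma, ei = ensemble_ei(ensemble_log2_eps, best_log2_eps)
print(f"mean={mu:.3f} std={sigma:.3f} E(I)={ei:.4f} negligible={ei < 1e-3}")
```

Under this form, a material scores a non-negligible E(I) either because its predicted mean exceeds the incumbent or because its ensemble spread is large enough to make an improvement plausible, which is consistent with the caption's use of the ensemble standard deviation as the modeling uncertainty.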
