Extended Data Fig. 1: Active learning workflow of the machine learning potential across the full phase diagram. | Nature

From: The first-principles phase diagram of monolayer nanoconfined water

(a) As depicted in the schematic, the machine learning potential (MLP) combines multiple neural network potentials (NNPs) in a “committee model” (ref. 33), in which each committee member is trained separately on a random subsample of the training set. Here we use a committee of 8 NNPs. The committee average provides more accurate predictions than the individual NNPs, and the committee disagreement, defined as the standard deviation across the committee, provides an estimate of the model error. To construct the training set of such a model in an automated and data-driven way, the new configurations with the highest disagreement are added to the training set. This approach is known as “query by committee” (QbC). (b) As shown in the schematic, MLPs for the various phase points are developed over successive “generations”: in each generation, new thermodynamic state points are targeted to yield an MLP, which is then used to sample candidate structures for the next generation. We started from the bulk water potential of ref. 33, trained on 814 configurations of liquid water, different ice phases and the water-vacuum interface, all including nuclear quantum effects. In the next generation we added 521 new structures from classical and path integral NPT simulations of monolayer water, bilayer water, bulk water and hexagonal ice (100–400 K and 0–1 GPa), ice VII and ice VIII (100–400 K and 0.8–10 GPa), and two sets of monolayer and bilayer ice structures from ref. 18. In the final generation we added 207 structures from classical and path integral NPT temperature and pressure ramps of monolayer water (100–400 K and 0–15 GPa). For the training set, we obtain energy and force root mean square errors (RMSEs) of 2.4 meV per H₂O (or 0.2 kJ mol⁻¹) and 75.4 meV Å⁻¹, respectively. (c) Force (top) and energy (bottom) RMSE for an independent validation set covering the explored phase diagram of monolayer confined water.
The largest force RMSE for this validation set is 100 meV Å⁻¹, suggesting that the model remains accurate and robust across a wide range of temperatures and pressures. The new model also retains its excellent performance for the original condensed-phase conditions, as shown by a predicted density of 0.93 kg l⁻¹ and a melting temperature of 270 ± 5 K at ambient pressure, in excellent agreement with the reference functional (ref. 31).
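
The committee average, committee disagreement and query-by-committee selection described in panel (a) can be illustrated with a minimal sketch. This is not the code used in the work: the function names `committee_predict` and `query_by_committee`, the toy energies and the use of the sample standard deviation (`ddof=1`) are assumptions made here for illustration only.

```python
import numpy as np


def committee_predict(member_predictions):
    """Return committee mean and disagreement for each configuration.

    member_predictions: array of shape (n_members, n_configs), one row per
    committee member (hypothetical layout chosen for this sketch).
    """
    preds = np.asarray(member_predictions, dtype=float)
    mean = preds.mean(axis=0)
    # Disagreement = standard deviation across the committee.
    # ddof=1 (sample standard deviation) is an assumption of this sketch.
    disagreement = preds.std(axis=0, ddof=1)
    return mean, disagreement


def query_by_committee(member_predictions, n_select):
    """Indices of the n_select configurations with the highest disagreement."""
    _, disagreement = committee_predict(member_predictions)
    order = np.argsort(disagreement)[::-1]  # largest disagreement first
    return order[:n_select]


# Toy data: 8 committee members, 5 candidate configurations.
# The member offsets are symmetric about zero, so the committee mean
# recovers the base values; the spreads control the disagreement.
offsets = np.linspace(-1.0, 1.0, 8)
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
spreads = np.array([0.1, 2.0, 0.5, 3.0, 0.2])
toy_predictions = base[None, :] + offsets[:, None] * spreads[None, :]

mean, disagreement = committee_predict(toy_predictions)
selected = query_by_committee(toy_predictions, n_select=2)
print("committee mean:", mean)
print("disagreement:", disagreement)
print("selected for labelling:", selected)
```

In an active learning loop of this kind, the selected configurations would be labelled with the reference electronic structure method and added to the training set before the committee is retrained.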
