Fig. 1: Learning the assembly mechanism of ions in water.
From: Machine-guided path sampling to discover mechanisms of molecular self-organization

a, Mechanism learning by path sampling. The method iterates between sampling transition paths from a configuration x between metastable states A and B (left), and learning the committor p_B(x) (right). A neural network function of molecular features (x_1 to x_4) models the committor. The log predictor forming the last layer is not shown. At convergence, symbolic regression distills an interpretable expression that quantifies the molecular mechanism in terms of selected features (x_1, x_2) and numerical constants (a, c) connected by mathematical operations (here: +, −, ×, exp).
b, Snapshots along a TP showing the formation of a LiCl ion pair (right to left) in an atomistic MD simulation. Water is shown as sticks, Li+ as a small sphere and Cl− as a large sphere. Atoms are colored according to their contribution to the reaction progress from low (blue) to high (red), as quantified by their contribution to the gradient of the reaction coordinate q(x|w).
c, Self-consistency. Counts of the generated (blue line) and expected (orange dashed line) number of transition events. The green line shows the cumulative difference between the generated and expected counts. The inset shows a zoom-in on the first 1,000 iterations.
d, Validation of the learned committor. Cross-correlation between the committor predicted by the trained network and the committor obtained by repeated sampling from molecular configurations on which the committor model was not trained. The average of the sampled committors (blue line) and their s.d. (orange shading) were calculated in bins of the learned committor indicated by the vertical steps. For reference, the red line indicates the identity.
e, Transferability of the learned committor. Representation of transfer learning, and cross-correlations between sampled committors for NaCl and NaI ion pairing and committor predictions from a model trained on data for LiCl and adjusted by transfer learning using only 1,000 additional shooting outcomes each. Colors and s.d. (indicated by orange shading) are as in d.
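The quantities plotted in panels c and d can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the committor follows from the reaction coordinate through a logistic function, p_B = 1/(1 + exp(−q)), and that one two-way shooting move from a configuration with committor p_B is expected to yield 2 p_B (1 − p_B) transitions. All function names, the bin count and the toy inputs are hypothetical.

```python
import math
from statistics import mean, pstdev


def committor_from_q(q):
    """Assumed logistic link between reaction coordinate q and committor p_B."""
    return 1.0 / (1.0 + math.exp(-q))


def expected_transitions(p_b):
    """Assumed expected number of transitions per two-way shooting move."""
    return 2.0 * p_b * (1.0 - p_b)


def cumulative_difference(generated, predicted_p_b):
    """Running sum of generated minus expected transition counts (panel c)."""
    total = 0.0
    running = []
    for g, p in zip(generated, predicted_p_b):
        total += g - expected_transitions(p)
        running.append(total)
    return running


def binned_validation(predicted, sampled, n_bins=10):
    """Mean and s.d. of sampled committors in bins of the predicted one (panel d).

    Returns one (mean, s.d.) tuple per bin, or None for an empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, s in zip(predicted, sampled):
        idx = min(int(p * n_bins), n_bins - 1)  # p = 1.0 goes into the last bin
        bins[idx].append(s)
    return [(mean(b), pstdev(b)) if b else None for b in bins]


if __name__ == "__main__":
    # Toy data only: 1 marks a shooting move that produced a transition.
    print(cumulative_difference([1, 0, 1], [0.5, 0.5, 0.2]))
    print(binned_validation([0.05, 0.05, 0.95], [0.0, 0.2, 1.0]))
```

Under these assumptions, a cumulative difference that stays close to zero indicates that the model predicts its own sampling outcomes consistently, and binned means that lie near the identity line indicate agreement between predicted and sampled committors.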