Navigating polymorph generation and distilled-potential development via entropy-symmetry landscapes for metal plasticity mechanisms

Li, Zeyuan; Liu, Taiqiao; Wan, Xuhao; Zhao, Songpeng; Zhang, Zhaofu; Zhou, E.; Yu, Wei; Zuo, Yijing; Robertson, John; Liu, Sheng; Guo, Yuzheng

doi:10.1038/s41467-026-73188-9

Article
Open access
Published: 08 June 2026

Zeyuan Li (李泽源) ORCID: orcid.org/0009-0005-9043-2345¹^na1,
Taiqiao Liu (刘太巧)²^na1,
Xuhao Wan (万旭昊)³,
Songpeng Zhao (赵松朋)²,
Zhaofu Zhang (张召富) ORCID: orcid.org/0000-0002-1406-1256²,
E. Zhou (周娥)²,
Wei Yu (余伟)¹,
Yijing Zuo (左怡婧)⁴,
John Robertson⁵,
Sheng Liu (刘胜) ORCID: orcid.org/0000-0001-6033-078X¹ &
…
Yuzheng Guo (郭宇铮) ORCID: orcid.org/0000-0001-9224-3816¹

Nature Communications volume 17, Article number: 5070 (2026)

Abstract

Artificial intelligence has advanced crystal design, yet unifying crystal structure prediction with thermodynamics-driven structure-property modelling remains challenging owing to divergent methodological foundations. Here we show that an integrated framework called PolymorphGen‑MLPKD, driven by physically meaningful entropy-symmetry landscapes, enables targeted generation of polymorphs and concurrent structure-property modelling. The framework captures phase behaviour across diverse crystal systems, generates paracrystalline structures, reveals graphite-diamond transition pathways, and produces knowledge‑distilled potentials that transfer cross‑scale accuracy with a 10⁶‑fold speed enhancement while preserving high generalizability. We uncover data‑efficiency and coverage‑uniformity scaling laws that inform machine learning model training. The resulting distilled potential successfully compares twinning and dislocation‑mediated plasticity across materials and further resolves stress‑induced phase transitions in brittle iridium. By bridging the generation-property gap in crystal artificial intelligence, this work overcomes conventional accuracy-efficiency limitations and provides a streamlined foundation for high‑fidelity atomic simulations.

Introduction

Crystal materials design, transitioning from traditional trial-and-error approaches toward rational, structure- and data-driven strategies, has become imperative for fields such as condensed matter physics, energy storage, and mechanical engineering^1,2,3. Advanced computational methods provide unprecedented atomistic perspectives to leverage crystal structures for achieving performance breakthroughs^4,5. Owing to rapid advancements in artificial intelligence, machine learning methods have emerged as a pivotal engine, enabling forward structure-property relationship establishment through quantum chemistry (QC), density functional theory (DFT), and molecular dynamics (MD) to elucidate intrinsic processes^6,7, while concurrently facilitating reverse functional design via machine learning-driven crystal structure prediction (CSP) approaches, such as graph neural networks and diffusion models^8,9. The success of AI relies on a data infrastructure that balances high precision, extensive coverage, and low redundancy to ensure robust generalisation¹⁰. Shortfalls in coverage or distribution alignment can amplify extrapolation errors, erode simulation stability, and ultimately undermine predictive reliability across diverse atomic configurations¹¹. In crystal systems, the combinatorial explosion of high-dimensional configurational possibilities, arising from sparse sampling, complex symmetry constraints, and intricate interatomic interactions, leads to a data bottleneck that cannot be resolved by simply enlarging datasets or increasing model complexity^12,13, thereby hindering the generalisation and predictive accuracy of crystal AI.

Given these limitations, recent efforts have focused on efficient sampling methodologies to overcome the critical data bottleneck. Topology-guided sampling leverages persistent homology to systematically navigate atomic configurations across diverse material morphologies, enabling the efficient identification of active phases in systems like Pd hydrides and Pt clusters¹⁴. Symmetry-principle-guided evolutionary algorithms integrate group and graph theory to extract structural features and generate high-quality initial structures, proving highly effective in complex systems such as phosphorus allotropes and diamond-silicon surfaces¹⁵. Graph-deep-learning-based techniques rapidly target low-energy regions on the potential energy surface through innovative potential energy surface slicing, achieving remarkable accuracy with minimal computational samples, as demonstrated in studies of boron allotropes and CuIn₅Se₈¹⁶. However, reliance on symmetry-idealised configurations or neglect of thermal fluctuations and metastable disordered phases limits the capacity to explore vast polymorph conformational spaces¹⁷, while the inherent methodological divergence between minimum potential energy-driven crystal structure prediction and thermodynamics configuration-based structure-property relationship establishment poses significant challenges in constructing a unified crystal design architecture.

In this work, we present PolymorphGen‑MLPKD (Polymorph Generator and Machine Learning Potential Knowledge Distillation), an integrated framework driven by entropy‑symmetry landscapes for topological analysis, targeted generation, and structure‑property relationship establishment of polymorphs. The entropy‑symmetry landscape—a projection onto the instantaneous pair entropy (s_S) and the sixth‑order Steinhardt symmetry parameter (Q₆)—offers an alternative to the lossy compression inherent in conventional high‑dimensional descriptors such as PCA, UMAP, and t‑SNE for topological visualization^18,19,20. PolymorphGen accurately unveils phase behaviours across diverse crystal systems, efficiently generates paracrystalline diamond structures, and constructs a continuous distribution of intermediate states along the graphite-diamond transformation, identifying a pathway with a lower energy barrier than that obtained from nudged elastic band (NEB) calculations. The framework incorporates Auto‑DFT scheduling, multi‑resolution classification, and MLPKD to enable cross‑scale accuracy transfer from density functional theory (DFT) through the message passing neural network (MPNN) to the deep neural network (DNN) with a 10⁶‑fold speed enhancement, based on knowledge distillation. The resulting DNN model trained via knowledge distillation (DNN-KD) accurately captures phonon dispersion, thermodynamic, defect, and elastic properties. Two scaling laws for machine learning potential accuracy are revealed: performance saturates beyond a threshold data density under uniform entropy‑symmetry coverage, yet degrades systematically when coverage is uneven. The DNN‑KD models successfully compare twinning and dislocation‑mediated plasticity in FCC metals and further resolve stress‑induced BCC cluster formation, FCC‑BCC‑HCP phase transitions, and twinning routes in brittle iridium. 
This work may be of interest to researchers across AI for science, materials science, physics, machine learning, and computational methods, and we anticipate that PolymorphGen‑MLPKD will serve as a guideline for thermodynamic polymorph generation along with MLP data preparation, training, and testing.

Results

Design of PolymorphGen-MLPKD

Establishing a unified crystal AI requires a comprehensive and effective evaluation of structural information. To this end, we propose an entropy-symmetry landscape for polymorphs, which integrates global and local order parameters to quantitatively assess crystal similarity and enable universal topological mapping. For global order, we employ the instantaneous pair entropy (s_S), a parameter derived from liquid state theory that expands the excess entropy per atom into an infinite series of multiparticle correlation functions and is widely used in atomistic simulations to drive crystallization processes^21,22. Local order is characterized by the sixth-order Steinhardt symmetry parameter (Q₆), which quantifies the degree of order within an atom’s first coordination shell and captures short-range orientational correlations among nearest neighbours²³. The instantaneous pair entropy s_S is derived from the two‑body excess entropy and quantifies radial disorder by integrating the mollified radial distribution function. It is sensitive to changes in density correlations (particularly under solid-liquid coexistence) but is relatively insensitive to angular symmetry. By contrast, the sixth‑order Steinhardt parameter Q₆ captures orientational correlations through a spherical harmonic expansion of bond vectors within the first coordination shell. Q₆ distinguishes crystal structures, identifies solid and liquid atoms, and detects defects, yet it is largely insensitive to bond‑length variations. Together, s_S and Q₆ probe orthogonal dimensions of structural order: s_S reflects thermodynamic (radial) features, while Q₆ encodes crystallographic (angular) information. Their combination therefore provides a complementary, information‑preserving projection of the high‑dimensional configurational space onto a two‑dimensional plane, as illustrated in Fig. 1a. 
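
For orientation, the two order parameters described above are commonly written as follows in the literature the text cites. This is a sketch of the standard textbook forms, not a statement of the exact expressions used in this work: the integration cutoff r_m, the mollification width σ, the neighbour count N_b(i), and the final averaging over atoms are assumptions, since this section does not specify them.

```latex
% Pair (two-body excess) entropy per atom, evaluated from a mollified g(r)
s_S = -2\pi\rho k_B \int_0^{r_m}
      \left[ g_m(r)\,\ln g_m(r) - g_m(r) + 1 \right] r^2 \,\mathrm{d}r ,
\qquad
g_m(r) = \frac{1}{4\pi N \rho r^2} \sum_{i} \sum_{j \neq i}
         \frac{1}{\sqrt{2\pi\sigma^2}}
         \exp\!\left[ -\frac{(r - r_{ij})^2}{2\sigma^2} \right]

% Sixth-order Steinhardt parameter of atom i over its N_b(i) first-shell neighbours
q_{6m}(i) = \frac{1}{N_b(i)} \sum_{j=1}^{N_b(i)} Y_{6m}(\hat{\mathbf{r}}_{ij}) ,
\qquad
Q_6(i) = \sqrt{ \frac{4\pi}{13} \sum_{m=-6}^{6} \left| q_{6m}(i) \right|^2 }
```

In these forms s_S depends only on interatomic distances r_ij and Q₆ only on bond directions, which is the radial versus angular complementarity described in the text.
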
To validate the sensitivity of the entropy-symmetry landscape to lattice perturbations, we applied both normal (ε_xx, ε_yy, ε_zz) and shear (ε_xy, ε_yz, ε_zx) strains to an HCP structure with five independent elastic constants. The landscape exhibited symmetric distribution patterns, allowing interpretable analysis of perturbation type, directionality, and magnitude, as detailed in Supplementary Fig. 1. Furthermore, by introducing random atomic displacements into the NaCl structure, we demonstrated that the entropy-symmetry landscape effectively identifies the degree of atomic perturbation, visualized in Supplementary Fig. 2. To further illustrate that s_S and Q₆ can resolve structural information with energy-level resolution even without prior knowledge, we performed a melting MD simulation of silicon and tracked the temporal evolution of energy, s_S, Q₆, and the simple cubic crystal structure parameter. As shown in Supplementary Fig. 3, while the simple cubic parameter fails to quantify the disordered liquid region after melting, both s_S and Q₆ evolve smoothly and remain strongly correlated with energy throughout the entire trajectory. This confirms that, during continuous structural evolution such as melting, the entropy-symmetry landscape captures progressive disordering with a resolution comparable to energy-based descriptions, relying solely on configurational geometry without requiring energy calculations or prior knowledge of the phases involved.
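
The sensitivity of Q₆ to random displacements can be illustrated with a minimal sketch. It evaluates the standard Steinhardt Q₆ of a single atom through the spherical-harmonic addition theorem, which avoids explicit spherical harmonics. The function names, the ideal FCC shell, and the displacement width are illustrative assumptions and are not taken from the paper's implementation.

```python
import math
import random

def legendre_p6(x):
    """Legendre polynomial P_6(x)."""
    return (231 * x**6 - 315 * x**4 + 105 * x**2 - 5) / 16.0

def steinhardt_q6(bonds):
    """Q6 of one atom from its bond vectors (x, y, z) to first-shell neighbours.

    By the addition theorem, Q6^2 = (1/N^2) * sum_{j,k} P_6(cos gamma_jk),
    which equals the usual (4*pi/13) * sum_m |q_6m|^2.
    """
    units = []
    for x, y, z in bonds:
        norm = math.sqrt(x * x + y * y + z * z)
        units.append((x / norm, y / norm, z / norm))
    total = 0.0
    for a in units:
        for b in units:
            cos_gamma = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
            cos_gamma = max(-1.0, min(1.0, cos_gamma))  # clamp round-off
            total += legendre_p6(cos_gamma)
    return math.sqrt(max(total, 0.0)) / len(units)

def fcc_shell():
    """The 12 nearest-neighbour directions of an ideal FCC site."""
    shell = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            shell += [(s1, s2, 0), (s1, 0, s2), (0, s1, s2)]
    return shell

def perturb(bonds, sigma, rng):
    """Add Gaussian noise of width sigma to every bond component."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in b) for b in bonds]

rng = random.Random(0)
q6_ideal = steinhardt_q6(fcc_shell())                      # about 0.5745 for FCC
q6_noisy = steinhardt_q6(perturb(fcc_shell(), 0.2, rng))   # typically lower
print(round(q6_ideal, 4), round(q6_noisy, 4))
```

A per-configuration value would then be obtained by averaging over atoms, again as an assumption about how the landscape coordinate is assembled.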

**Fig. 1: The integrated framework navigating polymorph generation and distilled potential development guided by entropy-symmetry landscapes.**

By integrating the entropy-symmetry landscape with genetic mutation-inspired sampling, we developed PolymorphGen to enable targeted polymorph generation within the s_S-Q₆ plane, thereby overcoming limitations of existing CSP methods, including dataset dependency, neglect of thermal fluctuations, and challenges in exploring metastable disordered materials. PolymorphGen operates in three phases (Fig. 1b): (i) topological mapping of entropy (s_S) and symmetry (Q₆) parameters of crystal structures onto a two-dimensional plane provides insight into the phase distribution of input structures; (ii) treating s_S and Q₆ as chromosomal elements and atomic positional relationships as the DNA sequence, we employ atomic displacement (D), cell volume (V), and cell shape (S) as mutation gene segments to regulate genetic variation, as illustrated in Supplementary Movie 1; (iii) using existing configurations as parent structures, new structures are directionally generated on the entropy-symmetry plane through mutation operations on D, V, and S, integrated with a roulette wheel iteration process. Whereas previous studies relied on metadynamics methods that bias configurational energy, we deliberately introduced entropy-symmetry topological mapping as the core driver of iterative configurational evolution. PolymorphGen provides a comprehensive perspective to orchestrate an efficient genetic algorithm, directly mutating configurations to avoid local optima traps and the inherent error accumulation of first-principles calculations. The genetic mutation parameters D, V, and S, which enable atomic displacement and cell shape variation, constitute the core of the Parrinello-Rahman method and are essential for studying crystal-related processes^22,24. By transforming the traditional structure-temperature-pressure (S-T-P) paradigm of ab initio molecular dynamics (AIMD) into an s_S-Q₆-D-V-S framework (Fig. 1b, highlighted region), PolymorphGen incorporates thermodynamic influences into CSP in a manner not realised by conventional methods, yielding a polymorph configuration library that encompasses stable crystals, metastable intermediates, and thermodynamically disordered states. Critically, unlike existing disordered-structure sampling approaches²⁵, PolymorphGen circumvents the need for prohibitively expensive quantum-mechanical calculations and iterative active-learning cycles during structure generation. This methodological development establishes our strategy as a computationally efficient and scalable alternative for exploring complex phase landscapes.
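
The three mutation operators and the roulette wheel step can be sketched as below. This is a toy illustration under stated assumptions, not the PolymorphGen implementation: the Gaussian displacement, the isotropic volume rescale, the volume-preserving symmetric strain, and the inverse-distance selection weight toward a target point on the (s_S, Q₆) plane are all hypothetical choices, as are the function names.

```python
import math
import random

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def mutate_displacement(frac_coords, sigma, rng):
    """D: Gaussian jitter of fractional coordinates, wrapped back into the cell."""
    return [tuple((c + rng.gauss(0.0, sigma)) % 1.0 for c in atom)
            for atom in frac_coords]

def mutate_volume(cell, factor):
    """V: isotropic rescale so that the cell volume is multiplied by factor."""
    scale = factor ** (1.0 / 3.0)
    return [[scale * c for c in row] for row in cell]

def mutate_shape(cell, max_strain, rng):
    """S: small random symmetric strain, rescaled to keep the volume fixed."""
    eps = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            eps[i][j] = eps[j][i] = rng.uniform(-max_strain, max_strain)
    deform = [[(1.0 if i == j else 0.0) + eps[i][j] for j in range(3)]
              for i in range(3)]
    strained = [[sum(row[k] * deform[k][j] for k in range(3)) for j in range(3)]
                for row in cell]
    # Assumes max_strain is small, so the strained volume stays positive.
    return mutate_volume(strained, det3(cell) / det3(strained))

def selection_weights(points, target):
    """Hypothetical fitness: parents closer to the target point weigh more."""
    return [1.0 / (1e-6 + math.hypot(s - target[0], q - target[1]))
            for s, q in points]

def roulette_select(weights, rng):
    """Pick an index with probability proportional to its weight."""
    pick = rng.random() * sum(weights)
    running = 0.0
    for index, weight in enumerate(weights):
        running += weight
        if pick < running:
            return index
    return len(weights) - 1  # guard against floating-point round-off

rng = random.Random(1)
cell = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
parents = [(-4.0, 0.55), (-2.5, 0.40), (-1.0, 0.20)]  # hypothetical (s_S, Q6)
parent = roulette_select(selection_weights(parents, (-1.2, 0.22)), rng)
child_cell = mutate_shape(mutate_volume(cell, 1.05), 0.03, rng)
child_atoms = mutate_displacement([(0.0, 0.0, 0.0), (0.5, 0.5, 0.0)], 0.02, rng)
```

Keeping the shape mutation volume-preserving makes V and S independent knobs. Whether the actual operators are decoupled in this way is not stated in the text.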

The configuration library generated by PolymorphGen, which covers thermodynamic phase distributions, provides a data foundation for directly establishing structure-property relationships using CSP methods. Inspired by knowledge distillation techniques from large-scale models, the MLPKD framework offers an architectural basis for achieving efficient and high-quality cross-scale accuracy transfer through the entropy-symmetry landscape. The MLPKD framework operates in three phases (Fig. 1c): (i) Configurations uniformly distributed at low density across the full entropy-symmetry landscape are selected as standard thermodynamics configurations. A high-accuracy reference dataset is obtained using DFT combined with the Auto-DFT scheduling platform (Supplementary Fig. 4). (ii) The DFT dataset is used to train a highly generalizable yet costly MPNN model. Multi-resolution classification generates supplementary thermodynamics configurations based on specific properties, and the MPNN model is used to produce a high-efficiency workbook dataset. (iii) A hybrid-fidelity dataset formed by combining the DFT and MPNN datasets is employed to train a low-cost DNN-KD model suitable for large-scale atomic system simulations. The PolymorphGen-MLPKD framework enables efficient and high-quality knowledge distillation of cross-scale accuracy from complex to simple models, achieving a 10⁶-fold speed enhancement. As shown in Fig. 1d, compared to conventional active learning frameworks²⁶, our framework bridges the critical gap between structure generation and structure-property relationship establishment in crystal AI. Its unidirectional architecture avoids multiple cycles of DFT and MLP computations, overcoming accuracy-efficiency limitations by restructuring the entire workflow from polymorph analysis and data preparation to training and testing.
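
The assembly of the hybrid-fidelity dataset in phase (iii) can be sketched as follows. This is a schematic under assumptions: configurations are represented by hashable identifiers, `teacher` stands in for the trained MPNN as any callable returning a label, and the rule that a DFT label takes precedence over a teacher label is a hypothetical design choice rather than something the text specifies.

```python
def build_hybrid_dataset(dft_labelled, workbook_configs, teacher):
    """Merge DFT-labelled data with teacher-labelled workbook configurations.

    dft_labelled     : list of (config_id, label) pairs from the reference method
    workbook_configs : iterable of config_ids to be labelled by the teacher
    teacher          : callable mapping config_id -> predicted label
    """
    hybrid = [{"config": c, "label": y, "fidelity": "dft"}
              for c, y in dft_labelled]
    seen = {c for c, _ in dft_labelled}
    for c in workbook_configs:
        if c in seen:
            continue  # keep the reference label where both sources exist
        hybrid.append({"config": c, "label": teacher(c), "fidelity": "teacher"})
        seen.add(c)
    return hybrid

# Toy usage with made-up identifiers and a stand-in teacher.
dft_data = [("cfg-001", -8.71), ("cfg-002", -8.64)]
workbook = ["cfg-002", "cfg-003", "cfg-004"]
toy_teacher = lambda config_id: -8.5  # placeholder for an MPNN prediction
hybrid = build_hybrid_dataset(dft_data, workbook, toy_teacher)
n_teacher = sum(1 for row in hybrid if row["fidelity"] == "teacher")
```

The student model would then be trained on `hybrid` in a single pass, which reflects the unidirectional flow described above.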

PolymorphGen used for CSP

To demonstrate the broad applicability of our entropy-symmetry topological mapping method, we applied it to transform high-dimensional structures from diverse processes^27,28,29,30,31, including multithermal-multibaric simulations of MgSiO₃ perovskite (Fig. 2a), solid-solid phase transitions in Ti₃O₅ (Fig. 2b), ice nucleation (Supplementary Fig. 5a), structural reconfigurations in AgI solid-state electrolytes (Supplementary Fig. 5b), and solidification of GeTe nanoparticles (Supplementary Fig. 5c), into interpretable two-dimensional representations. The selection of phases for annotation was guided by a combination of point distribution patterns and energy correspondence. The entropy-symmetry landscape alone does not automatically delineate phase boundaries; rather, it provides a low-dimensional representation in which structural similarity is reflected by point proximity, enabling interpretable visualization that, when combined with available energy or phase-label information, facilitates the identification of distinct phases and transition pathways. Unlike conventional dimensionality reduction techniques^27,28, our approach not only accurately discriminates distinct phase structures across all systems, but also reveals smooth continuous distributions of intermediate states and well-defined phase transition pathways, using structural information alone without requiring precise first-principles computations. We note that the entropy-symmetry landscape does not automatically partition phase space into discrete clusters, but rather provides a continuous similarity metric in which distances reflect structural differences. This continuity is physically meaningful for datasets obtained from multithermal-multibaric sampling, where intermediate states naturally form a continuum. Where gaps appear in the landscape, they indicate under-sampled metastable regions, which are precisely the targets for prediction by PolymorphGen. 
Particularly compelling is the topological mapping of the Ti₃O₅ solid-solid phase transition dataset (Fig. 2b; Supplementary Fig. 5d), which captures not only the β-λ transition but also transition states TS¹ and TS², the high-temperature α phase, and metadynamics sampling configurations. This coherent behaviour underscores that our structure-based entropy-symmetry mapping is interpretable and broadly applicable, rather than being a lossy or non-unique dimensionality reduction method.

**Fig. 2: PolymorphGen unveils dynamic phase transitions across crystal systems.**

The advancement of PolymorphGen in CSP lies in its ability to recognise and effectively inherit the entropy-symmetry characteristics of existing structures. Through physically-informed genetic mutation perturbations and constraints, it directly generates configurations capable of crossing phase transition barriers without relying on computed parameters such as energy. Taking the generation of paracrystalline diamond structures as an example, the static structure factors of the numerous paracrystalline configurations generated by PolymorphGen show strong agreement with experimental measurements³² (Fig. 2c), demonstrating the method’s reliability. Compared to traditional molecular dynamics (MD), PolymorphGen produces a significantly broader variety of paracrystalline diamond types (Fig. 2d). In addition to conventional MD sampling, comparisons with active learning approaches and metadynamics enhanced sampling methods for silicon crystallization (Supplementary Fig. 6) reveal distinct sampling characteristics. Projection of these sampling trajectories onto the entropy-symmetry landscape shows that active learning misses many metastable states despite multi-temperature sampling, while metadynamics at 1700 K explores regions distinct from standard MD at the same temperature (Supplementary Fig. 6a). As shown in Supplementary Fig. 6b, the configurational ensemble generated by PolymorphGen covers a substantially broader region of the entropy‑symmetry landscape and spans a wider energy range than metadynamics sampling. This increased diversity, together with its thermodynamic representativeness, arises from mutation‑based exploration across the two‑dimensional landscape rather than biased sampling along collective variables. 
Furthermore, a direct quantitative comparison demonstrates that PolymorphGen reduces the CPU time per configuration by approximately two to three orders of magnitude relative to Gamma‑point DFT (representative of metadynamics) and multiple k‑point DFT (representative of active learning), as detailed in Supplementary Fig. 7. Unlike conventional methods that couple exploration with energy evaluation, PolymorphGen decouples these two stages: configurational exploration proceeds without any energy calculation, enabling batch generation of thousands of candidate structures, while the subsequent DFT evaluation is maximally parallelised via our Auto-DFT scheduling platform. This decoupling is the key to achieving both high efficiency and the ability to discover globally low-energy configurations that lie far from linearly interpolated paths. Furthermore, PolymorphGen enables exploration of phase transition pathways from point to plane: using only five initial structures³³ as parent configurations, it successfully generated a series of configurations depicting the continuous transition from graphite to diamond (Fig. 2e). Energy analysis of these configurations using our Auto-DFT scheduling platform (detailed in Supplementary Fig. 4) revealed smooth and uniform energy variations, encompassing low-energy structures such as graphite and cubic diamond, medium-energy partially crystalline intermediates including graphite-cubic diamond and graphite-other hybrids, and high-energy transition structures. The key transition structures near the cubic diamond phase, such as graphite-cubic diamond-graphite, align with existing studies^33,34, supporting the validity of the generated structures. Notably, the structural reconstruction from graphite to other phases, which were absent in the initial configurations, demonstrates PolymorphGen’s effectiveness in incorporating thermodynamic influences to generate novel structures. 
We emphasise that PolymorphGen is designed to generate thermodynamically continuous configuration ensembles for MLP training, rather than to directly predict a unique transition pathway. The large number of generated configurations reflects the intrinsic complexity of the configurational space and serves as a rich data resource for training generalizable machine learning potentials. Two phase transition pathways generated by PolymorphGen are presented in Fig. 2f: Path I follows the left edge of the landscape, corresponding to the largest energy change between graphite and diamond, while Path II traces the right edge, representing the smallest energy change. Detailed transition structures for both pathways are provided in Supplementary Fig. 8. The significant energy differences between them originate from volume changes and atomic rearrangements, mirroring the effects of temperature and pressure in real phase transitions. Notably, Path II exhibits a lower energy barrier than the NEB path³³ computed from the same endpoints (Fig. 2f). This demonstrates that mutation-based exploration across the two-dimensional landscape can uncover lower-energy transformation routes missed by NEB calculations initialised from linear interpolation, which is the standard initial guess in conventional NEB implementations. The decoupled exploration-evaluation paradigm underlying PolymorphGen enables such discoveries by freeing configurational generation from the local constraints inherent in energy-coupled sampling. This advances CSP research from one‑dimensional pathway exploration to two‑dimensional landscape mapping.
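
The linear interpolation that conventionally seeds an NEB calculation, and the barrier read off an energy profile, can be sketched as below. It is a minimal illustration assuming Cartesian coordinates with matched atom ordering and no periodic minimum-image correction. The energies in the usage lines are made-up numbers, not values from Fig. 2f.

```python
def linear_images(start, end, n_images):
    """Intermediate images on the straight line between two endpoint structures.

    start, end : lists of (x, y, z) tuples with identical atom ordering
    n_images   : number of intermediate images (endpoints excluded)
    """
    images = []
    for k in range(1, n_images + 1):
        t = k / (n_images + 1.0)
        images.append([tuple(a + t * (b - a) for a, b in zip(p, q))
                       for p, q in zip(start, end)])
    return images

def energy_barrier(energies):
    """Barrier relative to the first image: max(E) - E[0]."""
    return max(energies) - energies[0]

# Toy usage: one atom moving between two positions, hypothetical energies in eV.
images = linear_images([(0.0, 0.0, 0.0)], [(1.0, 2.0, 4.0)], 3)
barrier = energy_barrier([0.0, 0.4, 0.9, 0.3, -0.1])
```

A straight-line seed constrains the relaxed path to the neighbourhood of that line, whereas a set of configurations spread over the landscape can contain sequences that leave it.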

PolymorphGen-MLPKD for structure-property relationship establishment

A distinguishing feature of the PolymorphGen-MLPKD framework in establishing structure-property relationships is its expanded exploration of configurations beyond existing methods. We compared the distribution of configurations from traditional AIMD and those added by our method on the entropy-symmetry landscape (Fig. 3a), visually revealing expanded sampling of metastable and thermodynamically disordered states. To ensure a rigorous and unbiased assessment, we constructed two distinct datasets: one comprising AIMD configurations and the other consisting of PolymorphGen structures. Identical MPNN model architectures were trained on each dataset. The test set included configurations strictly excluded from both training sets, evaluated using the Auto-DFT scheduling platform to provide reference data. As quantified in Fig. 3b, the MPNN model trained on our framework demonstrably outperformed its AIMD‑based counterpart, achieving lower root‑mean‑square errors for energy, force and virial predictions, as well as smaller errors in elastic constants (Supplementary Fig. 9). A more stringent test under large compression, summarised in Supplementary Table 1, further illustrates this improved generalisation: the model trained exclusively on AIMD configurations produced unrealistically large energies at reduced volumes that would lead to computational instability, whereas the PolymorphGen‑trained model closely followed the DFT reference with consistently small errors. The insufficiency of merely broad thermodynamic coverage is further evidenced by a comparison with DP‑GEN, an active learning method. Despite sampling an extensive range of 0–15,500 K and 0–500 GPa and using 983,941 configurations (Supplementary Fig. 10), the DNN model trained on this dataset failed to capture the correct mechanical behaviour of iridium, predicting C₁₂ > C₄₄ in contradiction with experiments³⁵ and the known brittle character of Ir. 
In contrast, our PolymorphGen‑trained model accurately reproduced the elastic constants and brittle nature. This demonstrates that broad coverage alone is not sufficient; the quality and uniformity of the sampled configurations are equally critical. PolymorphGen‑MLPKD thus provides a comprehensive perspective for MLP research by covering stable, metastable and disordered states, avoiding the limitations of prior knowledge gaps, and enabling fair training and testing on a unified scale. As shown in the test results in Fig. 3c, the equivariant MPNN-based MACE³⁶ model achieved the highest accuracy. We validated PolymorphGen-MLPKD on metals with FCC, BCC, and HCP structures, using dataset sizes of 3417 for Al, 3259 for Ir, 4053 for Mo, and 2202 for Zr. To evaluate the accuracy and efficiency, we compared its predictions with experimental and DFT data across multiple domains, including vacancy formation energy (Fig. 3d), elastic properties (Fig. 3e, Supplementary Table 2), thermodynamic properties (Supplementary Fig. 11a, c), transformation kinetics via Bain deformation paths (Supplementary Fig. 11b), dynamic stability from phonon dispersion (Supplementary Fig. 11d), plastic deformation mechanisms (Supplementary Fig. 11e), and twinning mechanisms (Supplementary Fig. 11f). Furthermore, analyses of the generalized stacking fault energy (GSFE) and generalized planar fault energy (GPFE) curves confirmed that twinning is its dominant deformation mechanism (Supplementary Fig. 11f).
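
The two checks used in this comparison, the root-mean-square error against a reference and the sign of C₁₂ − C₄₄ (the Cauchy pressure, a common brittleness indicator for cubic metals), can be sketched as follows. The numbers in the usage lines are placeholders chosen for illustration and are not elastic constants from the paper or from experiment.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

def cauchy_pressure(c12, c44):
    """C12 - C44 for a cubic crystal; a negative value indicates brittle-like bonding."""
    return c12 - c44

# Placeholder values (GPa), purely illustrative.
force_error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
brittle_like = cauchy_pressure(c12=240.0, c44=260.0) < 0
```

A model that predicts C₁₂ > C₄₄ for iridium therefore yields a positive Cauchy pressure, which is the qualitative failure described above.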

**Fig. 3: Integrated PolymorphGen-MLPKD framework enabling efficient transfer of cross-scale accuracy.**

The multi‑resolution classification strategy within PolymorphGen‑MLPKD offers a distinct advantage over opaque data filtering techniques prevalent in the field by enabling interpretable assessment of data redundancy³⁷. This systematic down‑sampling, illustrated in Fig. 3f and Supplementary Fig. 12, reveals a clear convergence behaviour: once a threshold of approximately 1400 uniformly distributed configurations is reached, further increasing the dataset size yields only marginal improvements in force predictions. Remarkably, models trained on as few as 77 configurations accurately capture iridium’s liquid structure, unique brittleness and phonon spectra, as shown in Fig. 3f and Supplementary Figs. 13–14. This establishes a data‑efficiency scaling law: for a given material with uniform configurational coverage, model error saturates beyond a relatively small training set size, implying that configurational diversity, not merely quantity, is the primary driver of accuracy. Second, we examined how the uniformity of configurational coverage affects model generalisation, a distinct aspect tied to the topology of the entropy‑symmetry landscape, and uncovered a coverage‑uniformity scaling law. Using a deliberately biased dataset with gaps in thermodynamic coverage (Fig. 3g inset), we found that prediction errors concentrated precisely in the under‑sampled regions (Fig. 3g centre), and overall RMSEs remained consistently higher than those from uniformly sampled data regardless of total dataset size (Fig. 3g around). This demonstrates that predictive performance is critically sensitive to how evenly training configurations populate the entropy‑symmetry landscape; incomplete or uneven sampling leads to systematic extrapolation errors even with large datasets. 
Together, these two complementary scaling laws underscore that the entropy‑symmetry landscape provides a powerful low‑dimensional descriptor for assessing and designing optimal training datasets, enabling both data efficiency and robust generalisation in MLP development.
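
One way to picture down-sampling and coverage assessment on the landscape is a regular grid over the (s_S, Q₆) plane. The sketch below keeps one configuration per occupied cell and reports the fraction of occupied cells; the grid-based scheme, the function names, and the sample points are assumptions, since this section does not describe how multi-resolution classification is implemented.

```python
def cell_index(value, lo, hi, n_bins):
    """Bin index of value on [lo, hi] split into n_bins equal cells."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    index = int((value - lo) / (hi - lo) * n_bins)
    return min(max(index, 0), n_bins - 1)

def downsample(points, s_range, q_range, n_bins):
    """Indices of the first point found in each occupied grid cell."""
    chosen = {}
    for i, (s, q) in enumerate(points):
        key = (cell_index(s, s_range[0], s_range[1], n_bins),
               cell_index(q, q_range[0], q_range[1], n_bins))
        chosen.setdefault(key, i)
    return sorted(chosen.values())

def coverage(points, s_range, q_range, n_bins):
    """Fraction of grid cells that contain at least one point."""
    occupied = len(downsample(points, s_range, q_range, n_bins))
    return occupied / float(n_bins * n_bins)

# Hypothetical (s_S, Q6) points clustered in two corners of the plane.
points = [(-5.0, 0.10), (-4.9, 0.12), (-1.0, 0.50), (-0.9, 0.52)]
kept = downsample(points, (-6.0, 0.0), (0.0, 0.6), n_bins=2)
filled = coverage(points, (-6.0, 0.0), (0.0, 0.6), n_bins=2)
```

Raising `n_bins` keeps more configurations, and a low `filled` value at a given resolution flags the kind of coverage gap associated with concentrated errors in the biased-dataset test.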

We observed that polymorphs of the same crystal type exhibit highly similar distributions on the entropy-symmetry landscape. This led us to propose that structure-property relationships for isotypic structures can be directly trained by replacing element types within a consistent thermodynamic configuration library. As shown in Supplementary Fig. 15a, this concept was validated using the FCC-Al model trained on FCC-Ir configurations (Supplementary Fig. 15b) and the FCC-IrRe doping model (Supplementary Fig. 15c). Complex models can accurately capture structure-property relationships with small datasets, but their application in large atomic systems is constrained by inherent architectural limitations, such as the exponential increase in computational complexity with the number of atomic neighbours introduced by equivariant message passing in models like MACE. PolymorphGen-MLPKD transfers DFT-level accuracy from MPNN to DNN without changing the model architecture. Compared to DNN models trained directly on DFT datasets, the DNN-KD model, which is distilled from the MPNN teacher, mitigates energy biases typical in small-dataset training. It exhibits smaller, more concentrated errors in force and virial predictions (Fig. 3h), reduces RMSE by approximately 30%, and delivers more accurate predictions of dynamic stability (Fig. 3i).

Brittle metal twinning routes explored by the distilled-potential

We applied PolymorphGen-MLPKD to develop a DNN-KD model for iridium, in which MPNN was used to enhance its phonon scattering properties (Supplementary Fig. 16). The purpose of transferring cross-scale accuracy to the DNN-KD model was to investigate emergent mechanical behaviours, such as dislocations and twinning, that are only observable through large-scale atomic simulations. To demonstrate the universality of our framework across different deformation mechanisms, we extended the same workflow to aluminum, a system known to deform via dislocation-mediated plasticity³⁸, in stark contrast to the twinning-dominated behaviour of iridium. The comparative results are summarised in Fig. 4a, b. Strikingly, while aluminum ultimately undergoes conventional Shockley partial dislocation slip via intrinsic stacking fault (ISF) nucleation and expansion, our simulations reveal a far richer incipient plasticity pathway. Upon ISF nucleation and propagation into stress-concentrated regions, the faulted structure transiently transforms into nanoscale twins of 1–2 atomic layer thickness, accompanied by the concurrent nucleation of additional ISFs at other locations. As strain proceeds, the system undergoes multiple twinning and subsequent detwinning events, eventually evolving into steady-state dislocation slip. This complex, transient twinning-detwinning cascade during the early stage of plasticity in aluminum is resolved in detail through our framework. The full dynamical evolution is provided in Supplementary Movie 2 (iridium) and Supplementary Movie 3 (aluminum) for direct comparison.

**Fig. 4: Twinning route of brittle metal iridium explored by the distilled-potential.**

Experimental observations confirm that iridium deforms by twinning, but the specific twinning route remains unexplored³⁹. We employed micro-pillar compression MD simulations to directly capture the dynamic formation mechanism of three-layer twins; the micro-pillar model is shown in Supplementary Fig. 17. Within the elastic range of compression, we observed the emergence of BCC cluster defects in iridium (Supplementary Fig. 18). Moreover, the structure transitions through a BCC phase during the stress-induced FCC-to-HCP phase transformation (Supplementary Fig. 19a). The complete FCC-BCC-HCP transition pathway is detailed in Supplementary Fig. 19b and is consistent with compression studies in FCC high-entropy alloys⁴⁰. The population of these BCC cluster defects exhibits temperature dependence, progressively increasing with rising temperature as shown in Fig. 4c. In Fig. 4d, three consecutive (111) planes labelled A, B, and C represent the perfect crystalline repeat units in FCC iridium. Shockley dislocation emission transforms the FCC stacking sequence from ABCABCA to ABC’A’BCA containing BCC cluster defects, and further into the stacking fault sequence ABCBCAB. With increasing applied stress, the atomic stacking order evolves from the continuous stacking fault sequence ABCBCAB to ABCBABC, where an FCC layer is sandwiched between two coherent twin boundaries (CTB). This state further develops into a three-layer twin, in which the stacking order changes from ABCBABC (with an FCC layer between two CTBs) to BACBABC (with two atomic layers between two CTBs), ultimately forming a structure with eight atomic layers between two CTBs (Supplementary Fig. 20). This twinning route is identical to the novel 1-3-2 twinning pathway observed via high-resolution TEM (HRTEM) in metals with high intrinsic stacking fault energy⁴¹.
This work demonstrates the applicability and practical utility of PolymorphGen-MLPKD for establishing high-accuracy structure-property relationships through direct CSP-based structure prediction in the study of mechanical behaviour in large-scale atomic systems.
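The stacking sequences quoted above can be read with a simple local rule. This is a minimal sketch of that rule, not the analysis pipeline used in this work, and the function name `classify_layers` is hypothetical. A close-packed layer is labelled FCC-like when its two neighbouring layers differ (as in ABC) and HCP-like when they coincide (as in ABA). Two adjacent HCP-like layers then mark an intrinsic stacking fault, and an isolated HCP-like layer marks a coherent twin boundary.

```python
def classify_layers(stacking):
    """Label each interior close-packed layer of a stacking string.

    'F' = FCC-like (the layers below and above differ, e.g. A-B-C)
    'H' = HCP-like (the layers below and above coincide, e.g. A-B-A)
    The first and last layers are skipped because they lack one neighbour.
    """
    labels = []
    for k in range(1, len(stacking) - 1):
        below, above = stacking[k - 1], stacking[k + 1]
        labels.append("H" if below == above else "F")
    return "".join(labels)


# Sequences taken from the text: perfect FCC, stacking fault, and the
# configuration with one FCC layer between two coherent twin boundaries.
for seq in ("ABCABCA", "ABCBCAB", "ABCBABC"):
    print(seq, "->", classify_layers(seq))
```

Applied to the sequences in the text, ABCBCAB contains two adjacent HCP-like layers (the stacking fault), whereas ABCBABC contains two isolated HCP-like layers separated by a single FCC-like layer, matching the description of one FCC layer between two CTBs.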

Discussion

In conclusion, we present a universal framework, PolymorphGen-MLPKD, that bridges the critical gap between crystal structure prediction and structure-property relationship establishment in crystal AI, overcoming long-standing accuracy-efficiency limitations. Central to this framework is the entropy‑symmetry landscape, a two‑dimensional projection of the high‑dimensional configurational space onto coordinates defined by two physically meaningful, structure‑based parameters: the instantaneous pair entropy (s_S), which reflects thermodynamic (radial) features, and the sixth‑order Steinhardt symmetry parameter (Q₆), which encodes crystallographic (angular) information. Their combination provides a complementary, information‑preserving representation that offers a comprehensive and invariant perspective, unveiling dynamic phase transitions across diverse systems, including perovskites, solid‑solid transformations, nanoparticle solidification, and ice nucleation. The framework’s capability in crystal structure prediction is demonstrated through its improved efficiency and configurational diversity relative to existing sampling methods such as MD, AIMD, active learning, metadynamics, and NEB calculations, as evidenced by explorations of paracrystalline diamond and graphite‑to‑diamond transition pathways. Multi‑resolution screening of the entropy‑symmetry landscape reveals two complementary scaling laws for machine learning potentials: data‑efficiency saturation under uniform coverage and performance degradation arising from coverage bias. The utility of the entropy-symmetry landscape as a structural similarity metric is independently validated by the scaling-law analysis, where coverage uniformity on the landscape directly predicts MLP generalisation error, confirming that s_S and Q₆ provide a reproducible and objective measure of configurational proximity. 
Our knowledge distillation approach enables cross‑scale accuracy transfer from DFT through MPNN to DNN‑KD with a 10⁶‑fold speed enhancement while maintaining high generalisation capability. The practical utility of PolymorphGen‑MLPKD is confirmed through investigations of distinct plasticity mechanisms, namely dislocation‑mediated slip in aluminum and twinning‑dominated deformation in brittle iridium, demonstrating its applicability across materials with fundamentally different deformation modalities. The core innovation lies in the physics-informed generation and selection of polymorphs, which strategically directs the most representative structures to models of varying complexity, thereby ensuring the accuracy and generalizability of crystal AI at the data level. Our strategy of using fixed standard thermodynamic configurations for isotypic structures effectively controls configuration, one of the key variables along with composition and relative positioning, in MLP training for doped and even high-entropy alloys, substantially mitigating data explosion in structure-property mapping. PolymorphGen-MLPKD will serve as a guideline for thermodynamic polymorph generation and MLP data preparation, training, and testing, laying the critical foundation for the next generation of high-fidelity atomic simulations.

Methods

Entropy-symmetry landscape

We employ sixth-order Steinhardt symmetry parameters (Q₆) to quantify the short-range order of the system²³. This parameter encodes the bond-orientational order between each central particle and its nearest neighbours into spherical harmonics to sensitively capture local symmetry features, without being limited by the absence of long-range order. For each particle $i$ and its bond with neighbour $j$, represented by a vector ${\mathbf{r}}_{ij}$, the local bond-order vector ${q}_{lm}$ captures the orientation information of the local environment of particle $i$. It is defined as the average projection of the directional information of all bonds onto spherical harmonics:

$${q}_{{lm}}\left(i\right)=\frac{1}{{N}_{b}\left(i\right)}\mathop{\sum }_{j=1}^{{N}_{b}\left(i\right)}\,{Y}_{{lm}}\left({\theta }_{{ij}},{\phi }_{{ij}}\right),$$

(1)

where ${Y}_{{lm}}$ are the standard spherical harmonics, which describe patterns of angular distribution in three-dimensional space and are determined by two integer indices $l$ and $m$. The spherical harmonics are expressed as:

$${Y}_{{lm}}\left(\theta,\phi \right)={\left(-1\right)}^{m}\sqrt{\frac{\left(2l+1\right)\left(l-m\right)!}{4\pi \left(l+m\right)!}}{P}_{l}^{m}\left(\cos \theta \right){e}^{{im}\,\phi },$$

(2)

where ${P}_{l}^{m}$ are the associated Legendre polynomials, and different $l$ values correspond to different orders of spatial symmetry. ${N}_{b}\left(i\right)$ is the number of nearest neighbours of particle $i$, and $\left({\theta }_{{ij}},{\phi }_{{ij}}\right)$ are the spherical angles of the bond vector ${\mathbf{r}}_{ij}$ connecting central particle $i$ and neighbour $j$ in a preset global reference frame. To obtain a physical quantity that reflects the strength of local symmetry and is independent of the choice of coordinate system, Steinhardt introduced the rotation-invariant local order parameter ${q}_{l}\left(i\right)$. This parameter is constructed by summing the squared moduli of all $2l+1$ components $m$ and normalising, thereby eliminating dependence on coordinate rotation:

$${q}_{l}\left(i\right)=\sqrt{\frac{4\pi }{2l+1}\mathop{\sum }_{m=-l}^{l}\,{\left|{q}_{{lm}}\left(i\right)\right|}^{2}}.$$

(3)

By averaging the local bond-order vectors, we aim to describe the overall order of the system:

$$\left\langle {q}_{{lm}}\right\rangle=\frac{1}{N}\mathop{\sum }_{i=1}^{N}\,{q}_{{lm}}\left(i\right).$$

(4)

The $l$-th order Steinhardt symmetry parameter ${Q}_{l}$ is then defined as:

$${Q}_{l}=\sqrt{\frac{4\pi }{2l+1}\mathop{\sum }_{m=-l}^{l}\,{\left|\left\langle {q}_{{lm}}\right\rangle \right|}^{2}}.$$

(5)

Here, we choose Q₆, the average bond-orientational order at angular wave number $l=6$, to quantify the system, because sixth-order spherical harmonics can effectively capture and distinguish common crystal symmetries.
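Eqs. (1), (4) and (5) can be sketched in a few lines of standard-library Python. This is an illustrative implementation under stated assumptions rather than the PolymorphGen code. The neighbour lists are assumed to be precomputed and passed in as bond vectors, and the function names are hypothetical. The associated Legendre polynomials are evaluated by the usual upward recurrence with the Condon-Shortley phase included. The choice of phase convention does not affect $Q_l$, because only $|\langle q_{lm}\rangle|^2$ enters Eq. (5). The sketch also uses $|q_{l,-m}| = |q_{lm}|$ to sum over $m \ge 0$ only.

```python
import cmath
import math


def assoc_legendre(l, m, x):
    """Associated Legendre polynomial P_l^m(x) for 0 <= m <= l (with phase)."""
    pmm = 1.0
    if m > 0:
        sin_theta = math.sqrt(max(0.0, 1.0 - x * x))
        odd = 1.0
        for _ in range(m):
            pmm *= -odd * sin_theta
            odd += 2.0
    if l == m:
        return pmm
    pmmp1 = x * (2 * m + 1) * pmm
    for ll in range(m + 2, l + 1):
        pmm, pmmp1 = pmmp1, ((2 * ll - 1) * x * pmmp1 - (ll + m - 1) * pmm) / (ll - m)
    return pmmp1


def sph_harm(l, m, cos_theta, phi):
    """Spherical harmonic Y_lm for m >= 0, normalised as in Eq. (2)."""
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - m) / math.factorial(l + m))
    return norm * assoc_legendre(l, m, cos_theta) * cmath.exp(1j * m * phi)


def local_qlm(bonds, l):
    """Eq. (1): bond-averaged q_lm(i) for m = 0..l from one atom's bond vectors."""
    qlm = [0j] * (l + 1)
    for x, y, z in bonds:
        r = math.sqrt(x * x + y * y + z * z)
        phi = math.atan2(y, x)
        for m in range(l + 1):
            qlm[m] += sph_harm(l, m, z / r, phi)
    return [q / len(bonds) for q in qlm]


def steinhardt_Q(per_atom_bonds, l=6):
    """Eqs. (4)-(5): global Q_l from the atom-averaged <q_lm>."""
    n_atoms = len(per_atom_bonds)
    mean = [0j] * (l + 1)
    for bonds in per_atom_bonds:
        for m, q in enumerate(local_qlm(bonds, l)):
            mean[m] += q / n_atoms
    # |q_{l,-m}| = |q_{l,m}|, so the m < 0 terms double the m > 0 terms.
    total = abs(mean[0]) ** 2 + 2.0 * sum(abs(q) ** 2 for q in mean[1:])
    return math.sqrt(4 * math.pi / (2 * l + 1) * total)


# Twelve nearest-neighbour bonds of an ideal FCC site (lattice units).
FCC_SHELL = [(sx, sy, 0) for sx in (1, -1) for sy in (1, -1)] \
          + [(sx, 0, sz) for sx in (1, -1) for sz in (1, -1)] \
          + [(0, sy, sz) for sy in (1, -1) for sz in (1, -1)]

print("Q6 of an ideal FCC shell:", round(steinhardt_Q([FCC_SHELL]), 4))
```

In a perfect crystal every atom has the same environment, so a single shell suffices as a check. The ideal FCC shell should give the textbook value Q₆ ≈ 0.575. Thermal disorder lowers the value towards zero, because the local $q_{lm}$ of differently oriented environments add incoherently in Eq. (4).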

Global order is quantified using liquid state theory, where the excess entropy per atom is expressed as an infinite series of multiparticle correlation functions²¹, with the two-body term defined as:

$${S}_{2}=-2\pi \rho {k}_{B}\int _{0}^{\infty }\,\left[g\left(r\right)\ln g\left(r\right)-g\left(r\right)+1\right]{r}^{2}\,{dr},$$

(6)

where $g\left(r\right)$ is the radial distribution function and $\rho$ is the density of the system. Here, we employ a mollified version of the radial distribution function

$${g}_{m}\left(r\right)=\frac{1}{4\pi N\rho {r}^{2}}\mathop{\sum }_{i\ne j}\,\frac{1}{\sqrt{2\pi {\sigma }^{2}}}{e}^{-{\left(r-{r}_{{ij}}\right)}^{2}/\left(2{\sigma }^{2}\right)},$$

(7)

as defined by Parrinello et al.²², to compute the instantaneous entropy (s_S), where ${r}_{{ij}}$ is the distance between particles $i$ and $j$, and $\sigma$ is a broadening parameter. The mollified radial distribution function ${g}_{m}\left(r\right)$, which ensures continuous derivatives with respect to atomic positions, is substituted into Eq. (6), and the integral is evaluated numerically with the trapezoid rule up to a cutoff distance ${r}_{\max }$ chosen to optimise the numerical integration.
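Eqs. (6) and (7) can be sketched in standard-library Python as follows. This is a minimal illustration, not the implementation used in this work. It assumes a cubic periodic box with the minimum-image convention, which requires $r_{\max}$ plus a few $\sigma$ to stay below half the box length. It reports $S_2$ in units of $k_B$, and all function names and parameter values are hypothetical. The limit $g \ln g \to 0$ as $g \to 0$ is handled explicitly.

```python
import math


def mollified_rdf(positions, box, r_grid, sigma):
    """Eq. (7): Gaussian-smeared radial distribution function on r_grid."""
    n = len(positions)
    rho = n / box ** 3
    reach = r_grid[-1] + 4.0 * sigma
    dists = []
    for i in range(n):
        for j in range(i + 1, n):
            d2 = 0.0
            for a in range(3):
                d = positions[i][a] - positions[j][a]
                d -= box * round(d / box)          # minimum-image convention
                d2 += d * d
            if d2 < reach * reach:
                dists.append(math.sqrt(d2))
    pref = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    g = []
    for r in r_grid:
        s = sum(math.exp(-(r - d) ** 2 / (2.0 * sigma ** 2)) for d in dists)
        # Factor 2: Eq. (7) sums over ordered pairs i != j, the list holds i < j.
        g.append(2.0 * pref * s / (4.0 * math.pi * n * rho * r * r))
    return g


def pair_entropy(positions, box, r_max, sigma=0.05, n_grid=400):
    """Eq. (6) truncated at r_max, trapezoid rule, in units of k_B."""
    rho = len(positions) / box ** 3
    dr = r_max / n_grid
    r_grid = [dr * (k + 1) for k in range(n_grid)]
    g = mollified_rdf(positions, box, r_grid, sigma)
    f = [((gk * math.log(gk) if gk > 0.0 else 0.0) - gk + 1.0) * r * r
         for gk, r in zip(g, r_grid)]
    integral = sum(0.5 * (f[k] + f[k + 1]) * dr for k in range(n_grid - 1))
    return -2.0 * math.pi * rho * integral


def simple_cubic(n_cells, a=1.0):
    """Positions of an n_cells^3 simple-cubic lattice (toy test structure)."""
    return [(a * i, a * j, a * k)
            for i in range(n_cells) for j in range(n_cells) for k in range(n_cells)]


crystal = simple_cubic(4)
print("S2/k_B of a toy simple-cubic crystal:", pair_entropy(crystal, box=4.0, r_max=1.8))
```

Because $g\ln g - g + 1 \ge 0$ for every $g \ge 0$, the result is never positive. It vanishes for an ideal gas ($g = 1$) and becomes strongly negative for ordered structures, which is what makes it usable as the radial coordinate of the landscape.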

First-principles calculation

The first-principles calculations were performed using the Vienna ab initio simulation package (VASP)⁴² v6.4.3. The ion-electron interactions were described using the projector augmented wave (PAW) method, with a plane-wave cutoff energy of 600 eV⁴³. The Perdew-Burke-Ernzerhof (PBE) functional within the generalised gradient approximation (GGA) framework was employed for the exchange-correlation interactions⁴⁴. The energy and force convergence tolerances for geometry relaxation were 10⁻⁶ eV and 0.01 eV/Å, respectively. VASPKIT⁴⁵ was used to generate k-point files with a reciprocal-space resolution of 2π × 0.04 Å⁻¹. Structures requiring first-principles computations are automatically processed by a high-throughput computation scheduling platform, which manages the entire workflow, including input file generation, computations across multiple CPU and GPU nodes, and result retrieval.
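To make the quoted k-point resolution concrete, the following sketch converts a resolution of 2π × 0.04 Å⁻¹ into mesh subdivisions for a given cell. It assumes that each subdivision is the reciprocal-lattice vector length divided by the resolution, rounded up, with a minimum of 1. The exact rounding rule used by VASPKIT is not stated in the text, so this rule and the function name `kmesh` are assumptions.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def kmesh(lattice, resolution=0.04):
    """Subdivisions N_i = ceil(|b_i| / (2*pi*resolution)) for lattice rows a1, a2, a3 (in Angstrom)."""
    a1, a2, a3 = lattice
    c23 = cross(a2, a3)
    volume = abs(sum(x * y for x, y in zip(a1, c23)))
    mesh = []
    for u, v in ((a2, a3), (a3, a1), (a1, a2)):
        c = cross(u, v)
        b_len = 2.0 * math.pi * math.sqrt(sum(x * x for x in c)) / volume
        mesh.append(max(1, math.ceil(b_len / (2.0 * math.pi * resolution))))
    return tuple(mesh)


# Cubic cell with an illustrative lattice constant of 3.84 Angstrom.
a = 3.84
print(kmesh([(a, 0, 0), (0, a, 0), (0, 0, a)]))
```

Under this rule the mesh density scales inversely with the cell length, so supercells receive proportionally coarser meshes at the same resolution.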

The initial configurations for exploring the structure-property dynamics of FCC/BCC/HCP metals were created with Atomsk⁴⁶ using their experimental lattice constants and crystal types. These configurations were then used in on-the-fly machine learning accelerated AIMD simulations performed with VASP, which employed a Bayesian learning algorithm. An isothermal-isobaric (NPT) ensemble, employing a Langevin thermostat based on the Parrinello-Rahman algorithm²⁴,⁴⁷, was utilized to fully melt systems consisting of 108-atom Ir, 128-atom Mo, 108-atom Al, and 128-atom Zr at a pressure of 10 GPa, followed by cooling to a temperature of 300 K, with each stage lasting 200 ps. Configurations obtained exclusively from these AIMD simulations were used for topological analysis and as starting points for structure generation through genetic mutation, ensuring comprehensive sampling of the phase landscape.

MLPs construction and validation

The MPNN, DNN, and DNN-KD models were trained using the DeePMD-kit⁴⁸ v3.0.1 package, which employs a plugin mechanism to integrate diverse models, enabling fair comparisons under unified datasets and training parameters. For DNN and DNN-KD, the classic ‘se_e2_a’ descriptor type was utilized, with model compression and fine-tuning applied after each training cycle. The descriptor was constructed using a neural network with layers containing 32, 64, and 128 neurons, respectively. The fitting network consisted of six layers with 240, 240, 240, 240, and 240 neurons, ensuring robust learning of complex interatomic interactions. In contrast, the MPNN model adopted the ‘mace’ descriptor type within the DeePMD-kit v3 package, incorporating 128 equivariant messages. All models employed a cutoff radius of 8.00 Å to capture interatomic interactions and used an exponentially decaying learning rate, starting at 0.001 and decreasing to 3.51 × 10⁻⁸, with distinct loss function weights assigned for optimizing energy, force, and virial terms.
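The learning-rate schedule described above can be illustrated with a continuous exponential decay between the two quoted end points. This is a sketch only. The total number of training steps is not given in the text, so `total_steps` below is a placeholder. DeePMD-kit applies the decay in discrete stages, which this smooth form only approximates.

```python
def exp_decay_lr(step, total_steps, start_lr=1.0e-3, stop_lr=3.51e-8):
    """Learning rate that decays geometrically from start_lr to stop_lr."""
    return start_lr * (stop_lr / start_lr) ** (step / total_steps)


TOTAL_STEPS = 1_000_000  # hypothetical value, not taken from the paper
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, exp_decay_lr(step, TOTAL_STEPS))
```

With these end points the rate falls by a factor of about 2.8 × 10⁴ over training, and at the halfway point it equals the geometric mean of the start and stop values.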

All validations of the machine learning potentials were performed using the open-source code Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS⁴⁹) 29 Aug 2024 version, with standardized workflows for specific property evaluations implemented through auxiliary packages employing LAMMPS as the computational solver. Elastic constants were derived using the auto-test module of the DP-GEN²⁶ package, while Bain path, γ surface, generalized planar fault energy (GPFE), and generalized stacking fault energy (GSFE) curves were calculated via the Lava Wrapper package⁵⁰.

MD setup

All MD simulations were conducted using the open-source code LAMMPS, integrated with our trained DNN-KD model. For the micro-pillar compression MD simulation, the simulation volume was constructed with a lateral-to-height aspect ratio of 1:1:1.6 (x:y:z) and oriented along the principal crystallographic directions [100], [010], and [001] of the FCC lattice, as shown in Supplementary Fig. 17. The simulation temperature was set to 300 K. After energy minimization and NPT relaxation for 100 ps, uniaxial compression along the z-axis was performed with a strain rate of 5.0 × 10⁷ s⁻¹. Fixed layers with a thickness of 6 Å were applied at the top and bottom along the z-axis to inhibit the periodic propagation of dislocations, thereby preventing artificial interactions across periodic boundaries and ensuring the accuracy of the plastic deformation simulation. The Open Visualisation Tool (OVITO)⁵¹ was used to visualise the plastic deformation process in the MD simulations, specifically to identify FCC planar faults.
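As a rough guide to the cost implied by this strain rate, the following arithmetic sketch converts a target engineering strain into simulated time and number of MD steps. The timestep is not stated in the text, so the 1 fs value, the 10% target strain, and the function name are all illustrative assumptions.

```python
def compression_schedule(target_strain, strain_rate=5.0e7, timestep_fs=1.0):
    """Simulated time and step count needed to reach target_strain at a constant rate."""
    dt = timestep_fs * 1.0e-15                  # timestep in seconds
    total_time = target_strain / strain_rate    # seconds
    return {
        "time_ns": total_time * 1.0e9,
        "n_steps": round(total_time / dt),
        "strain_per_step": strain_rate * dt,
    }


# Hypothetical example: 10 % compression at 5.0e7 1/s with a 1 fs timestep.
print(compression_schedule(0.10))
```

Under these assumptions, 10% strain corresponds to 2 ns of simulated time, or two million steps, which indicates why a fast distilled potential is needed for such runs.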

Paracrystalline diamond structure generation

The initial configuration was a 2 × 2 × 2 supercell of C₆₀ (mp-1196583) from the Materials Project, comprising 1920 atoms. Conventional MD simulations were performed using LAMMPS⁴⁹ with the Tersoff potential⁵². After energy minimization, the system was heated from 300 K to 5000 K over 1 ns at 50 GPa under the NPT ensemble, held at 5000 K for 1 ns, and then cooled to 300 K with pressure reduced to 30 GPa over 1 ns. Structures from this MD trajectory were selected as the initial configuration library for polymorph generation. In PolymorphGen, the cutoff for the symmetry parameter (Q₆) was set to 2.4 Å, and the program automatically computed the initial entropy-symmetry landscape. The landscape range for genetically mutated configurations was expanded by 10% from the current values. For paracrystalline diamond, each target point was evolved over 30 generations with up to 50 individuals per generation.

Graphite-to-diamond transition exploration

Initial configurations were taken from ref. ³³, with a Q₆ cutoff of 3.6 Å. The landscape range for mutation was set equal to the current entropy-symmetry values. Ten evenly spaced target points were automatically selected along each edge of the landscape. Each target was evolved over 30 generations with up to 50 individuals per generation. The procedure generated 117,799 configurations. After multi-resolution classification with screening thresholds of 0.1 and 0.005 on the entropy-symmetry landscape, 6724 configurations were retained in the first round, and their energy distribution was computed using the Auto-DFT platform.
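The selection of target points along the landscape edges can be sketched as follows. This is one plausible reading of the procedure, not the PolymorphGen implementation. It takes the bounding box of the current (s_S, Q₆) values, optionally widens each axis range by a fraction `expand` split equally between both ends, and places `n_per_edge` evenly spaced points on each of the four edges, with shared corners counted once. The function name and the way the expansion is applied are assumptions.

```python
def edge_targets(points, expand=0.0, n_per_edge=10):
    """Evenly spaced target points on the edges of the (s_S, Q6) bounding box."""
    def axis(values):
        lo, hi = min(values), max(values)
        pad = 0.5 * expand * (hi - lo)          # widen the range symmetrically
        lo, hi = lo - pad, hi + pad
        return [lo + (hi - lo) * k / (n_per_edge - 1) for k in range(n_per_edge)]

    s_axis = axis([p[0] for p in points])
    q_axis = axis([p[1] for p in points])
    targets = set()
    for s in s_axis:                            # bottom and top edges
        targets.add((s, q_axis[0]))
        targets.add((s, q_axis[-1]))
    for q in q_axis:                            # left and right edges
        targets.add((s_axis[0], q))
        targets.add((s_axis[-1], q))
    return sorted(targets)


# Toy landscape coordinates (s_S, Q6); the values are illustrative only.
landscape = [(-6.0, 0.10), (-2.0, 0.45), (-4.0, 0.30)]
print(len(edge_targets(landscape)), "targets, range unchanged")
print(len(edge_targets(landscape, expand=0.10)), "targets, range widened by 10 %")
```

With ten points per edge and shared corners, each call yields 36 distinct targets. Setting `expand=0.0` corresponds to a mutation range equal to the current landscape values, as described for the graphite-to-diamond exploration.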

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Source data are provided with this paper.

Code availability

The code for PolymorphGen‑MLPKD and the Auto‑DFT scheduling platform, along with example files, trained machine learning potentials, and key molecular dynamics trajectories for the comparison of metal plasticity mechanisms, are available at https://github.com/LZYUCL/PolymorphGen‑MLPKD and in the Zenodo repository⁵³.

References

1. Zhou, Z. Y. et al. Manipulation of the altermagnetic order in CrSb via crystal symmetry. Nature 638, 645–650 (2025).
2. Han, G. P. et al. Superionic lithium transport via multiple coordination environments defined by two-anion packing. Science 383, 739–745 (2024).
3. Dan, C. Y. et al. Achieving ultrahigh fatigue resistance in AlSi₁₀Mg alloy by additive manufacturing. Nat. Mater. 22, 1182–1188 (2023).
4. Louie, S. G., Chan, Y.-H., Jornada, F. H. D., Li, Z. & Qiu, D. Y. Discovering and understanding materials through computation. Nat. Mater. 20, 728–735 (2021).
5. Liu, Y., Madanchi, A., Anker, A. S., Simine, L. & Deringer, V. L. The amorphous state as a frontier in computational materials design. Nat. Rev. Mater. 10, 228–241 (2025).
6. Zhang, C. W. et al. Advancing nonadiabatic molecular dynamics simulations in solids with E(3) equivariant deep neural hamiltonians. Nat. Commun. 16, 2033 (2025).
7. Cao, G. H. et al. Liquid metal for high-entropy alloy nanoparticles synthesis. Nature 619, 73–77 (2023).
8. Zeni, C. et al. A generative model for inorganic materials design. Nature 639, 624–634 (2025).
9. Meng, J. et al. Computational discovery of fast interstitial oxygen conductors. Nat. Mater. 23, 1252–1258 (2024).
10. Ben Mahmoud, C., Gardner, J. L. A. & Deringer, V. L. Data as the next challenge in atomistic machine learning. Nat. Comput. Sci. 4, 384–387 (2024).
11. Cui, T. Y. et al. Online test-time adaptation for better generalization of interatomic potentials to out-of-distribution data. Nat. Commun. 16, 1891 (2025).
12. Park, H., Li, Z. & Walsh, A. Has generative artificial intelligence solved inverse materials design? Matter 7, 2355–2367 (2024).
13. Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
14. Zheng, S. S. et al. Active phase discovery in heterogeneous catalysis via topology-guided sampling and machine learning. Nat. Commun. 16, 2542 (2025).
15. Han, Y. et al. Efficient crystal structure prediction based on the symmetry principle. Nat. Comput. Sci. 5, 255–267 (2025).
16. Li, C. N. et al. LoreX: a low-energy region explorer boosts efficient crystal structure prediction. J. Am. Chem. Soc. 147, 9544–9555 (2025).
17. Wang, Y. Boosting crystal structure prediction via symmetry. Nat. Comput. Sci. 5, 192–193 (2025).
18. Ko, T. W. & Ong, S. P. Data-efficient construction of high-fidelity graph deep learning interatomic potentials. npj Comput. Mater. 11, 65 (2025).
19. Banjade, H. R. et al. Structure motif-centric learning framework for inorganic crystalline systems. Sci. Adv. 7, eabf1754 (2021).
20. Wan, X. H. et al. Machine-learning-assisted discovery of highly efficient high-entropy alloy catalysts for the oxygen reduction reaction. Patterns 3, 100553 (2022).
21. Nettleton, R. E. & Green, M. S. Expression in terms of molecular distribution functions for the entropy density in an infinite system. J. Chem. Phys. 29, 1365–1370 (1958).
22. Piaggi, P. M., Valsson, O. & Parrinello, M. Enhancing entropy and enthalpy fluctuations to drive crystallization in atomistic simulations. Phys. Rev. Lett. 119, 015701 (2017).
23. Lechner, W. & Dellago, C. Accurate determination of crystal structures based on averaged local bond order parameters. J. Chem. Phys. 129, 114707 (2008).
24. Parrinello, M. & Rahman, A. Polymorphic transitions in single crystals: a new molecular dynamics method. J. Appl. Phys. 52, 7182–7190 (1981).
25. Camino, B. et al. Exploring the thermodynamics of disordered materials with quantum computing. Sci. Adv. 11, eadt7156 (2025).
26. Zhang, Y. Z. et al. DP-GEN: A concurrent learning platform for the generation of reliable deep learning based potential energy models. Comput. Phys. Commun. 253, 107206 (2020).
27. Deng, J., Niu, H., Hu, J., Chen, M. & Stixrude, L. Melting of MgSiO₃ determined by machine learning potentials. Phys. Rev. B 107, 064103 (2023).
28. Liu, M. et al. Layer-by-layer phase transformation in Ti₃O₅ revealed by machine-learning molecular dynamics simulations. Nat. Commun. 15, 3079 (2024).
29. Niu, H., Yang, Y. I. & Parrinello, M. Temperature dependence of homogeneous nucleation in ice. Phys. Rev. Lett. 122, 245501 (2019).
30. Hajibabaei, A., Baldwin, W. J., Csányi, G. & Cox, S. J. Symmetry breaking in the superionic phase of silver iodide. Phys. Rev. Lett. 134, 026306 (2025).
31. Acharya, D., Abou El Kheir, O., Perego, S., Campi, D. & Bernasconi, M. Atomistic simulations of the crystallization of amorphous GeTe Nanoparticles. J. Phys. Chem. C 128, 19380–19391 (2024).
32. Tang, H. et al. Synthesis of paracrystalline diamond. Nature 599, 605–610 (2021).
33. Xie, Y. P., Zhang, X. J. & Liu, Z. P. Graphite to diamond: origin for kinetics selectivity. J. Am. Chem. Soc. 139, 2545–2548 (2017).
34. Luo, K. et al. Coherent interfaces govern direct transformation from graphite to diamond. Nature 607, 486–491 (2022).
35. Cawkwell, M. J., Nguyen-Manh, D., Woodward, C., Pettifor, D. G. & Vitek, V. Origin of brittle cleavage in iridium. Science 309, 1059–1062 (2005).
36. Batatia, I., Kovacs, D. P., Simm, G., Ortner, C. & Csányi, G. MACE: higher order equivariant message passing neural networks for fast and accurate force fields. Adv. Neural Inf. Process. Syst. 35, 11423–11436 (2022).
37. Finkbeiner, J., Tovey, S. & Holm, C. Generating minimal training sets for machine learned potentials. Phys. Rev. Lett. 132, 167301 (2024).
38. Vamsi, K. V., Charpagne, M. A. & Pollock, T. M. High-throughput approach for estimation of intrinsic barriers in FCC structures for alloy design. Scr. Mater. 204, 114126 (2021).
39. Adamesku, R., Grebenkin, S., Yermakov, A. & Panfilov, P. On mechanical twinning in iridium under compression at room temperature. J. Mater. Sci. Lett. 13, 865–867 (1994).
40. Cao, F. H., Wang, Y. J. & Dai, L. H. Novel atomic-scale mechanism of incipient plasticity in a chemically complex CrCoNi medium-entropy alloy associated with inhomogeneity in local chemical environment. Acta Mater. 194, 283–294 (2020).
41. Wang, L. H. et al. New twinning route in face-centered cubic nanocrystalline metals. Nat. Commun. 8, 2142 (2017).
42. Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996).
43. Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994).
44. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
45. Wang, V., Xu, N., Liu, J. C., Tang, G. & Geng, W. T. VASPKIT: a user-friendly interface facilitating high-throughput computing and analysis using VASP code. Comput. Phys. Commun. 267, 108033 (2021).
46. Hirel, P. Atomsk: A tool for manipulating and converting atomic data files. Comput. Phys. Commun. 197, 212–219 (2015).
47. Parrinello, M. & Rahman, A. Crystal structure and pair potentials: a molecular-dynamics study. Phys. Rev. Lett. 45, 1196–1199 (1980).
48. Zeng, J. et al. DeePMD-kit v3: a multiple-backend framework for machine learning potentials. J. Chem. Theory Comput. 21, 4375–4385 (2025).
49. Thompson, A. P. et al. LAMMPS-a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales. Comput. Phys. Commun. 271, 108171 (2022).
50. Dang, K., Chen, J., Rodgers, B. & Fensin, S. LAVA 1.0: a general-purpose python toolkit for calculation of material properties with LAMMPS and VASP. Comput. Phys. Commun. 286, 108667 (2023).
51. Stukowski, A. Visualization and analysis of atomistic simulation data with OVITO-the open visualization tool. Modell. Simul. Mater. Sci. Eng. 18, 015012 (2010).
52. Devanathan, R., Rubia, T. & Weber, W. J. Displacement threshold energies in β-SiC. J. Nucl. Mater. 253, 47–52 (1998).
53. Li, Z. Y. Navigating polymorph generation and distilled-potential development via entropy-symmetry landscapes for metal plasticity mechanisms, PolymorphGen-MLPKD, https://doi.org/10.5281/zenodo.19734703 (2026).
54. Ma, P. W. & Dudarev, S. L. Nonuniversal structure of point defects in face-centered cubic metals. Phys. Rev. Mater. 5, 013601 (2021).
55. Mattsson, T. R. & Mattsson, A. E. Calculating the vacancy formation energy in metals: Pt, Pd, and Mo. Phys. Rev. B 66, 214110 (2002).
56. Varvenne, C., Mackain, O. & Clouet, E. Vacancy clustering in zirconium: an atomic-scale study. Acta Mater. 78, 65–77 (2014).
57. Kamm, G. N. & Alers, G. A. Low-temperature elastic moduli of aluminum. J. Appl. Phys. 35, 327–330 (1964).
58. Zhou, Y. X. et al. Probing the mechanical properties of ordered and disordered Pt-Ir alloys by first-principles calculations. Phys. Lett. A 405, 127424 (2021).
59. Akella, J. High-pressure studies on iridium to 30.0 GPa. J. Phys. Chem. Solids 43, 941 (1982).
60. Hecker, S. S., Rohr, D. L. & Stein, D. F. Brittle fracture in iridium. Metall. Trans. A 9, 481–488 (1978).
61. Meyers, M. A. & Chawla, K. K. Mechanical behavior of materials (Cambridge University Press, Cambridge, England, 2008).
62. Simmons, G. & Wang, H. F. Single crystal elastic constants and calculated aggregate properties: a handbook (MIT Press, Cambridge, MA, 1971).
63. Fisher, E. S. & Renken, C. J. Single-crystal elastic moduli and the hcp → bcc transformation in Ti, Zr, and Hf. Phys. Rev. 135, A482–A494 (1964).

Acknowledgements

The research was supported by the Natural Science Foundation of China (Grants U2241244, T2525025, and 62361166628), the National Key R&D Program of China (No. 2022YFB3207100), and the Hubei Provincial Strategic Scientist Training Plan (No. 2022EJD009). We also thank the Supercomputing Center of WHU for their support of the calculations.

Funding

This work was funded by the Natural Science Foundation of China (Grants U2241244, T2525025, and 62361166628 to Y.Z.G.), the National Key R&D Program of China (No. 2022YFB3207100 to S.L.), and the Hubei Provincial Strategic Scientist Training Plan (No. 2022EJD009 to S.L.).

Author information

These authors contributed equally: Zeyuan Li, Taiqiao Liu.

Authors and Affiliations

School of Power and Mechanical Engineering, Wuhan University, Wuhan, China
Zeyuan Li (李泽源), Wei Yu (余伟), Sheng Liu (刘胜) & Yuzheng Guo (郭宇铮)
School of Integrated Circuits, Wuhan University, Wuhan, China
Taiqiao Liu (刘太巧), Songpeng Zhao (赵松朋), Zhaofu Zhang (张召富) & E. Zhou (周娥)
Department of Physics and Astronomy, Rutgers University, Piscataway, USA
Xuhao Wan (万旭昊)
School of Physics and Technology, Wuhan University, Wuhan, China
Yijing Zuo (左怡婧)
Department of Engineering, University of Cambridge, Cambridge, UK
John Robertson

Authors

Zeyuan Li (李泽源)
Taiqiao Liu (刘太巧)
Xuhao Wan (万旭昊)
Songpeng Zhao (赵松朋)
Zhaofu Zhang (张召富)
E. Zhou (周娥)
Wei Yu (余伟)
Yijing Zuo (左怡婧)
John Robertson
Sheng Liu (刘胜)
Yuzheng Guo (郭宇铮)

Contributions

Y.Z.G., S.L. and Z.Y.L. conceived the research concept. Y.Z.G., S.L. and J.R. supervised the research. Z.Y.L. designed the main workflow, developed the code, and performed all computations. T.Q.L. analyzed the graphite-to-diamond transition pathways and prepared all figures and tables. S.P.Z. developed the genetic mutation program and assisted in refining the overall code. E.Z. conducted the phonon spectrum calculations. Z.F.Z., X.H.W. and W.Y. contributed to the DFT calculations. Y.J.Z. collected and organized relevant data. Y.Z.G., S.L., Z.Y.L. and T.Q.L. co-wrote the manuscript. All the authors contributed to data analysis and scientific discussion.

Corresponding authors

Correspondence to Sheng Liu (刘胜) or Yuzheng Guo (郭宇铮).

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary File

Supplementary Movie 1

Supplementary Movie 2

Supplementary Movie 3

Reporting Summary

Transparent Peer Review file

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Li, Z., Liu, T., Wan, X. et al. Navigating polymorph generation and distilled-potential development via entropy-symmetry landscapes for metal plasticity mechanisms. Nat Commun 17, 5070 (2026). https://doi.org/10.1038/s41467-026-73188-9

Received: 29 September 2025
Accepted: 05 May 2026
Published: 08 June 2026
Version of record: 08 June 2026
DOI: https://doi.org/10.1038/s41467-026-73188-9