Optimization of Coulomb energies in gigantic configurational spaces of multi-element ionic crystals

Köster, Konstantin; Binninger, Tobias; Kaghazchi, Payam

doi:10.1038/s41524-025-01690-7

Article
Open access
Published: 01 July 2025

Konstantin Köster^1,2,
Tobias Binninger³ &
Payam Kaghazchi^1,2

npj Computational Materials volume 11, Article number: 202 (2025)

Abstract

Most novel energy materials contain multiple elements occupying a single site in their lattice. The exceedingly large configurational space of these materials makes it challenging to determine low(est)-energy structures. Coulomb energies of possible configurations generally show a satisfactory correlation with energies computed at higher levels of theory and thus allow screening for minimum-energy structures. Employing an expansion into a binary optimization problem, we obtain an efficient Coulomb energy optimizer using Monte Carlo and Genetic Algorithms. The presented optimization package, GOAC (Global Optimization of Atomistic Configurations by Coulomb), can achieve a speed-up of several orders of magnitude compared to existing software. In this work, heuristic optimization on various material classes is performed. Thus, GOAC provides an efficient method for constructing low-energy atomistic models for ionic multi-element materials with gigantic configurational spaces.

Introduction

Many state-of-the-art solid-state high-performance materials are composed of several different types of elements sharing the same lattice sites. Application areas include, but are not limited to, energy conversion and storage systems^{1,2,3,4,5,6,7,8,9,10,11} as well as other special-purpose applications^{12,13,14,15,16}. In some of the most interesting materials for these applications (e.g., layered oxides, ionic conductors), numerous element types with various concentration ratios are combined in a single-crystal phase. While such compositions can be represented with the help of partial site occupations, the configurational complexity becomes a severe challenge for simulation methods that require structural models with integer site occupations, such as the commonly used density functional theory (DFT)^{17,18}. The problem of determining reasonable atomistic configurations out of all possible configurations therefore constitutes a serious challenge for modelling and simulation^{19,20,21,22,23,24}. To represent complex compositions with integer occupations, the so-called supercell approach is frequently employed, where multiple periodic images of the unit cell are treated explicitly. For computational studies it is often of interest to determine low(est)-energy atomistic configurations, which can be a hard combinatorial problem for large supercells. For complex compositions it is generally infeasible to evaluate all possible configurations (even when accounting for symmetry), especially when using high-level methods such as DFT. Therefore, special techniques such as the Coherent Potential Approximation (CPA)²⁵, Special Quasirandom Structure (SQS)²⁶, Cluster Expansion (CE)^{27,28}, Virtual Crystal Approximation (VCA)²⁹, or Small Set of Ordered Structures (SSOS)³⁰ have been developed that approximate the energy and/or are able to find special atomistic configurations that have relevant properties for further investigations. Approximations such as CE, where many-particle interaction terms up to a certain order are taken into account, can reduce the computational demand drastically³¹. Other approaches aim to reproduce highly accurate energies at low computational cost with machine-learned potentials and/or to reduce the number of configurations that must be evaluated with other machine learning approaches, e.g., active learning^{32,33,34,35,36,37}.

Naturally, the number of possible configurations grows if the supercell contains more sites, more positions per site, and more elements that can occupy a site, especially when the elements are mixed in equimolar amounts. All of these factors generally apply to novel energy materials and yield a combinatorial explosion of the total number of possible configurations. For highly symmetric cells, this number can be reduced by several orders of magnitude if symmetry operations are taken into account and only symmetrically irreducible configurations are considered³⁸. There are several software packages and methods, such as the site-occupancy disorder (SOD) code³⁹, ENUMLIB⁴⁰ (also accessible through PYMATGEN⁴¹), the solid-solution tools^{42,43} in the commercial CRYSTAL code⁴⁴, the so-called SUPERCELL software⁴⁵, the DISORDER code⁴⁶ and its recently published tree search algorithm³⁸, and the SHRY package⁴⁷, that all focus explicitly on determining symmetry-inequivalent structures. The number of available software packages and the considerable computational effort spent highlight the importance of the atomistic combinatorial problem in computational materials research.

For ionic crystals, the Coulomb energy with ionic point charges represents a simple energy model that allows numerous atomistic configurations to be evaluated with limited computational resources. In practice, the model requires the assignment of the ion valencies, and the electrostatic energy is calculated by Ewald summation⁴⁸ to obtain the exact Coulomb energy of the periodic lattice⁴⁹. This makes it possible to consider many atomistic configurations explicitly and, in some cases, even to enumerate all possible configurations for practical simulation supercells. This full enumeration approach, sometimes also referred to as the brute-force method or exhaustive sampling, is implemented with Coulomb energy evaluation in the so-called SUPERCELL software⁴⁵. More recently, the EWALDSOLIDSOLUTION software⁵⁰ was released, offering the brute-force approach with an option for sparser sampling of the density of states based on Coulomb energy evaluation. In addition, EWALDSOLIDSOLUTION also features a post-processing gradient-descent-like algorithm for optimizing atomistic configurations. However, treating complex combinatorial problems as they appear in modern energy materials by brute force is computationally very demanding, even for simple Coulomb energy evaluation. Therefore, classical optimization approaches and the use of heuristics are commonly required.

Atomistic combinatorial sampling can be considered a general optimization problem to which commonly used metaheuristics can be applied. Do Lee et al.⁵¹ applied some well-known heuristics, including genetic algorithms, particle swarm optimization, harmony search, cuckoo search, Bayesian optimization, and deep Q-networks, to configurational optimization in argyrodite utilizing Coulomb energies. Out of the vast number of metaheuristics, the Genetic Algorithm (GA)⁵² deserves special mention, as it is known to be effective for the atomistic combinatorial problem^{51,53} as well as for global optimization of complex chemical structures in general⁵⁴. Besides these classical approaches, more physically motivated approaches such as Monte Carlo (MC)⁵⁵ simulations were also shown to be efficient in approaching the atomistic configuration problem^{32,56,57,58}, with the respective Monte Carlo methods implemented, e.g., for the determination of SQS in the MCSQS code⁵⁹ as part of the ALLOY THEORETIC AUTOMATED TOOLKIT (ATAT)⁶⁰ or for general cluster expansions within the recently released STATISTICAL MECHANICS ON LATTICES package⁶¹. Binninger et al.⁵⁶ recently also demonstrated that the configuration problem can be solved on existing quantum-computing hardware by formulating it as a binary optimization problem that can be mapped onto a quantum annealer.

The aforementioned software and approaches for determining lowest-energy atomistic configurations are either effectively or explicitly limited in the size of the configurational space^{39,41,42,43,45,51,56,61} or do not specifically aim to determine the low(est) Coulomb energy structures by optimization^{38,40,46,50,59}. As modern high-performance materials introduce more and more species, approaches are required that can reliably and quickly optimize even large combinatorial problems comprising ten to the power of several hundred configurations. For that purpose, either heuristics or general-purpose optimization software can be used, where the latter offers the opportunity for exact global optimization within limited computational resources. Even though some works have already applied heuristic optimization methods to the configuration problem, as discussed before, there is, to the best of our knowledge, no published tool yet that allows for optimization of such complex problems. Efficient energy evaluation methods, even faster than the commonly applied Ewald summation, along with specifically tailored heuristics must be employed to achieve optimization in difficult atomistic combinatorial problems within reasonable computation time. Creating optimized atomistic configurations for complex problems in a high-throughput manner allows for efficient structure pre-selection for computational studies, such as DFT calculations, of novel materials and thereby offers the opportunity to enhance computational materials discovery in several important research fields.

In this work, we therefore approach the atomistic combinatorial problem in novel energy materials as an optimization problem utilizing a basic but reformulated Coulomb energy model. We present a Python-based code, termed GOAC (Global Optimization of Atomistic Configurations by Coulomb), that makes it possible to interface any configuration problem of ions with distinct valencies, given as a crystallographic information file (CIF)⁶², to existing (free or commercial) optimization software. CIFs are read with the help of the PYMATGEN⁴¹ package. Moreover, we introduce several Fortran-based routines that can be called from the Python code to apply various heuristics to the configurational optimization problem, including GA and MC. To provide a highly efficient implementation, the Coulomb energy is expressed as a binary optimization problem and the optimization heuristics are parallelized using OpenMP⁶³. The methodological details of the implementations and the capabilities of the GOAC code are discussed in the next section, followed by a discussion of the results and benchmarking against alternative methods.

Results

Implementation and theoretical background

We assume a supercell comprising S sites with partial occupations, each site s having P_s positions within the cell. Site s is occupied by N_{s,e} ions of element e, and in total E_s elements can occupy it. The total number of possible configurations C in the supercell, without considering any symmetries, is then given by:

$$C=\mathop{\prod }\limits_{s=1}^{S}\frac{{P}_{s}!}{\mathop{\prod }\nolimits_{e = 1}^{{E}_{s}}{N}_{s,e}!}.$$

(1)
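Eq. (1) can be evaluated exactly with integer arithmetic. The following is a minimal sketch, not part of GOAC; the function name `count_configurations` and the example occupations are hypothetical, and vacancies are assumed to be counted as a species so that the occupation numbers of a site sum to P_s.

```python
from math import factorial, log10

def count_configurations(sites):
    """Evaluate Eq. (1): product over sites of P_s! / prod_e N_{s,e}!.

    `sites` is a list of lists; each inner list holds the occupation
    numbers N_{s,e} of one site. Vacancies are counted as a species
    here (an assumption of this sketch), so each list sums to P_s.
    """
    total = 1
    for occupations in sites:
        positions = sum(occupations)      # P_s
        ways = factorial(positions)
        for n in occupations:
            ways //= factorial(n)         # stays an exact integer
        total *= ways
    return total

# Hypothetical supercell: 12 transition-metal positions (4 Li, 8 Mn)
# and 12 alkali positions (6 Na, 6 vacancies).
example = [[4, 8], [6, 6]]
n_config = count_configurations(example)
print(n_config, f"(about 10^{log10(n_config):.1f})")
```

Already for this small hypothetical cell the count is 495 × 924 = 457380; doubling the number of positions per site raises it by several orders of magnitude, which illustrates the combinatorial explosion described above.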

For a given problem, the Global Optimization of Atomistic Configurations by Coulomb (GOAC) code aims to determine low(est) energy atomistic configuration(s) out of all possible configurations by employing various optimization techniques. To this end, GOAC offers a command line interface to provide a CIF with partial occupations and assumed charge states (valencies) for the different ions. The general workflow of GOAC is sketched in Fig. 1.

In a first step, GOAC calculates the required pairwise Ewald energy matrix elements, which is discussed in the next section. Then, a binary optimization problem is constructed by expansion into site-specific terms, which can either be interfaced to external optimizers, e.g., the GUROBI solver⁶⁴, or solved by internal Fortran heuristics. Both approaches are discussed in the following sections. Finally, the n lowest-energy atomistic configurations are written out as CIFs along with the respective Coulomb energies. It should be noted that, in its current implementation, GOAC is not able to identify symmetry-equivalent structures and all optimizers run on the full configurational space. However, filtering by energy is possible to only include structures that differ in energy, which can be useful for many problems but might exclude symmetry-inequivalent structures in some problems.

Pre-calculation of Coulomb energy terms

As optimization methods generally require evaluating the energy of many different atomistic configurations, GOAC implements an ionic Coulomb energy model because of its low computational demand. Naturally, such simple point-charge models cannot account for quantum mechanical effects and there is no guarantee that the ordering of different ionic configurations by Coulomb energy agrees with the one obtained by more accurate calculations, e.g., based on DFT. However, several studies showed that structures with a low Coulomb energy are often also good candidates for low DFT energies^{50,51,56,58,65}. As an example, a satisfactory correlation between DFT and Coulomb energies of randomly selected configurations is shown in Fig. 2 for ionic configurations in the layered oxide Na[Li_0.33Mn_0.67]O₂ (assumed ionic charges: Na: +1; Li: +1; Mn: +4; O: −2) that was synthesized by Wang et al.⁶⁶. The relative energies show a strong correlation between the DFT and Coulomb models, and the linear fit closely matches the diagonal representing perfect correlation. A commonly employed approach therefore consists of pre-selecting a certain number of low Coulomb energy structures for more accurate DFT calculations to eventually determine low DFT energy configurations^{65,67,68,69,70}.

**Fig. 2: Correlation between relative DFT (details are described in the Method section) and Coulomb energies of different ionic configurations for Na[Li_0.33Mn_0.67]O₂.**

Following this approach, GOAC utilizes point-charge Coulomb energies and expands them into a binary optimization model with site coefficients up to the second order. We note that for the specific case of the point-charge Coulomb energy this expansion is exact due to the pairwise character of Coulomb point-charge interactions. This allows for an efficient evaluation of different atomistic configurations during optimization as the energy can be expressed as a sum of pre-calculated coefficients. In periodic systems, Coulomb energies are, however, difficult to converge and the Ewald summation technique is required for the energy calculation.

The procedure of expressing the atomistic combinatorial problem as a binary optimization problem is sketched in Fig. 3. The total energy (E_tot) of a given atomistic configuration can be expressed as a sum of the energy of the fixed ions (zero-order term, E_const), the interaction of each placed iterative ion with the fixed ions as well as its self-interaction due to periodic boundary conditions (first-order term, α), and all particle-particle interactions between all placed iterative sites (second-order term, β). All interactions in the resulting binary optimization model can be pre-calculated for efficient energy evaluation during optimization. In order to do this, the elements of the pairwise interaction matrix of the Ewald energy x^total can be calculated by⁷¹:

$$x_{ij}^{\text{real}} = q_i q_j \sum_{\mathbf{L}} \frac{\operatorname{erfc}(\eta \cdot d_{ij})}{d_{ij}}$$

(2)

$$x_{ij}^{\text{recip}} = \frac{q_i q_j}{\pi V} \sum_{\mathbf{k}} \frac{\exp\left(-\frac{|\mathbf{k}|^{2}}{4\eta^{2}}\right)}{|\mathbf{k}|^{2}} \cdot \cos\left(\mathbf{k} \cdot \left(\mathbf{r}_i - \mathbf{r}_j\right)\right)$$

(3)

$$x_{ii}^{\text{self}} = -\frac{q_i^{2}\,\eta}{\sqrt{\pi}}$$

(4)

$$x_{ij}^{\text{total}} = x_{ij}^{\text{real}} + x_{ij}^{\text{recip}} + x_{ij}^{\text{self}}$$

(5)

**Fig. 3: The energy calculation approach.**

In these equations, i and j are the indices of two sites, $\mathbf{r}$ is their position, q their charge, and d_ij the Euclidean distance between them. The cell volume is denoted as V, the sum over L runs over all real-space lattice vectors and the sum over k over all (non-zero) reciprocal-space lattice vectors within the respective cut-off radii, and η is the screening parameter. While the theory and implementation of Ewald summation are already extensively discussed in the literature, for example by Faber et al.⁷¹, we want to highlight that for the energy calculation of configurational optimization problems, the real-space and reciprocal-space terms can each be split into a charge-dependent (q-dependent) and a position-dependent (r-dependent or d-dependent) factor. The computationally demanding parts are the position-dependent expressions, as the sums over all real-space (L) and reciprocal-space (k) lattice vectors have to be evaluated. As the pre-calculation of all pairwise interactions of a configurational optimization problem requires evaluating multiple charges on fixed positions, the position-dependent terms of the real- and reciprocal-space parts only have to be computed once for each site pair. This can result in an additional speed-up compared to standard Ewald summations of different configurations, as not only is every pairwise interaction considered just once, but the computationally demanding summations over lattice vectors are also performed only once for each pair of different positions. GOAC's implementation for calculating the pairwise Ewald energy matrix for configurational optimization problems utilizes this shortcut and in addition parallelizes the calculation of the matrix elements. From the Ewald summation matrix it is straightforward to construct the binary optimization problem by summing up the matrix elements that correspond to the black arrows in Fig. 3 to obtain the values for E_const and all expansion coefficients α and β. We note that this expansion can be considered a special case of a general second-order cluster expansion without the requirement for any distance cut-offs, as periodic pairwise interactions are considered exactly by Ewald summation. Thus, truncating the expansion at the second interaction order yields the exact Coulomb energy of a configuration.
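The split into a position-dependent and a charge-dependent factor can be illustrated for the real-space term of Eq. (2) alone. The sketch below is not GOAC's implementation: the function names, the crude image cut-off `n_images`, the cubic test cell, and the value of η are all assumptions made for illustration, and the reciprocal-space and self terms are omitted. The expensive lattice sum is computed once and then reused for two different charge assignments.

```python
import itertools
import math
import numpy as np

def real_space_geometry(frac_coords, lattice, eta, n_images=2):
    """Charge-independent part of Eq. (2): G_ij = sum_L erfc(eta * d) / d.

    The lattice sum is truncated to images within +/- n_images cells,
    a simple cut-off chosen only for this sketch.
    """
    lattice = np.asarray(lattice, float)
    cart = np.asarray(frac_coords, float) @ lattice
    shifts = [np.array(s, float) @ lattice
              for s in itertools.product(range(-n_images, n_images + 1), repeat=3)]
    n = len(cart)
    geom = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for shift in shifts:
                d = np.linalg.norm(cart[i] - cart[j] + shift)
                if d > 1e-12:  # skip an ion's interaction with itself at L = 0
                    geom[i, j] += math.erfc(eta * d) / d
    return geom

def real_space_matrix(geom, charges):
    """Charge-dependent step: x_ij^real = q_i * q_j * G_ij."""
    q = np.asarray(charges, float)
    return np.outer(q, q) * geom

# Hypothetical cubic cell (edge 4.0) with two positions.
lattice = np.eye(3) * 4.0
frac = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
geom = real_space_geometry(frac, lattice, eta=0.8)   # expensive part, done once

x_a = real_space_matrix(geom, [+1, -1])   # cheap: one charge assignment
x_b = real_space_matrix(geom, [+4, -2])   # cheap: another charge assignment
```

Each additional charge assignment only costs an outer product, whereas a standard Ewald evaluation per configuration would repeat the lattice sums every time.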

GOAC also allows Gaussian-smeared charges to be considered instead of point charges by applying the following correction to the point-charge energy terms⁷²:

$$x_{ij} = x_{ij}^{\text{Point}} - \frac{1}{2} q_i q_j \sum_{\mathbf{L}} \frac{\operatorname{erfc}\left(\frac{\sigma_i \sigma_j}{\sqrt{\sigma_i^{2} + \sigma_j^{2}}}\, d_{ij}\right)}{d_{ij}}.$$

(6)

In this equation, σ is related to the smearing width $\hat{\sigma}$ of the Gaussian-shaped charge by $\sigma = 1/(\sqrt{2}\hat{\sigma})$. It should be noted that no correction to the self-energy is applied, to ensure convergence towards the point-charge energy for $\hat{\sigma} \to 0$. In practice, this also does not influence the configuration search, as the self-energy cancels out when two different configurations are compared.

For the exemplary problem in Fig. 3 with two sites that are each occupied to 50% by two different species, all possible configurations can be expressed by a binary solution vector x that has one entry for each site and each species. A possible solution would then have a 1 at every position where a species is placed and a 0 everywhere else. In this way, the total energy of a given instance becomes a simple sum of products of pre-calculated first-order (α) and second-order (β) coefficients and the binary solution vector x. To ensure that only second-order terms are counted where both ions are placed, the β-coefficients are multiplied by the two corresponding entries of the binary solution vector. Due to the pairwise character of Coulomb energies, such an expansion into a binary optimization problem gives the correct periodic energy for each configuration from pre-calculated coefficients.

For implementing the binary optimization problem, a slight reformulation of the equation in Fig. 3 is practical, where the solution vector x has two dimensions, one for the site-species (i) and one for the positions this site-species can occupy (j). Consequently, the expansion coefficients α and β become higher in dimensionality as well. By reformulation of the sums it is ensured that each interaction is only counted in one direction and just one half of the diagonal α and β matrices must be stored. Lastly, for a full optimization problem the constraints have to be defined. Beyond the binary constraint for the x variables (Eq. (10)), additional constraints must ensure that the desired total occupancy (O_i) is matched for each site-species i (Eq. (8)) and that a certain position j is not occupied by multiple species i (Eq. (9)). In summary, the optimization problem of atomistic configurations is implemented in GOAC as shown in Eqs. (7–10).

$$\begin{array}{l}\mathop{\min }\limits_{E_{tot}}\quad E_{tot}=E_{const}+\sum\limits_{i=1}^{S}\sum\limits_{j=1}^{P_{i}}\alpha_{i,j}\cdot x_{i,j}+\\ \sum\limits_{i=1}^{S}\sum\limits_{j=1}^{P_{i}}\sum\limits_{l=j+1}^{P_{i}}\beta_{i,j,i,l}\cdot (x_{i,j}\cdot x_{i,l})+\\ \sum\limits_{i=1}^{S}\sum\limits_{j=1}^{P_{i}}\sum\limits_{k=i+1}^{S}\sum\limits_{l=1}^{P_{k}}\beta_{i,j,k,l}\cdot (x_{i,j}\cdot x_{k,l})\end{array}$$

(7)

subject to:

$$\mathop{\sum }\limits_{j=1}^{{P}_{i}}{x}_{i,j}={O}_{i}\quad \,\forall i\in S$$

(8)

$$\mathop{\sum }\limits_{i=1}^{S}{x}_{i,j}\le 1\quad \quad \forall j\in {P}_{i}$$

(9)

$${x}_{i,j}\in \{0;1\}\quad \quad \forall i\in S;\quad \forall j\in {P}_{i}$$

(10)
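The exactness of the second-order expansion can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not GOAC's code: the binary variables are flattened to a single index per (species, position) pair, the pair matrix `geom` is a random symmetric stand-in for the position-dependent Ewald terms, the energy convention E = ½ Σ_ab q_a q_b G_ab is assumed, and all names are hypothetical.

```python
import numpy as np

def build_binary_model(geom, fixed_pos, fixed_q, var_pos, var_q):
    """Return (E_const, alpha, beta) for E = 1/2 * sum_ab q_a q_b G_ab.

    Variable v means "an ion of charge var_q[v] sits on position var_pos[v]".
    """
    fixed_q = np.asarray(fixed_q, float)
    e_const = 0.5 * fixed_q @ geom[np.ix_(fixed_pos, fixed_pos)] @ fixed_q
    n_var = len(var_pos)
    alpha = np.zeros(n_var)
    beta = np.zeros((n_var, n_var))      # strictly upper triangular
    for v in range(n_var):
        pv, qv = var_pos[v], var_q[v]
        # interaction with the fixed ions plus the periodic self-interaction
        alpha[v] = qv * (geom[pv, fixed_pos] @ fixed_q) + 0.5 * qv**2 * geom[pv, pv]
        for w in range(v + 1, n_var):    # count each pair in one direction only
            beta[v, w] = qv * var_q[w] * geom[pv, var_pos[w]]
    return e_const, alpha, beta

def model_energy(x, e_const, alpha, beta):
    """Zero-, first-, and second-order terms for a binary vector x."""
    x = np.asarray(x, float)
    return e_const + alpha @ x + x @ beta @ x

def direct_energy(geom, charges):
    q = np.asarray(charges, float)
    return 0.5 * q @ geom @ q

rng = np.random.default_rng(0)
geom = rng.normal(size=(4, 4))
geom = geom + geom.T                     # symmetric stand-in pair matrix

# Positions 0 and 1 hold fixed anions; positions 2 and 3 are shared by
# species A (charge +1) and species B (charge +3), one ion of each.
fixed_pos, fixed_q = [0, 1], [-2.0, -2.0]
var_pos = [2, 3, 2, 3]
var_q = [1.0, 1.0, 3.0, 3.0]
model = build_binary_model(geom, fixed_pos, fixed_q, var_pos, var_q)

x = [1, 0, 0, 1]                         # A on position 2, B on position 3
assert np.isclose(model_energy(x, *model), direct_energy(geom, [-2, -2, 1, 3]))
```

Both valid configurations of this toy problem reproduce the direct pairwise sum, which reflects the statement above that truncation at second order is exact for point-charge Coulomb energies.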

Even though Coulomb (Ewald summation) calculations are computationally comparatively inexpensive, for high-throughput evaluations of atomistic configurations Eqs. (7–10) represent a significant speed-up compared to a full Ewald summation for each atomistic configuration. Moreover, by storing the expansion coefficients (α and β), the pre-calculated energy terms conveniently allow multiple optimization approaches to be tested without repeating the energy calculations every time.

Optimization strategies for atomistic configurations

Two main categories of optimizers, namely exact and heuristic optimizers, can be distinguished. A successful run of an exact optimizer guarantees that the global optimum is found or, if specified, not just the global optimum but the n lowest-energy structures, where n can be freely chosen by the user. The heuristic optimizers are guaranteed to output a valid, low-energy structure that might be the global minimum, just a local minimum, or no minimum at all, depending on the optimizer. The focus of heuristics is to create valid, high-quality solutions fast, while exact optimizers spend significant effort on proving optimality without improving the actual minimum solution. Depending on the needs of the user, both approaches can be valuable and are accessible via the GOAC code as described in the next sections.

Interfacing to external exact optimizers

Generally speaking, Eqs. (7–10) describe a so-called mixed integer non-linear programming (MINLP) problem with the special circumstance that all variables are not just integer but binary variables which technically allows for a reformulation to a mixed integer linear programming (MILP) problem. Problems of the same type frequently appear in the context of business economics under the collective term Operations Research, where the aim is, e.g., to determine the optimal (shortest/fastest) delivery route⁷³ or to optimize production planning⁷⁴. Due to the economic value connected to this problem type plenty of optimizers exist⁷⁵. Their aim is to find the global optimum and also prove that the global optimum was found employing advanced mathematical strategies that can be faster than a full enumeration of all possible solutions (brute forcing), which, by definition, is also an exact optimization method.

For a given atomistic combinatorial problem, GOAC can create a standard MINLP with the help of the licensed Gurobi⁶⁴ software, and the full problem statement is written to a standard MPS (Mathematical Programming System) file. By default, GOAC also passes this MINLP to Gurobi for solving; however, the MPS file can be used to run the problem in other (commercial or free) optimization software. GOAC supports interfacing to the Gurobi optimizer and its solver parameters. It is worth noting that Gurobi (and other software) is technically capable of linearizing the quadratic terms in the MINLP to an MILP due to the binary character of the integer variables. This is not done by default in GOAC but was found to be efficient for some problems. Such a reformulation can also allow the use of other standard optimization software that is not capable of handling general MINLPs. However, results for exact optimizations presented in this work were obtained with the default Gurobi parameter set in GOAC, which was found to be most robust for different configuration problems. It should be noted that the MPS file of the problem can also be handed to non-exact heuristic solvers.
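Why binary variables permit a linear reformulation can be seen from the textbook linearization of a product of two binaries: an auxiliary variable z replaces x₁·x₂ and is tied to it by three linear inequalities. The sketch below only verifies this equivalence by enumeration; whether Gurobi applies exactly this scheme internally is not stated in the text and is not assumed here, and the function names are hypothetical.

```python
from itertools import product

def linearization_feasible(x1, x2, z):
    """Linear constraints that stand in for z = x1 * x2 when all are binary."""
    return z <= x1 and z <= x2 and z >= x1 + x2 - 1

def feasible_z_values(x1, x2):
    """All binary z that satisfy the three linear constraints."""
    return [z for z in (0, 1) if linearization_feasible(x1, x2, z)]

for x1, x2 in product((0, 1), repeat=2):
    # For every binary pair, exactly one z is feasible: the product itself.
    print(x1, x2, feasible_z_values(x1, x2))
```

Applied to Eq. (7), each product x_{i,j}·x_{k,l} with a non-zero β coefficient would need one such auxiliary variable, so the linear model trades quadratic terms for a larger number of variables and constraints.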

Internal Fortran heuristics in GOAC

The core of the GOAC code offers different heuristic optimizers for the atomistic combinatorial problem that are all tailored to this specific problem and implemented in Fortran. All of these heuristics are capable of generating valid low-energy structures. The following methods are currently supported in the GOAC code: a random structure generator, a Greedy Heuristic, a Gradient Descent algorithm (GD), a Metropolis Monte Carlo code (MC)⁵⁵, a simulated annealing extension of the MC code (SA), a Replica Exchange Monte Carlo scheme (REMC)⁷⁶, and a Genetic Algorithm (GA)⁵² with roulette wheel selection⁷⁷. The random structure generator occupies sites randomly, and the resulting structures are not as random as structures obtained by, e.g., SQS. It is also possible to combine some of the aforementioned heuristics into a hybrid approach. Such combinations were already proposed and proven successful for chemical optimization problems^{78,79}, and a combination of the REMC and GA heuristics is benchmarked and referred to as HY in the following. The functionalities of the different algorithms are discussed in more detail in the manual and the code can be directly accessed within the project repository (see Code Availability Statement).
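To give an impression of how a Metropolis Monte Carlo heuristic operates on a configuration problem, the following is a minimal Python sketch. It is not GOAC's Fortran implementation: the swap move, the toy ring geometry with a 1/distance coupling, the temperature, and all names are assumptions for illustration, and the energy is recomputed from scratch after every move for clarity rather than updated incrementally.

```python
import math
import random

def energy(config, coupling):
    """Toy pair energy: sum over a < b of q_a * q_b * coupling[a][b]."""
    n = len(config)
    return sum(config[a] * config[b] * coupling[a][b]
               for a in range(n) for b in range(a + 1, n))

def metropolis_swaps(config, coupling, temperature, n_steps, seed=0):
    """Metropolis MC with swap moves; the composition is conserved."""
    rng = random.Random(seed)
    current = list(config)
    e_cur = energy(current, coupling)
    best, e_best = list(current), e_cur
    for _ in range(n_steps):
        a, b = rng.sample(range(len(current)), 2)
        if current[a] == current[b]:
            continue                                   # swap would change nothing
        current[a], current[b] = current[b], current[a]
        e_new = energy(current, coupling)
        if e_new <= e_cur or rng.random() < math.exp(-(e_new - e_cur) / temperature):
            e_cur = e_new                              # accept the move
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        else:
            current[a], current[b] = current[b], current[a]   # reject: undo swap
    return best, e_best

# Hypothetical toy problem: 8 positions on a ring, four +1 and four -1 charges.
n = 8
coupling = [[0.0 if a == b else 1.0 / min(abs(a - b), n - abs(a - b))
             for b in range(n)] for a in range(n)]
start = [1, 1, 1, 1, -1, -1, -1, -1]
best, e_best = metropolis_swaps(start, coupling, temperature=0.2, n_steps=2000)
print(best, e_best)
```

Because swap moves exchange two ions, the occupancy constraints of the configuration problem are satisfied by construction at every step. Lowering the temperature during the run would turn this sketch into a simple simulated annealing scheme.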

Most heuristics that are directly implemented in the GOAC code are of stochastic nature and it can be useful to run the same heuristic multiple times. In this way, the probability and confidence that the global minimum and other low-energy structures are found can be increased. For convenience, GOAC allows the same heuristic to be run multiple times in parallel with the help of OpenMP⁶³, which yields a statistical ensemble over multiple runs with the same heuristic. Moreover, trivial parallelizations such as, e.g., parallelization over the different temperatures in REMC are also implemented via OpenMP in GOAC to further boost the performance of the code. The scaling behaviour of the different algorithms is also sketched in Supplementary Fig. 1. Finally, the internal heuristics in GOAC can be terminated after a given run time or after a given number of heuristic steps without improvement of the global minimum. More detailed descriptions of GOAC’s features and how to employ them can be found in the manual inside the project repository (see Code Availability Statement).

Performance of exact optimization methods

As explained above, GOAC can interface to external optimization software for exact optimization of atomistic configurations. For this benchmark, the Gurobi optimizer, which utilizes an advanced branch-and-cut method, is employed with the default parameter set GOAC uses to interface to Gurobi. This parameter set enforces strong pre-solving of the model (Presolve = 2) along with a focus on proving optimality (MIPFocus = 2). It also ensures that the n lowest-energy structures are found by setting the convergence tolerances to zero (MIPGap = 0 and MIPGapAbs = 0) and the “PoolSearchMode” to 2. To the best of our knowledge, the existing software for exact optimization of configurations, i.e., including proof of optimality, employs the full enumeration approach. An efficient implementation of the latter can be found in the SUPERCELL software, which is used as a reference for timings of full enumeration. Here it should be noted that the SUPERCELL software only considers the symmetry-inequivalent structures, which drastically reduces the number of explicitly considered atomistic configurations compared to the total number of configurations when ignoring symmetry.
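The parameter set named above can be collected in a small helper. This is a sketch under assumptions, not GOAC's interface: the function names are hypothetical, and the `PoolSolutions` entry (the number of solutions Gurobi keeps in its pool) is added here as an assumption, since the text only mentions `PoolSearchMode`. A stand-in object is used so that the example runs without a Gurobi installation; a real `gurobipy` model offers the same `setParam(name, value)` method.

```python
def exact_search_parameters(n_structures):
    """Gurobi parameters quoted in the text, plus an assumed pool size."""
    return {
        "Presolve": 2,         # strong pre-solving of the model
        "MIPFocus": 2,         # focus on proving optimality
        "MIPGap": 0.0,         # relative optimality tolerance set to zero
        "MIPGapAbs": 0.0,      # absolute optimality tolerance set to zero
        "PoolSearchMode": 2,   # systematic search for the n best solutions
        "PoolSolutions": n_structures,  # assumption: pool size equals n
    }

def apply_parameters(model, params):
    """Set each parameter on any object that offers setParam(name, value)."""
    for name, value in params.items():
        model.setParam(name, value)

class RecordingModel:
    """Stand-in for a solver model, used only to make this sketch runnable."""
    def __init__(self):
        self.params = {}

    def setParam(self, name, value):
        self.params[name] = value

demo = RecordingModel()
apply_parameters(demo, exact_search_parameters(n_structures=10))
print(demo.params)
```

With Gurobi available, the same helper could be applied to a model read from the MPS file written by GOAC, for example one obtained with `gurobipy.read(...)` on a hypothetical file path.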

The SUPERCELL software and the optimization with Gurobi of the model prepared by GOAC were tested on a layered-oxide sodium-ion-battery cathode material (Na[Li_1/3Mn_2/3]O₂)⁶⁶ with one layer in the c-direction and partial occupations in both the transition-metal and sodium-ion sites, cf. Fig. 4. By changing the sodium-ion stoichiometry from 1.0 to 0.52, combinatorial problems with a steadily increasing total number of possible configurations ranging from ca. 10⁷ to 10¹⁴ were created and evaluated by both approaches. Such a variation of the sodium concentration is also a practical example, as it is a common task in battery material simulations to find sodium configurations at various concentrations that are suitably low in energy to predict accurate operation voltages⁸⁰. Charge states of Na⁺, Li⁺, and Mn⁴⁺ were assumed, together with a variable oxidation state of oxygen ranging from −2.0 (for a Na stoichiometry of 1) to −1.75926 (for a Na stoichiometry of 0.52) to ensure charge neutrality, as the compound is reported to show anionic redox from O²⁻ to Oⁿ⁻ (n < 2)⁶⁶. As both approaches guarantee to determine the global optimum after a successful run, only the run time of both methods needs to be benchmarked. The timings on 128 physical processor cores are plotted against the total number of configurations in Fig. 4. For smaller problem instances with up to ca. 10⁹ configurations, full enumeration was faster than optimization due to the overhead of interfacing to an external optimization code combined with the capability of the SUPERCELL software to reduce the solution space to just symmetry-inequivalent structures. However, it should be noted that timings on these problem instances are well below 10 s and therefore computationally inexpensive in both approaches. For more complex problems, the run time of the full enumeration approach scales perfectly linearly, while the run time of the branch-and-cut optimization method increased more irregularly from the small offset caused by the overhead. In general, the computation time of the branch-and-cut optimization was significantly lower for more complex instances of this problem and also appeared to scale more favourably towards problems with many configurations. Overall, a speed-up of up to three orders of magnitude was achieved by the optimization with Gurobi compared to full enumeration with the SUPERCELL software at the most difficult considered problem instance with ca. 10¹⁴ total configurations. The respective run times to find the global optimum atomistic configuration in Coulomb energy were ca. 18 h by full enumeration versus ca. 1.5 min by Gurobi optimization.
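The oxygen charges quoted for this benchmark follow from charge neutrality of Na_x[Li_1/3Mn_2/3]O₂ and can be reproduced with a few lines. In the sketch below, the function name is hypothetical, and the exact stoichiometry x = 14/27 is an assumption: it rounds to the quoted 0.52 and reproduces the quoted oxygen charge, but the supercell composition itself is not stated in this section.

```python
from fractions import Fraction

def oxygen_charge(na_stoichiometry):
    """Oxygen charge that neutralizes Na_x[Li_1/3 Mn_2/3]O_2."""
    cation_charge = (na_stoichiometry * 1      # Na+
                     + Fraction(1, 3) * 1      # Li+
                     + Fraction(2, 3) * 4)     # Mn4+
    return -cation_charge / 2                  # two oxygen ions per formula unit

print(float(oxygen_charge(Fraction(1))))                  # -2.0
# Assumption: x = 14/27 (about 0.52) is the lowest sodium content considered.
print(round(float(oxygen_charge(Fraction(14, 27))), 5))   # -1.75926
```

The ratio of the quoted run times, 18 h versus 1.5 min, is 64800 s / 90 s = 720, consistent with the stated speed-up of up to three orders of magnitude.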

**Fig. 4: Scaling of full enumeration versus optimization.**

Figure 4 clearly highlights the computational advantages that can be accessed by using GOAC to formulate a general optimization problem for the combinatorial ground-state search that can be handed to external optimization software. However, extrapolating the scaling behaviour to much larger problems also reveals that, even with the significant speed-up achieved, still only problems of intermediate difficulty/size can be tackled. It must also be noted that the actual performance of the branch-and-cut optimization is very much problem dependent. By introducing (slight) changes to the presented problem (e.g., more species per site or more sodium sites by using a P-type layered structure⁸¹), problems can be constructed where optimization of the complete configuration space is even slower than full enumeration of symmetry-inequivalent structures. Conversely, problems that formally have as many as 10²³⁰ configurations but are optimized within seconds might be obtained.

In summary, the performance of applying standard optimization software to the atomistic combinatorial problem is strongly problem (material) dependent. However, our results indicate that especially for problems of intermediate difficulty (ca. 10¹⁰ to 10²⁰ possible configurations), such as configuration of charge carriers in rechargeable energy storage materials, optimization can give a significant computational advantage over full enumeration approaches, even if the full enumeration method accounts for symmetry equivalents.

Benchmark of heuristics in GOAC

As the heuristics do not guarantee to find the global minimum, a suitable benchmark could either compare the lowest energy that is found within given computational resources or the time that it takes to find a known global optimum. However, the implemented heuristics are of stochastic nature which makes it important to average their performance over multiple runs. Such comparisons of the different internal heuristics in GOAC are discussed for several examples with various complexity in the following. Moreover, an additional benchmark of FeSbO₄ is shown in the supplementary information (Supplementary Table 1). All examples in the following were executed on the same hardware and run times (given in real time) were estimated by the CPU time required to perform each calculation.

Atomistic configurations in NaCl

The site occupation in NaCl is not a true combinatorics problem as the unit cell contains two distinctive sites, one for Na and one for Cl. However, for testing purposes both sites can be modified such that each site is occupied by 50% of each species, yielding an atomistic combinatorial problem. With this model, in a 3 × 3 × 3-supercell the total number of possible configurations is ca. 10⁶⁴, a rather difficult combinatorial problem. As the global optimum still remains trivial, a perfectly alternating pattern of Na and Cl in all dimensions, this problem statement is a rather suitable benchmark. Moreover, calculation of the Madelung constant⁸²,

$${M}_{C}=\frac{4\pi \times {\epsilon }_{0}\times r\times | E| }{{N}_{{\rm{Ions}}}/2\times e},$$

(11)

is straightforward, and convergence to the literature value of M = 1.74756…⁸³ can be tracked for the different heuristics over run time. In this equation, ϵ₀ is the electric constant, r the lattice distance of two neighbouring sites (2.81 Å), E the Coulomb energy of the considered structure, N_Ions the total number of ions in the structure (216), and e the elementary charge.
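As a worked illustration of Eq. (11), the following sketch converts between a Coulomb energy and the Madelung constant. It assumes that E is expressed in eV, so that a single factor of e remains in the denominator. The function names are illustrative and not part of GOAC. The last line estimates the size of the search space under one possible way of counting, in which all 216 positions form a single pool holding 108 Na and 108 Cl.

```python
from math import comb, log10, pi

EPSILON_0 = 8.8541878128e-12   # electric constant in F/m
E_CHARGE = 1.602176634e-19     # elementary charge in C

def madelung_constant(energy_ev, r_m, n_ions):
    """Eq. (11), assuming the Coulomb energy E is given in eV."""
    return 4 * pi * EPSILON_0 * r_m * abs(energy_ev) / (n_ions / 2 * E_CHARGE)

def rocksalt_energy_ev(madelung, r_m, n_ions):
    """Eq. (11) solved for the (negative) Coulomb energy in eV."""
    return -madelung * (n_ions / 2 * E_CHARGE) / (4 * pi * EPSILON_0 * r_m)

R_NN = 2.81e-10   # nearest-neighbour distance in m (2.81 Angstrom)
N_IONS = 216      # ions in the 3 x 3 x 3 supercell

# Energy that a perfectly ordered supercell would need to reproduce M = 1.74756
e_min = rocksalt_energy_ev(1.74756, R_NN, N_IONS)
print(f"E_min = {e_min:.1f} eV, M = {madelung_constant(e_min, R_NN, N_IONS):.5f}")

# One possible way of counting the search space: 108 Na on 216 pooled positions
print(f"log10(#configurations) = {log10(comb(N_IONS, N_IONS // 2)):.1f}")
```

Tracking `madelung_constant` for the best energy found so far gives the kind of convergence curve described here, with the literature value as the upper bound.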

The convergence towards the Madelung constant for the heuristics implemented in GOAC is plotted in Fig. 5. It is observed that the Gradient Descent heuristic requires some time before the first solutions can be obtained. In this algorithm, the first solution is written as soon as the local minimization from a random starting point is finished and then the next random starting structure is optimized. The time required to reach this first solution is also different for random starting structures as different amounts of optimization steps are necessary to reach a local minimum. Therefore, in the beginning of the GD plot, averages over less than 16 runs are contained, which also explains the drop in the average caused by more independent runs that obtained their first solution being included. Even though the average becomes flatter and standard deviations as well as min-max differences become smaller towards the end of the 5 minutes run time, no run was able to find the global minimum. This highlights the problem of this algorithm as it guarantees to find a local minimum but on shallow energy surfaces with many local minima it becomes highly unlikely to find the global minimum as there is a high chance to get trapped in another local minimum. However, in other use cases one might also be interested in studying these local minima.
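The local minimization with random restarts described above can be sketched as a steepest-descent search over pairwise swaps. This is a toy illustration and not the GOAC implementation: the ring geometry, the 1/d couplings (a stand-in for pre-calculated Ewald coefficients) and all function names are assumptions.

```python
import random

def toy_couplings(n):
    """Toy pair couplings 1/d on a ring of n sites (stand-in for Ewald terms)."""
    return [[0.0 if i == j else 1.0 / min(abs(i - j), n - abs(i - j))
             for j in range(n)] for i in range(n)]

def energy(q, J):
    """Pairwise energy sum_{i<j} J_ij q_i q_j for charges q."""
    n = len(q)
    return sum(J[i][j] * q[i] * q[j] for i in range(n) for j in range(i + 1, n))

def swap_neighbours(q):
    """All configurations reachable by exchanging two unlike charges."""
    for a in range(len(q)):
        for b in range(a + 1, len(q)):
            if q[a] != q[b]:
                trial = list(q)
                trial[a], trial[b] = trial[b], trial[a]
                yield trial

def steepest_descent(q, J):
    """Follow the best energy-lowering swap until no swap lowers the energy."""
    q, e = list(q), energy(q, J)
    while True:
        best = min(swap_neighbours(q), key=lambda t: energy(t, J))
        best_e = energy(best, J)
        if best_e >= e - 1e-12:
            return q, e          # local minimum reached
        q, e = best, best_e

def is_local_minimum(q, J):
    e = energy(q, J)
    return all(energy(t, J) >= e - 1e-9 for t in swap_neighbours(q))

rng = random.Random(0)
J = toy_couplings(12)
results = []
for _ in range(5):               # independent random starting structures
    start = [1] * 6 + [-1] * 6
    rng.shuffle(start)
    results.append(steepest_descent(start, J))
best_q, best_e = min(results, key=lambda r: r[1])
print(best_q, round(best_e, 4))
```

Every swap neighbour has to be evaluated before a single step is taken, which is why the first solution of each run appears only after some delay and why the cost per step grows quickly with system size.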

**Fig. 5: Heuristic optimization in NaCl.**

The same tendency can be observed in Fig. 5 for Monte Carlo performed at low temperatures (ca. 0.1–0.5 eV): the average quickly flattens to a constant value since the algorithm gets trapped in local minima at a low sampling temperature, similar to the outcome of the GD method. At intermediate temperatures (ca. 0.6–0.8 eV), however, the averages are observed to get closer to the Madelung constant (corresponding to the global energy minimum) over time as local barriers can be passed with a certain probability to eventually find lower minima. At the highest temperatures (ca. 0.9–1.0 eV) the algorithm is able to pass even higher energy barriers, thus spending only short times on local optimization, which results in a decrease of average performance. In this example, the best result (on average) was obtained at a temperature of 0.8 eV and the performance was quite sensitive to the simulation temperature, even though multiple runs at various temperatures were able to find the global optimum within five minutes of run time. To overcome this temperature sensitivity, methods that make use of temperature variation to improve the optimization performance are discussed next.
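The role of the sampling temperature can be made concrete with a minimal Metropolis sketch using swap moves. It is a toy model under stated assumptions and not the GOAC code: the ring with 1/d couplings, the temperature in toy energy units and all names are hypothetical.

```python
import math
import random

def toy_couplings(n):
    """Toy pair couplings 1/d on a ring of n sites (stand-in for Ewald terms)."""
    return [[0.0 if i == j else 1.0 / min(abs(i - j), n - abs(i - j))
             for j in range(n)] for i in range(n)]

def energy(q, J):
    n = len(q)
    return sum(J[i][j] * q[i] * q[j] for i in range(n) for j in range(i + 1, n))

def swap_delta(q, J, a, b):
    """Energy change for exchanging the charges on sites a and b, in O(N)."""
    field_a = sum(J[a][j] * q[j] for j in range(len(q)) if j not in (a, b))
    field_b = sum(J[b][j] * q[j] for j in range(len(q)) if j not in (a, b))
    return (q[b] - q[a]) * (field_a - field_b)

def metropolis(q, J, temperature, n_steps, rng):
    """Fixed-temperature Metropolis sampling; returns the best state visited."""
    q = list(q)
    e = energy(q, J)
    best_q, best_e = list(q), e
    for _ in range(n_steps):
        a, b = rng.sample(range(len(q)), 2)
        if q[a] == q[b]:
            continue                         # swapping like charges changes nothing
        d = swap_delta(q, J, a, b)
        # Downhill moves are always accepted, uphill moves with exp(-dE/T)
        if d <= 0 or rng.random() < math.exp(-d / temperature):
            q[a], q[b] = q[b], q[a]
            e += d
            if e < best_e:
                best_q, best_e = list(q), e
    return best_q, best_e

rng = random.Random(1)
J = toy_couplings(16)
start = [1] * 8 + [-1] * 8
best_q, best_e = metropolis(start, J, temperature=0.5, n_steps=5000, rng=rng)
print(round(energy(start, J), 4), round(best_e, 4))
```

A very small `temperature` makes the uphill acceptance probability vanish, so the walk behaves like a local minimizer, whereas a very large value accepts almost every swap and leaves little time for local refinement.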

The average performance of Simulated Annealing was similar to that of MC at lower to intermediate temperatures with a relatively high variance in solutions, as some runs returned the global optimum. This behaviour can be explained by the rather fast cooling rate chosen, which exponentially decreased from an initial simulation temperature of 1.0 eV to almost 0 eV during the run time (cf. inset in Fig. 5). Such a high cooling rate, which was required to scan a sufficiently large temperature range within the given run time limit, makes it more unlikely that a sufficient temperature is present at the crucial optimization steps leading to a high risk of local minima trapping. Nevertheless, SA was able to find the global optimum in some runs.
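An exponential cooling schedule of the kind described here can be written in a few lines. The final temperature `temp_end` is an assumed value, since the text only states that the temperature decreases to almost 0 eV. The function name and the 300 s run time used for the printout are illustrative.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def exponential_schedule(t, t_total, temp_start=1.0, temp_end=1e-3):
    """Temperature in eV after run time t for exponential cooling.

    temp_end is an assumed small final temperature (hypothetical value).
    """
    return temp_start * (temp_end / temp_start) ** (t / t_total)

for t in (0, 75, 150, 225, 300):
    temp = exponential_schedule(t, 300)
    print(f"t = {t:3d} s   T = {temp:.4f} eV   ({temp / K_B_EV_PER_K:.0f} K)")
```

With such a schedule the temperature falls by the same factor in every time interval, so a short total run time leaves only a narrow window at the intermediate temperatures where barrier crossing and local refinement are balanced.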

The last tested approach from the MC family, namely Replica Exchange Monte Carlo, shows a better performance than SA. The algorithm showed a pronounced optimization, especially in the first ca. 100 s, before almost constant values for average, standard deviation, and min-max were reached. This behaviour indicates that the optimization got trapped in local minima for some runs, while in other runs the global optimum was successfully reached. As only about one third of the run time (ca. the first 100 s) was effectively used for optimization, the performance might be improved by using more than four temperatures in REMC, including also largely-different and higher temperatures. Compared to the other heuristics, REMC performed very well within the given run time.
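The exchange step that distinguishes replica exchange from independent Monte Carlo runs follows the standard parallel-tempering acceptance rule, sketched below. The four temperatures and the energies are made-up illustration values, and the function names are not taken from GOAC.

```python
import math
import random

def exchange_probability(e_i, temp_i, e_j, temp_j):
    """Standard acceptance probability for swapping two replicas."""
    exponent = (1.0 / temp_i - 1.0 / temp_j) * (e_i - e_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def exchange_sweep(energies, temperatures, rng):
    """Attempt swaps between neighbouring temperatures.

    Returns a list whose k-th entry is the index of the configuration
    that sits at temperatures[k] after the sweep.
    """
    order = list(range(len(energies)))
    for k in range(len(order) - 1):
        i, j = order[k], order[k + 1]
        p = exchange_probability(energies[i], temperatures[k],
                                 energies[j], temperatures[k + 1])
        if rng.random() < p:
            order[k], order[k + 1] = j, i
    return order

temperatures = [0.2, 0.4, 0.6, 0.8]     # hypothetical ladder in eV
energies = [-1.0, -3.0, -2.0, -2.5]     # hypothetical replica energies in eV
order = exchange_sweep(energies, temperatures, random.Random(0))
print(order)
```

A swap that moves the lower-energy configuration to the colder temperature is always accepted, so good structures found at high temperature are passed down the ladder for local refinement.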

Among the approaches compared in Fig. 5, the Genetic Algorithm shows the slowest increase in average performance versus run time. Several generations and selection procedures are required to obtain more optimized structures resulting in the steep improvement of average energy. Even though some GA runs successfully reached the global optimum, the average over all runs was still substantially below the correct Madelung constant after 300 s of run time, showing that some runs got trapped in local minima. The trapping also goes along with high standard deviations and a large min-max difference. This occurs if the structural variation in the generations becomes low and centred around a deep local minimum. Further problems can arise if the generation consists of symmetry equivalents of the same local minimum, or if the local minimum is so deep that it cannot be exited at the small mutation rates that are required for a systematic optimization.
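To illustrate how crossover and mutation can act on site occupations while keeping the composition fixed, the following sketch combines a uniform crossover with a repair step and a swap mutation. The operators actually used in GOAC are not described in this section, so this is a hypothetical variant and all names are assumptions.

```python
import random
from collections import Counter

def crossover(parent_a, parent_b, rng):
    """Uniform crossover followed by a repair that restores the composition."""
    child = [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]
    target = Counter(parent_a)
    surplus = Counter(child) - target          # species occurring too often
    missing = list((target - Counter(child)).elements())
    rng.shuffle(missing)
    positions = list(range(len(child)))
    rng.shuffle(positions)
    for pos in positions:
        if surplus[child[pos]] > 0:
            surplus[child[pos]] -= 1
            child[pos] = missing.pop()         # replace by an under-represented species
    return child

def mutate(config, rng, rate=0.05):
    """Swap mutation: exchanges pairs of sites, so the composition is unchanged."""
    config = list(config)
    for a in range(len(config)):
        if rng.random() < rate:
            b = rng.randrange(len(config))
            config[a], config[b] = config[b], config[a]
    return config

rng = random.Random(2)
parent_a = ["Na"] * 8 + ["Cl"] * 8
parent_b = list(parent_a)
rng.shuffle(parent_a)
rng.shuffle(parent_b)
child = mutate(crossover(parent_a, parent_b, rng), rng)
print(child)
```

A small `rate` keeps children close to their parents, which supports systematic optimization but also makes it harder to leave a deep local minimum once the whole generation has collapsed onto it.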

To overcome these limitations the Hybrid approach can be employed which provided the best performance among the methods compared in Fig. 5. Here, a pre-trained (from REMC) generation was used for the GA which greatly improved the average performance within the first seconds of the run. Moreover, the REMC steps between the GA runs can help to improve the variation in the generation pool of the GA. Vice versa, the GA offers a systematic procedure to make rather large steps on the potential energy surface that cannot be efficiently achieved by pure REMC. Therefore, both approaches can complement each other and the results demonstrate that HY was very effective with the average of 16 independent runs being fairly close to the correct Madelung constant after just 5 minutes of run time and with many runs ending in the global optimum. Moreover, the average kept increasing at longer run times indicating that most of the runs would eventually converge to the global optimum. Notably, the HY strategy performed better than the two individual approaches (GA and REMC) and was the best out of all investigated methods, indicating that a beneficial synergy effect between GA and REMC was achieved.

Li-site occupation and Ta doping in LLZO

Li₇La₃Zr₂O₁₂ (LLZO) is a widely studied electrolyte for all-solid-state batteries and therefore of high practical interest. However, the global minimum energy structure or in general low energy configurations are rather hard to approach computationally due to its large unit cell (8 formula units). The computational challenge becomes even more severe when dopants and defects are introduced that require even larger supercells. For these cases, the configurational space is extremely large, representing an interesting test for GOAC to obtain optimized atomistic configurations in terms of Coulomb energies. As an example, we consider Li₆La_2.969Zr_0.906Ta_1.094O₁₂ (Charges: Li¹⁺, La³⁺, Zr⁴⁺, Ta⁵⁺, O²⁻) which can be modelled by a 2 × 2 × 1 supercell (32 formula units). The modelled composition is in good agreement with the experimental one reported by Redhammer et al.⁸⁴. We define the structure such that all lithium ions can be placed in both the tetrahedral and octahedral sites, resulting in a total of ca. 10¹⁵⁹ possible atomistic configurations. The corresponding structure model is also shown in Supplementary Fig. 2.
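The choice of a supercell with 32 formula units can be checked with a few lines of arithmetic: the fractional stoichiometry then maps onto whole numbers of ions and a charge-neutral cell. Rounding to the nearest integer is an assumption made here for illustration, as the text does not list the ion counts explicitly.

```python
FORMULA_UNITS = 32  # 2 x 2 x 1 supercell of the unit cell with 8 formula units

stoichiometry = {"Li": 6.0, "La": 2.969, "Zr": 0.906, "Ta": 1.094, "O": 12.0}
charges = {"Li": 1, "La": 3, "Zr": 4, "Ta": 5, "O": -2}

# Assumption: ion counts are the stoichiometric amounts rounded to whole ions
counts = {el: round(x * FORMULA_UNITS) for el, x in stoichiometry.items()}
total_charge = sum(counts[el] * charges[el] for el in counts)

print(counts)
print("total charge:", total_charge)
```

With these counts the cell contains 192 lithium ions, which are the ions distributed over the tetrahedral and octahedral sites during optimization.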

Performances over 10 independent optimization runs are visualized for each heuristic of GOAC in Fig. 6. As discussed previously for the example of NaCl, the GD algorithm requires some time before the first local optimizations are finished and therefore the average plot begins at ca. 1000 s in Fig. 6. The overall performance of GD was found to be among the worst out of the GOAC heuristics. The GA solutions converged to a similar average energy as GD, but also had the largest variation between the best and worst independent runs, hinting at local minima trapping. This behaviour might be explained by the different parallelization approaches as discussed in the supplementary information and the code documentation in the project repository (see Code Availability statement). However, averages shown in Fig. 6 are still a fair comparison of optimization performance versus CPU time, revealing that the heuristics including some sort of MC are more efficient than a pure GA for LLZO.

**Fig. 6: Heuristic optimization in LLZO.**

The MC approach returned an intermediate average energy per ion, while the SA and REMC methods yielded significantly lower energies after one hour of run time. For most heuristics the convergence was rather flat beyond the first ca. 500 s, but SA showed an exponential decrease over the whole run time which was matching the exponential decrease of the respective simulation temperature from ca. 12,000 K to almost 0 K (cf. Fig. 6). Interestingly, also the variance between the best and worst runs became rather small for the SA approach. In contrast to the results obtained for NaCl, SA performed well for the present example due to the longer run times that allowed for a slower cooling rate. The final average energies obtained from SA and REMC were similar, but the energy of the best REMC run was slightly lower than that of the best SA run, and the corresponding minimum-energy structure is shown in Fig. 6. The superior performances of SA and REMC over the other methods demonstrate that MC approaches with some temperature variation are very effective for the complex LLZO configuration problem. While in this example, the HY approach was not able to improve on the performance of REMC, still a much lower average energy than for the pure GA was found. The overall performance of HY might be increased by longer run times and adjusted heuristic parameters.

The determined minimum energy structure can be analysed in terms of the ratio of lithium ions in tetrahedral versus octahedral coordination of oxygen as all lithium ions were freely iterated over both classes of sites during optimization. A ratio of $\frac{77}{115}\approx 0.67$ is obtained which is in very good agreement with the ratios of 0.74, 0.64, and 0.59 (after different treatments) and an average of 0.66 reported from experiments⁸⁴. This highlights again the predictive quality of point-charge Coulomb energies for the configuration of ions in complex structures and validates the approach of pre-selecting atomistic low energy configurations by Coulomb energies for higher-level calculations. It should be mentioned that in a practical study one might be interested in the n lowest energy configurations as the material probably encounters some disorder in experiment. However, referencing to the lowest energy configuration is desirable to assess which meta-stable configurations might exist at a given temperature. Moreover, the discussed LLZO example can hardly be approached by exact optimization or computationally more demanding energy evaluation models proving the practicability of heuristic optimization with Coulomb energies. To the best of our knowledge, heuristic configurational optimization of Coulomb energies has not been reported before for any comparably complex atomistic combinatorial problem. However, it should be mentioned that heuristic optimization was carried out on different, more complex problem settings beyond site-configurational optimization and Coulomb energies such as, e.g., protein folding⁸⁵.

Layered oxide cathode materials

To further demonstrate the optimization capabilities of GOAC, we addressed the atomistic combinatorial problem in a high-entropy layered sodium-ion-battery cathode material. The composition of O3-Na_2/3[Li_1/6Fe_1/6Co_1/6Ni_1/6Mn_1/3]O₂ was recently proposed by Yao et al.⁸⁶, where O3 indicates that the structure has three layers in the c-direction and octahedral coordination of the sodium ions⁸¹. We modelled the system in a $\sqrt{3}$-unit cell (a = 5.0 Å, c = 19.2 Å) assuming ionic charges of Na⁺, Li⁺, Fe^2.5+, Co^3.5+, Ni²⁺, Mn⁴⁺, and O^1.75−. The cationic charges were chosen to agree with the ones observed in experiment⁸⁶, except that the charge of Fe was decreased by 0.5 and the one of Co was increased by 0.5 to ensure that all configurations are distinguishable in Coulomb energy. The oxygen charge was set to achieve an overall charge-neutral compound and can be justified by the experimentally reported oxygen redox. All sodium ions were iterated over all sodium positions in every layer (one sodium site in the whole structure with nine positions in the unit cell) and all ions in the transition metal layers were iterated over all positions in each layer (one transition metal site in the whole structure with nine positions in the unit cell), allowing for the maximal configurational space. To highlight the scalability and limitations of GOAC, this configuration problem was solved in supercells of different sizes ranging from 4 unit cells (2 × 2 × 1, Na₂₄[Li₆Fe₆Co₆Ni₆Mn₁₂]O₇₂) to 108 unit cells (6 × 6 × 3, Na₆₄₈[Li₁₆₂Fe₁₆₂Co₁₆₂Ni₁₆₂Mn₃₂₄]O₁₉₄₄). Structure models for the smallest and largest considered supercells are visualized in Supplementary Fig. 4. Results for optimizing the atomistic configurations with the heuristics in GOAC within a given run time (given computational resources) are summarized in Table 1.
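The size of the configurational space and the oxygen charge follow directly from the stated model and can be reproduced with a short counting sketch. It assumes that the sodium positions and the transition-metal positions each form one pool of nine positions per unit cell, as described above, and it ignores symmetry. The function names are illustrative.

```python
from fractions import Fraction
from math import lgamma, log

def log10_multinomial(counts):
    """log10 of n! / (k_1! k_2! ...) evaluated via the log-gamma function."""
    n = sum(counts)
    return (lgamma(n + 1) - sum(lgamma(k + 1) for k in counts)) / log(10)

def log10_configurations(n_cells):
    """Na/vacancy arrangements times transition-metal arrangements."""
    sites = 9 * n_cells                       # nine positions per unit cell and site
    na = [2 * sites // 3, sites // 3]         # Na, vacancies
    tm = [sites // 6] * 4 + [sites // 3]      # Li, Fe, Co, Ni, Mn
    return log10_multinomial(na) + log10_multinomial(tm)

# (amount per formula unit, charge) for every cation in the model
cations = {"Na": (Fraction(2, 3), 1), "Li": (Fraction(1, 6), 1),
           "Fe": (Fraction(1, 6), Fraction(5, 2)),
           "Co": (Fraction(1, 6), Fraction(7, 2)),
           "Ni": (Fraction(1, 6), 2), "Mn": (Fraction(1, 3), 4)}
oxygen_charge = -sum(x * q for x, q in cations.values()) / 2
print("oxygen charge:", float(oxygen_charge))

for label, n_cells in (("2x2x1", 4), ("6x6x1", 36), ("6x6x3", 108)):
    print(label, f"ca. 10^{log10_configurations(n_cells):.1f}")
```

Under these assumptions the counts agree with the orders of magnitude quoted in the text for the 2 × 2 × 1, 6 × 6 × 1 and 6 × 6 × 3 supercells (ca. 10³¹, 10³⁰³ and 10⁹²⁰, respectively).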

Table 1 Energies per ion of the lowest energy structures obtained for differently sized supercells of O3-Na_2/3[Li_1/6Fe_1/6Co_1/6Ni_1/6Mn_1/3]O₂ with the different heuristics implemented in GOAC (all calculations performed on 128 physical CPU cores and using 128 OpenMP threads)


Remarkably, all solvers were able to find the same minimum, likely the global minimum, for the smallest problem of a 2 × 2 × 1 supercell within just one hour of run time. It should also be noted that most heuristics identified this minimum within the first minutes (cf. the convergence versus run time plots in Supplementary Figs. 5–10). Compared to the exact solvers presented in the previous section, this represents a huge speed-up, as a problem with 10³¹ total configurations would be (almost) impossible to solve with an exact solver in a reasonable run time, especially not within just one hour. This highlights the practicability of GOAC as problems of this size regularly appear when high-entropy structures or similarly complex structures are to be pre-selected for DFT calculations. The suitability to pre-select low(est) energy structures for DFT calculations was also checked by performing single-point DFT calculations on the 10 lowest energy configurations obtained by the REMC approach (Supplementary Fig. 11). This is particularly practical as one is usually interested in selecting a sufficiently low or several low energy configurations, but in the following just the global minimum is discussed to better compare the performances of the different heuristic optimizers.

For the next larger problem, a 2 × 2 × 2 supercell, only the more advanced heuristics, namely SA, REMC, GA, and HY, were able to find the same lowest energy structure, which makes it again a likely candidate for the global minimum in Coulomb energy. The respective minimum energy is lower than the minimum energy obtained for the smaller problem, because the increased problem size allows for larger, energetically more favourable superstructures. The same applies to the 4 × 4 × 1 supercell where SA, REMC, and HY obtained the same best candidate configuration for the global minimum. As the periodicity is extended in a different direction compared to the 2 × 2 × 2 supercell, the minimum energy is still lower than for the 2 × 2 × 1 case but higher than for the 2 × 2 × 2 supercell. For a 4 × 4 × 2 supercell, only SA and REMC were able to find a likely candidate for the global minimum. The respective minimum energy is identical to the one of the 2 × 2 × 2 problem as both consider the same periodicity, and thus the same degrees of freedom, in the c-direction. The additional degrees of freedom in the a- and b-directions, on the other hand, did not seem to allow for the formation of lower energy superstructures. These findings highlight another reason why it is important to consider sufficiently large supercells in the construction of structural models with occupational disorder: suitable supercell sizes are required for lowest energy superstructures. To efficiently select suitable supercell sizes and to account for the fact that it becomes increasingly hard to obtain the lowest energy configuration in larger supercells, even if it is already known from a smaller commensurate cell, GOAC also allows a systematic scan over increasing supercell sizes to find low(est) energy configurations.

For an even larger 6 × 6 × 1 supercell, the GD heuristic was not able to reach any local minimum within the given run time since more complex problems not only increase the expected number of optimization steps required to reach a local minimum from a random starting structure but also heavily increase the amount of neighbouring structures that need to be evaluated to follow the steepest descent path. Within the given framework, 10²⁶⁹ configurations seemed to be the maximum where GD could be applied within reasonable computational resources, which is arguably already a quite large configurational space. For the 6 × 6 × 1 supercell, REMC returned the lowest energy structure, lower in energy than the 2 × 2 × 1 minimum, which was expected given that the 6 × 6 × 1 is a multiple of the 2 × 2 × 1 supercell. SA also returned a low-energy solution, albeit not the same minimum, probably because the cooling rate was too fast for the given problem size and run time limitation.

For all supercells larger than 6 × 6 × 1, SA found the lowest energy structure out of all heuristics implemented in GOAC. However, the obtained minima did not correspond to the respective global minima as they were higher in energy than the minimum energy structures of one of the smaller problems with matching multiplicity. While it is still possible to run optimizations on these extremely large problems, the results show the limitations of the heuristics implemented in GOAC as one cannot expect to find lowest energy configurations within reasonable run times for such large configurational spaces. Due to the combinatorial explosion in large cells it is also not surprising that it is nearly impossible to find minimum energy structures in configurational spaces with up to 10⁹²⁰ configurations, a number even larger than the estimated total number of atoms in the entire universe⁸⁷ to the power of ten (the actual number of atoms in the universe must be estimated from measured densities and hydrogen/helium distributions and is in the range of ca. 10⁸⁰ atoms).

The pure MC heuristic performed worse than the more elaborate SA and REMC extensions for all problem sizes. As was shown for NaCl, the MC method is quite sensitive to the simulation temperature, which was not re-optimized for every problem in the benchmark (fixed to 0.75 eV). In its current implementation, the GA performed rather poorly for problems with a complexity of 10²⁶⁹ or more. Combining the GA with REMC in the HY approach did not resolve this issue for the larger problem sizes as the gain in performance compared to the pure GA stemmed almost exclusively from the REMC part. Therefore, the overall performance of the HY method was still inferior to using all computational resources on REMC. More advanced HY combination schemes or different crossing strategies in the GA might resolve this under-performance in the future.

Discussion

In this work, we showed that the problem of finding low(est) energy configurations in the huge configurational space of modern energy materials can be effectively approached by using advanced optimization methods in combination with Coulomb energy models. The Coulomb energy variations between different configurations often align well with energies from higher levels of theory, e.g., DFT, and sampling by Coulomb energies is therefore an attractive method to pre-select low-energy structure candidates. As a tool for conveniently and effectively exploring the vast configurational space of atomistic configurations in complex materials, we introduced the GOAC code that can be conveniently accessed as a command line tool.

The calculation of energies of different configurations was significantly sped up by expressing the Coulomb energy cost function as an expansion to a binary optimization problem, which enables the use of pre-calculated coefficients in the optimization procedure, thus providing significant improvements over performing Ewald summations at each optimization step. This reformulation transforms the atomistic combinatorial problem statement into an MINLP problem and allows various advanced optimization methods to be employed. We showed that the exact optimization of the MINLP, interfaced via GOAC to existing optimization software, was several orders of magnitude faster than the full enumeration approach often applied for the atomistic combinatorial problem, which allows configuration problems to be solved exactly for larger system sizes.

Due to the combinatorial explosion of the configurational space in complex multi-element materials, exact solving strategies cannot be applied to more complex materials. For such problems, we implemented several heuristics in GOAC, including Gradient Descent, Monte Carlo, Simulated Annealing, Replica Exchange Monte Carlo, Genetic Algorithms, and hybrid approaches. With these heuristics, GOAC produced high-quality low-energy structures within limited computational resources for extremely large configuration problems, which is of interest to model complex compositions and identify possible superstructures. As a highlight, we showed that GOAC was able to find likely candidates for global minimum structures of problems with 10³⁰³ configurations in just about 2 h of run time on 128 CPU cores. It should be mentioned that this usually implies that the n lowest energy configurations that are of interest for further computational studies are obtained as well. Moreover, it was demonstrated that for problem combinatorics up to 10⁹²⁰, it was still possible to perform optimizations using the GOAC package even though finding minimum energy configurations in reasonable computation time cannot be expected for such large problems.

For the results presented in this work, simple point-charge Coulomb energies were employed, which represent a rough estimation that is not guaranteed to yield the same lowest energy configuration as higher-level approaches, e.g., DFT. Moreover, atomistic combinatorial problems with charge-neutral ions (atoms) or ions with identical valencies cannot be optimized on the basis of Coulomb energies alone. In general, one can expect to get reasonable energetic alignments of DFT and Coulomb energies when the charges are more localized, as localized charges are better described by point charges. When studying systems with more delocalized charges, e.g., highly charged cathode materials, the alignment of DFT and Coulomb energies might decrease. To potentially overcome the issue of too delocalized charges and also allow different ions with the same charge to be treated, GOAC also supports Gaussian smeared charges. Future studies will show if or how smearing out the point charges can improve the accuracy of the Coulomb model in cases with strong delocalization and help to deal with different species that have the same valency.

Finally, it should be mentioned that GOAC can perform well on several other research questions concerning configurations, also beyond the scaling tests and configurational selections shown in this work. For example, GOAC might be employed to study charge-ordering of ions that disproportionate into different valences (cf. Supplementary Fig. 12) or charge-ordering in general. In fact, results for the example of a layered oxide sodium-ion cathode material in Supplementary Fig. 13 indicate that a strong correlation of energies of charge-orderings of differently charged Mn ions exists between Coulomb and DFT energies. In the case of layered oxide cathodes, GOAC optimizations also make it possible to study transition metal layer charge-orderings and Na-orderings in a coupled fashion to get an idea if and how these two orderings are coupled. The examples presented in Supplementary Fig. 14 indicate that GOAC might also be successfully applied to this problem setting as DFT calculations show similar trends to the GOAC optimizations. Further studies might show in more detail how GOAC can be employed to study various types of orderings in layered oxide materials and how well the results match selected references, e.g., DFT calculations. Lastly, GOAC was recently also applied to study the single-phase/two-phase charging characteristics of lithium iron phosphate (LFP)⁸⁸. Results showed that electrostatic optimization can reproduce the critical particle size from experiment for the switch from the single-phase to the two-phase charging mechanism as well as the energetically most favourable interface orientation between the two phases. This indicates that GOAC could be used to study similar materials in the future.

In summary, GOAC can be a valuable tool for computational research on novel energy materials and other complex materials to determine likely candidate structures for low or lowest energy atomistic configurations with comparably little computational resources.

Methods

DFT reference calculations

The DFT reference calculations shown in Fig. 2 were performed with the VIENNA AB INITIO SIMULATION PACKAGE (VASP)⁸⁹ in the projector augmented wave (PAW) scheme⁹⁰ with the Perdew–Burke–Ernzerhof (PBE) exchange-correlation functional⁹¹. An energy cut-off of 520 eV along with a convergence criterion of 10⁻⁴ eV, a 1 × 1 × 2 Γ-centred k-point grid, and spin-polarization were employed. Single-point calculations without any geometry optimization were performed to allow for a fair comparison to Coulomb energies. The exact geometries can be found in the “Examples” folder of the project repository (see Data Availability statement). Structure models in this work were visualized with the VESTA software⁹².

Data availability

The underlying code, GOAC, that was designed for this study is openly available on the Forschungszentrum Jülich GitLab and can be found here: https://iffgit.fz-juelich.de/k.koester/goac. All raw data and input files for the examples shown in this work can be found in the Forschungszentrum Jülich GitLab project in the “Examples” folder.

Code availability

The underlying code, GOAC, that was designed for this study is openly available on the Forschungszentrum Jülich GitLab and can be found here: https://iffgit.fz-juelich.de/k.koester/goac.

References

Chen, S. et al. Compositional dependence of structural and electronic properties of Cu₂ZnSn(S,Se)₄ alloys for thin film solar cells. Phys. Rev. B 83, 125201 (2011).
Boyd, C. C., Cheacharoen, R., Leijtens, T. & McGehee, M. D. Understanding degradation mechanisms and improving stability of perovskite photovoltaics. Chem. Rev. 119, 3418–3451 (2019).
Anantharamulu, N. et al. A wide-ranging review on nasicon type materials. J. Mater. Sci. 46, 2821–2837 (2011).
Li, W., Erickson, E. M. & Manthiram, A. High-nickel layered oxide cathodes for lithium-based automotive batteries. Nat. Energy 5, 26–34 (2020).
Lun, Z. et al. Cation-disordered rocksalt-type high-entropy cathodes for li-ion batteries. Nat. Mater. 20, 214–221 (2021).
Sarkar, A. et al. High entropy oxides for reversible energy storage. Nat. Commun. 9, 3400 (2018).
Fabbri, E., Pergolesi, D. & Traversa, E. Materials challenges toward proton-conducting oxide fuel cells: a critical review. Chem. Soc. Rev. 39, 4355–4369 (2010).
Yan, S. et al. Perovskite solid-state electrolytes for lithium metal batteries. Batteries 7, 75 (2021).
Bai, X., Duan, Y., Zhuang, W., Yang, R. & Wang, J. Research progress in li-argyrodite-based solid-state electrolytes. J. Mater. Chem. A 8, 25663–25686 (2020).
Shin, D. O. et al. Synergistic multi-doping effects on the Li₇La₃Zr₂O₁₂ solid electrolyte for fast lithium ion conduction. Sci. Rep. 5, 18053 (2015).
Wang, C. et al. Garnet-type solid-state electrolytes: materials, interfaces, and batteries. Chem. Rev. 120, 4257–4300 (2020).
Oses, C., Toher, C. & Curtarolo, S. High-entropy ceramics. Nat. Rev. Mater. 5, 295–309 (2020).
Li, H., Lai, J., Li, Z. & Wang, L. Multi–sites electrocatalysis in high–entropy alloys. Adv. Functional Mater. 31, 2106715 (2021).
Kumar Katiyar, N., Biswas, K., Yeh, J.-W., Sharma, S. & Sekhar Tiwary, C. A perspective on the catalysis using the high entropy alloys. Nano Energy 88, 106261 (2021).
Ma, Z. et al. High entropy semiconductor AgMnGeSbTe₄ with desirable thermoelectric performance. Adv. Functional Mater. 31, 2103197 (2021).
Yarema, O., Yarema, M. & Wood, V. Tuning the composition of multicomponent semiconductor nanocrystals: the case of i–iii–vi materials. Chem. Mater. 30, 1446–1461 (2018).
He, Q., Yu, B., Li, Z. & Zhao, Y. Density functional theory for battery materials. Energy Environ. Mater. 2, 264–279 (2019).
Zhang, T., Li, D., Tao, Z. & Chen, J. Understanding electrode materials of rechargeable lithium batteries via DFT calculations. Prog. Nat. Sci. Mater. Int. 23, 256–272 (2013).
Wang, Y., Yu, B., Xiao, J., Zhou, L. & Chen, M. Application of first principles computations based on density functional theory (DFT) in cathode materials of sodium-ion batteries. Batteries 9, 86 (2023).
d’Avezac, M. & Zunger, A. Identifying the minimum-energy atomic configuration on a lattice: Lamarckian twist on Darwinian evolution. Phys. Rev. B 78, 064102 (2008).
Islam, M. S. & Fisher, C. A. J. Lithium and sodium battery cathode materials: computational insights into voltage, diffusion and nanostructural properties. Chem. Soc. Rev. 43, 185–204 (2014).
Zhang, R.-Z. & Reece, M. J. Review of high entropy ceramics: design, synthesis, structure and properties. J. Mater. Chem. A 7, 22148–22162 (2019).
Toda-Caraballo, I., Wróbel, J. S., Nguyen-Manh, D., Pérez, P. & Rivera-Díaz-del Castillo, P. E. J. Simulation and modeling in high entropy alloys. JOM 69, 2137–2149 (2017).
Huo, W. et al. High-entropy materials for electrocatalytic applications: a review of first principles modeling and simulations. Mater. Res. Lett. 11, 713–732 (2023).
Velický, B. Theory of electronic transport in disordered binary alloys: coherent-potential approximation. Phys. Rev. 184, 614–627 (1969).
Wei, S., Ferreira, L. G., Bernard, J. E. & Zunger, A. Electronic properties of random alloys: special quasirandom structures. Phys. Rev. B Condens. Matter 42, 9622–9649 (1990).
Laks, D. B., Ferreira, L. G., Froyen, S. & Zunger, A. Efficient cluster expansion for substitutional systems. Phys. Rev. B Condens. Matter 46, 12587–12605 (1992).
Sanchez, J. M. Cluster expansions and the configurational energy of alloys. Phys. Rev. B Condens. Matter 48, 14013–14015 (1993).
Bellaiche, L. & Vanderbilt, D. Virtual crystal approximation revisited: application to dielectric and piezoelectric properties of perovskites. Phys. Rev. B Condens. Matter 61, 7877–7882 (2000).
Sorkin, V., Tan, T. L., Yu, Z. G. & Zhang, Y. W. Generalized small set of ordered structures method for the solid-solution phase of high-entropy alloys. Phys. Rev. B 102, 174209 (2020).
Ångqvist, M. et al. ICET – a Python library for constructing and sampling alloy cluster expansions. Adv. Theory Simul. 2, 1900015 (2019).
Kostiuchenko, T., Körmann, F., Neugebauer, J. & Shapeev, A. Impact of lattice relaxations on phase transitions in a high-entropy alloy studied by machine-learning potentials. npj Computational Mater. 5, 55 (2019).
Yuan, X. et al. Active learning to overcome exponential-wall problem for effective structure prediction of chemical-disordered materials. npj Computational Mater. 9, 12 (2023).
Ferrari, A. et al. Frontiers in atomistic simulations of high entropy alloys. J. Appl. Phys. 128, 150901 (2020).
Tetsassi Feugmo, C. G., Ryczko, K., Anand, A., Singh, C. V. & Tamblyn, I. Neural evolution structure generation: high entropy alloys. J. Chem. Phys. 155, 044102 (2021).
Peng, Q. et al. Active-learning search for unitcell structures: a case study on Mg₃Bi₂₋ₓSbₓ. Computational Mater. Sci. 226, 112260 (2023).
Yaghoobi, M. & Alaei, M. Machine learning for compositional disorder: a comparison between different descriptors and machine learning frameworks. Computational Mater. Sci. 207, 111284 (2022).
Lian, J.-C. et al. Highly efficient tree search algorithm for irreducible site-occupancy configurations. Phys. Rev. B 105, 014201 (2022).
Grau-Crespo, R., Hamad, S., Catlow, C. R. A. & de Leeuw, N. H. Symmetry-adapted configurational modelling of fractional site occupancy in solids. J. Phys. Condens. Matter 19, 256201 (2007).
Hart, G. L. W. & Forcade, R. W. Algorithm for generating derivative structures. Phys. Rev. B 77, 224115 (2008).
Ong, S. P. et al. Python materials genomics (pymatgen): a robust, open-source Python library for materials analysis. Computational Mater. Sci. 68, 314–319 (2013).
Mustapha, S. et al. On the use of symmetry in configurational analysis for the simulation of disordered solids. J. Phys. Condens. Matter 25, 105401 (2013).
D’Arco, P. et al. Symmetry and random sampling of symmetry independent configurations for the simulation of disordered solids. J. Phys. Condens. Matter 25, 355401 (2013).
Erba, A. et al. CRYSTAL23: a program for computational solid state physics and chemistry. J. Chem. Theory Comput. 19, 6891–6932 (2023).
Okhotnikov, K., Charpentier, T. & Cadars, S. Supercell program: a combinatorial structure-generation approach for the local-level modeling of atomic substitutions and partial occupancies in crystals. J. Cheminformatics 8, 17 (2016).
Lian, J.-C., Wu, H.-Y., Huang, W.-Q., Hu, W. & Huang, G.-F. Algorithm for generating irreducible site-occupancy configurations. Phys. Rev. B 102, 134209 (2020).
Prayogo, G. I. et al. Shry: application of canonical augmentation to the atomic substitution problem. J. Chem. Inf. Model. 62, 2909–2915 (2022).
Ewald, P. P. Die Berechnung optischer und elektrostatischer Gitterpotentiale. Ann. Der Phys. 369, 253–287 (1921).
Toukmaji, A. Y. & Board, J. A. Ewald summation techniques in perspective: a survey. Comput. Phys. Commun. 95, 73–92 (1996).
Jang, S.-H., Jalem, R. & Tateyama, Y. EwaldSolidSolution: a high-throughput application to quickly sample stable site arrangements for ionic solid solutions. J. Phys. Chem. A 127, 5734–5744 (2023).
Lee, B. D. et al. Argyrodite configuration determination for DFT and AIMD calculations using an integrated optimization strategy. RSC Adv. 12, 31156–31166 (2022).
Fraser, A. S. Simulation of genetic systems by automatic digital computers I. Introduction. Aust. J. Biol. Sci. 10, 484 (1957).
Han, W. G., Park, W. B., Singh, S. P., Pyo, M. & Sohn, K.-S. Determination of possible configurations for Li_0.5CoO₂ delithiated Li-ion battery cathodes via DFT calculations coupled with a multi-objective non-dominated sorting genetic algorithm (NSGA-III). Phys. Chem. Chem. Phys. 20, 26405–26413 (2018).
Dieterich, J. M. & Hartke, B. OGOLEM: global cluster structure optimisation for arbitrary mixtures of flexible molecules. a multiscaling, object-oriented approach. Mol. Phys. 108, 279–291 (2010).
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953).
Binninger, T., Ting, Y.-Y., Kowalski, P. M. & Eikerling, M. H. Optimization of ionic configurations in battery materials by quantum annealing. Phys. Rev. B 110, L180202 (2024).
Ferrari, A., Körmann, F., Asta, M. & Neugebauer, J. Simulating short-range order in compositionally complex materials. Nat. Computational Sci. 3, 221–229 (2023).
Binninger, T., Marcolongo, A., Mottet, M., Weber, V. & Laino, T. Comparison of computational methods for the electrochemical stability window of solid-state electrolyte materials. J. Mater. Chem. A 8, 1347–1359 (2020).
van de Walle, A. et al. Efficient stochastic generation of special quasirandom structures. Calphad 42, 13–18 (2013).
van de Walle, A., Asta, M. & Ceder, G. The alloy theoretic automated toolkit: a user guide. Calphad 26, 539–553 (2002).
Barroso-Luque, L. et al. smol: a Python package for cluster expansions and beyond. J. Open Source Softw. 7, 4504 (2022).
Hall, S. R., Allen, F. H. & Brown, I. D. The crystallographic information file (CIF): a new standard archive file for crystallography. Acta Crystallogr. Sect. A Found. Crystallogr. 47, 655–685 (1991).
Dagum, L. & Menon, R. OpenMP: an industry standard API for shared-memory programming. IEEE Computational Sci. Eng. 5, 46–55 (1998).
Gurobi Optimization, LLC. Gurobi optimizer reference manual, https://www.gurobi.com (2023).
Moradabadi, A. & Kaghazchi, P. Defect chemistry in cubic Li_6.25Al_0.25La₃Zr₂O₁₂ solid electrolyte: a density functional theory study. Solid State Ion. 338, 74–79 (2019).
Wang, Q. et al. Unlocking anionic redox activity in O3-type sodium 3d layered oxides via Li substitution. Nat. Mater. 20, 353–361 (2021).
Kim, J. C. et al. Direct observation of alternating octahedral and prismatic sodium layers in O3–type transition metal oxides. Adv. Energy Mater. 10, 2001151 (2020).
Voronina, N. et al. Unveiling the role of ruthenium in layered sodium cobaltite toward high–performance electrode enabled by anionic and cationic redox. Adv. Energy Mater. 13, 2302017 (2023).
Kim, H.-J. et al. Synergetic impact of dual substitution on anionic–cationic activity of P2-type sodium manganese oxide. Energy Storage Mater. 66, 103224 (2024).
Pahari, D., Chowdhury, A., Das, D., Paul, T. & Puravankara, S. The evolution of structure–property relationship of P2-type Na_0.67Ni_0.33Mn_0.67O₂ by vanadium substitution and organic electrolyte combinations for sodium-ion batteries. J. Solid State Electrochem. 27, 2067–2082 (2023).
Faber, F., Lindmaa, A., von Lilienfeld, O. A. & Armiento, R. Crystal structure representations for machine learning models of formation energies. Int. J. Quantum Chem. 115, 1094–1101 (2015).
Gingrich, T. R. & Wilson, M. On the Ewald summation of Gaussian charges for the simulation of metallic surfaces. Chem. Phys. Lett. 500, 178–183 (2010).
Zhang, H., Ge, H., Yang, J. & Tong, Y. Review of vehicle routing problems: models, classification and solving algorithms. Arch. Computational Methods Eng. 29, 195–221 (2022).
Shamsaei, F. & van Vyve, M. Solving integrated production and condition-based maintenance planning problems by MIP modeling. Flex. Serv. Manuf. J. 29, 184–202 (2017).
Kronqvist, J., Bernal, D. E., Lundell, A. & Grossmann, I. E. A review and comparison of solvers for convex MINLP. Optim. Eng. 20, 397–455 (2019).
Thachuk, C., Shmygelska, A. & Hoos, H. H. A replica exchange Monte Carlo algorithm for protein folding in the HP model. BMC Bioinforma. 8, 342 (2007).
Lipowski, A. & Lipowska, D. Roulette-wheel selection via stochastic acceptance. Phys. A Stat. Mech. Appl. 391, 2193–2196 (2012).
Dugan, N. & Erkoç, Ş. Genetic algorithm–Monte Carlo hybrid geometry optimization method for atomic clusters. Computational Mater. Sci. 45, 127–132 (2009).
Sakae, Y., Hiroyasu, T., Miki, M., Ishii, K. & Okamoto, Y. Conformational search simulations of TRP-cage using genetic crossover. Mol. Simul. 41, 1045–1049 (2015).
Kim, H. et al. Ab initio study of the sodium intercalation and intermediate phases in Na_0.44MnO₂ for sodium-ion battery. Chem. Mater. 24, 1205–1211 (2012).
Delmas, C., Fouassier, C. & Hagenmuller, P. Structural classification and properties of the layered oxides. Phys. B+C. 99, 81–85 (1980).
Madelung, E. Das elektrische Feld in Systemen von regelmäßig angeordneten Punktladungen. Phys. Z. 19, 32 (1918).
Sakamoto, Y. Madelung constants of simple crystals expressed in terms of Born’s basic potentials of 15 figures. J. Chem. Phys. 28, 164–165 (1958).
Redhammer, G. J. et al. Wet-environment-induced structural alterations in single- and polycrystalline LLZTO solid electrolytes studied by diffraction techniques. ACS Appl. Mater. Interfaces 13, 350–359 (2021).
Kannan, S. & Zacharias, M. Simulated annealing coupled replica exchange molecular dynamics–an efficient conformational sampling method. J. Struct. Biol. 166, 288–294 (2009).
Yao, L. et al. High–entropy and superstructure–stabilized layered oxide cathodes for sodium–ion batteries. Adv. Energy Mater. 12, 2201989 (2022).
Ade, P. A. R. et al. Planck 2015 results. Astron. Astrophys. 594, A13 (2016).
Binninger, T. et al. Simulating charging characteristics of lithium iron phosphate by electro-ionic optimization on a quantum annealer. http://arxiv.org/pdf/2503.10581 (2025).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B Condens. Matter. 54, 11169–11186 (1996).
Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B Condens. Matter. 50, 17953–17979 (1994).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Momma, K. & Izumi, F. VESTA: a three-dimensional visualization system for electronic and structural analysis. J. Appl. Crystallogr. 41, 653–658 (2008).
Thörnig, P. JURECA: data centric and booster modules implementing the modular supercomputing architecture at Jülich supercomputing centre. J. Large Scale Res. Facil. 7, A182 (2021).

Acknowledgements

The presented work was carried out within the framework of the Helmholtz Association’s program Materials and Technologies for the Energy Transition, Topic 2: Electrochemical Energy Storage. The authors gratefully acknowledge computation time granted through JARA HPC on the supercomputer JURECA⁹³ at Forschungszentrum Jülich under Grant No. jiek12. K.K. and P.K. thank the “Deutsche Forschungsgemeinschaft” (DFG, German Research Foundation) for financial support under project No. 501562980.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Materials Synthesis and Processing (IMD-2), Institute of Energy Materials and Devices, Forschungszentrum Jülich GmbH, Jülich, Germany
Konstantin Köster & Payam Kaghazchi
MESA+ Institute, University of Twente, Enschede, NH, The Netherlands
Konstantin Köster & Payam Kaghazchi
Theory and Computation of Energy Materials (IET-3), Institute of Energy Technologies, Forschungszentrum Jülich GmbH, Jülich, Germany
Tobias Binninger

Authors

Konstantin Köster
Tobias Binninger
Payam Kaghazchi

Contributions

K.K. performed the development, testing, and study design. T.B. supported the implementation of the binary optimization model and REMC methods. All authors were involved in the preparation and revision of the manuscript. P.K. supervised the study and was the lead P.I.

Corresponding author

Correspondence to Payam Kaghazchi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Köster, K., Binninger, T. & Kaghazchi, P. Optimization of Coulomb energies in gigantic configurational spaces of multi-element ionic crystals. npj Comput Mater 11, 202 (2025). https://doi.org/10.1038/s41524-025-01690-7

Received: 02 June 2025
Accepted: 09 June 2025
Published: 01 July 2025
Version of record: 01 July 2025
DOI: https://doi.org/10.1038/s41524-025-01690-7

This article is cited by

Probing entropic control of stacking phase preference in layered oxide cathodes for sodium-ion batteries via machine-learning potentials
- Liang-Ting Wu
- Zhong-Lun Li
- Jyh-Chiang Jiang
npj Computational Materials (2026)
