Abstract
The semi-empirical pseudopotential method (SEPM) has been widely applied to provide computational insights into the electronic structure, photophysics, and charge carrier dynamics of nanoscale materials. We present “DeepPseudopot”, a machine-learned atomistic pseudopotential model that extends the SEPM framework by combining a flexible neural network representation of the local pseudopotential with parameterized non-local and spin-orbit coupling terms. Trained on bulk quasiparticle band structures and deformation potentials from GW calculations, the model captures many-body and relativistic effects with very high accuracy across diverse semiconducting materials, as illustrated for silicon and group III-V semiconductors. DeepPseudopot’s accuracy, efficiency, and transferability make it well-suited for data-driven in silico design and discovery of novel optoelectronic nanomaterials.
Introduction
Semiconductor nanocrystals (NCs) exhibit size-dependent electronic and optical properties that enable their applications in a wide range of technologies1. The finite size leads to the discretization of electronic and vibrational states2, offering tunable fundamental band gaps3, optical absorption4, Auger lifetimes5, spectral linewidth6, and exciton-cooling dynamics7 that differ drastically from the corresponding bulk materials. Furthermore, engineered nano-heterostructures—such as core-shell NCs8, alloyed NCs9, NC arrays10, and systems with point defects11—can exhibit enhanced or novel properties. The ability to systematically tune composition, size, shape, and heterostructure underscores the potential of computational screening to accelerate the design and discovery of NC materials with tailored properties. Therefore, it is crucial to develop computational methods that can accurately describe the quantum properties of emerging NC materials of experimentally relevant sizes and shapes—including quasiparticle electronic structure, optical excitations, and electron-phonon couplings—within a feasible computational cost.
First-principles methods such as density functional theory (DFT) are widely used to study the electronic structure of materials12,13. However, due to the limitations of approximate exchange-correlation functionals, DFT often underestimates band gaps and yields inaccurate excitation energies compared to experimental measurements14. Several approaches have been used to partially correct these deficiencies at the mean-field level, including hybrid exchange-correlation functionals, the modified Becke-Johnson (mBJ) functional15, and the DFT + U method16,17. To address these limitations more systematically, state-of-the-art approaches employ many-body perturbation theory (MBPT) within the GW approximation, which more accurately accounts for electron-electron interactions via the expansion of the self-energy and provides improved predictions of quasiparticle energies and excited states18,19,20,21. Spin-orbit coupling can be incorporated into this approach, and optical absorption spectra can be obtained by solving the Bethe-Salpeter equation (BSE) on top of GW22,23. Despite their improved accuracy, these first-principles methods incur a high computational cost, rendering them impractical for routine simulations of large NCs in the moderate to weak confinement regimes24. In particular, GW/BSE calculations scale at least quartically with system size, further compounding this challenge.
On the other hand, semi-empirical methods, such as the pseudopotential methods25,26,27, tight-binding (TB) models28,29, and Wannier-function-based TB models30,31 use parametrized Hamiltonians to simplify the electronic structure problem and significantly lower computational cost. The parameters are usually fitted to experimental data or high-level first-principles calculations (such as GW/BSE), offering alternatives for modeling large NC systems with a good balance of computational cost and accuracy32. Particularly, the local density-derived semi-empirical pseudopotential method (SEPM) and its variants have seen many fruitful applications in diverse semiconductor NC systems, providing insights into the electronic structure, photophysics, spectroscopy, and charge carrier dynamics33,34,35. A simple functional form that interpolates well across different form factors was used to describe the pseudopotential in reciprocal space, which was then numerically Fourier transformed to obtain the real-space potential, enabling quick adaptation to nanostructures with broken translational symmetry.
The resulting real-space NC Hamiltonians can be partially diagonalized with reduced computational cost using iterative methods such as filter diagonalization or Lanczos algorithms to target quasiparticle eigenstates near the band edge36,37,38. Later developments extended the SEPM to include non-local terms, spin-orbit coupling (SOC), local strain from deformation, and long-range effects35,39,40,41. Such extensions introduced more parameters in the pseudopotentials and increased the number of first-principles or experimental properties for the “fitting” procedure, such as spinor band structure, effective masses, deformation potentials, and electron-phonon coupling tensors.
Meanwhile, machine learning models have become popular in computational materials science research42,43,44,45. By leveraging symmetry preservation and flexible function approximators (such as neural networks, kernel methods, and graph neural networks), these models learn an accurate representation of atomic interactions and can achieve near ab initio accuracy with a fraction of the computational cost46,47,48,49. Many of these models take the form of machine-learned interatomic potentials (MLIPs)50,51,52,53,54,55,56, which approximate the Born-Oppenheimer potential energy surface (PES) and enable efficient, transferable predictions of total energies and atomic forces. Machine learning has also been integrated into electronic structure predictions by learning diverse representations57, including mean-field Hamiltonians58,59,60, electron densities61,62, many-body Green’s functions63, transferable pseudopotentials64,65,66, and tight-binding models67,68, enabling efficient simulation of the electronic structure in large and complex systems. However, extending these models to describe more complex photophysical phenomena, accounting for non-local correlations, relativistic effects, and deformations resulting from electron-phonon couplings, remains an active area of development. Previous literature has mainly relied on DFT as the source of training data, which limits the ability to accurately capture band-edge physics and optical properties in semiconducting materials. Incorporating training data beyond the density functional approximation, such as MBPT in the GW approximation, offers a promising path forward. In this post-DFT regime, training directly on quasiparticle eigenenergies becomes essential, as it circumvents the need to approximate inherently non-local, frequency-dependent self-energy terms with effective local potentials or densities.
In this work, we explore the use of machine learning techniques to parametrize semi-empirical pseudopotentials, facilitating their adaptation for computing the electronic, optical, and dynamical properties of novel nanomaterials. We developed a transferable deep-learning atomistic pseudopotential surrogate model (named “DeepPseudopot”), which combines a neural network local pseudopotential that captures local screened interactions, a non-local angular momentum-dependent correction term, and a spin-orbit coupling term to accurately reproduce electronic properties of extended bulk systems. We leverage the flexibility of neural network architectures and the universal approximation theorem69 to model the local pseudopotential in reciprocal space. Combined with the non-local and spin-orbit coupling terms, our model can capture the pseudo-core potential with high precision and incorporate many-body electron interactions beyond the density functional approximation. The DeepPseudopot model is trained to reproduce bulk band structure energies across densely sampled high-symmetry paths in the Brillouin zones and hydrostatic volume deformation potentials obtained from DFT+GW calculations of known lattice phases of semiconductor materials. The model parameters, including the weights and biases of the neural network, are updated using gradients from the backpropagation algorithm, which significantly accelerates the fitting process compared to using a simple functional form and enables locating better fits with lower mean squared error (MSE). We demonstrate that properties like interband transition energies, effective masses, band dispersion, and deformation potentials are accurately captured by our DeepPseudopot model in example systems like Si and group III-V semiconductors.
We also show that the resulting atomistic pseudopotentials are transferable, enabling efficient quasiparticle electronic structure calculations for large nanoclusters, alloyed bulk and confined systems at the DFT+GW level of theory with significantly lower computational cost. The DeepPseudopot model also integrates seamlessly with existing methods that further compute coupled electron-hole excitations, optical absorption spectra, electron-phonon coupling, and charge carrier dynamics, thereby enabling high-throughput exploration and design of nanomaterials.
This paper is structured as follows. In “Machine learning pseudopotential model and workflow”, we describe the deep-learning pseudopotential (“DeepPseudopot”) Hamiltonian and give details of the model training workflow, including data preparation, loss function construction, and model parameter optimization. In “Application to Si allotropes”, we demonstrate the DeepPseudopot model on the prototypical Si system. We discuss the advantages of the flexible neural network pseudopotential over the simple functional forms of traditional SEPM: a better fit to the bulk band energies and increased transferability to lattice phases that are not present in the training data. In “Application to III-V semiconductors and alloyed nanocrystals”, we apply DeepPseudopot to group III-V semiconductor systems (InAs, InP, GaAs, GaP) and the corresponding nanomaterials. We demonstrate that a single transferable atomistic DeepPseudopot model can capture the electron interactions across a class of materials. We also show qualitative agreement with experimental measurements of optoelectronic properties and electron-phonon interactions in binary III-V NCs as well as in their alloyed derivatives. In “Discussion”, we conclude and give an outlook for future development of machine-learned pseudopotential methods in solid state physics and nanomaterials science.
Results
Machine learning pseudopotential model and workflow
The machine learning semi-empirical pseudopotential model employs a non-local single-electron Hamiltonian to compute the quasiparticle band structures and deformation potentials in a plane wave spinor basis, as well as the electronic structure of nanoscale systems using a real-space grid basis. The Hamiltonian is given by
\(\hat{H}=\hat{T}+{\hat{V}}_{{\rm{loc}}}+{\hat{V}}_{{\rm{nl}}}+{\hat{V}}_{{\rm{soc}}},\)
where \(\hat{T}\) is the kinetic energy operator, \({\hat{V}}_{{\rm{loc}}}\) is the local pseudopotential that acts equally on all angular momentum channels, \({\hat{V}}_{{\rm{nl}}}\) is an angular momentum-dependent correction to the local pseudopotential, and \({\hat{V}}_{{\rm{soc}}}\) is the spin-orbit dependent pseudopotential. The pseudopotential terms are given as a sum over atom-centered potentials within the simulation cell, which corresponds to a single unit cell for bulk systems or the full nanocrystal for finite systems.
This approach generates the effective potential of a given geometry configuration in a single pass, bypassing the need for computationally intensive self-consistent field (SCF) iterations.
The local pseudopotential is modeled using a multi-layer fully connected neural network70 (as illustrated in Fig. 1 and Eq. (3)), taking as input the reciprocal space distance \(G=\left\vert {{\bf{G}}}_{i}-{{\bf{G}}}_{j}\right\vert\), where \({{\bf{G}}}_{i}\) and \({{\bf{G}}}_{j}\) are vectors of the reciprocal space basis. The output tensor size is equal to the number of atom types in the system(s). To enforce the decay of the local pseudopotential in reciprocal space and improve convergence with respect to the kinetic energy cutoff, the activation function in the final layer is replaced with a Gaussian function. The local pseudopotential is given by:
where \({h}_{i}(x)=\sigma ({W}_{i}x+{b}_{i})\) is the output of the i-th hidden layer, σ is an activation function, and \({W}_{i}\) and \({b}_{i}\) are the weight and bias tensors70. In the plane wave spinor basis \(\left\vert {\bf{K}},s\right\rangle\), the local pseudopotential Hamiltonian matrix elements at wavevector k are expressed as:
where K = k + G for all G of the reciprocal space basis, \({e}^{i\left({{\bf{G}}}_{i}-{{\bf{G}}}_{j}\right)\cdot {{\bf{R}}}_{{\mathbf{\alpha }}}}\) is the structure factor \({S}^{\alpha }\left(G\right)\) for atom α, and Ω is the unit cell volume. We assume spherical symmetry of the local pseudopotential around each atom, which simplifies and accelerates the inverse Fourier transform to real space. The atomic potential does not explicitly encode neighbor information, and is shown to be sufficient for the systems studied here. Environment-dependent descriptors can be systematically incorporated into the model to address more complex systems, as discussed in “Discussion”. The resulting continuous local potential can be used to construct the nanocrystal potential on a grid basis. Asymmetry in the total electronic potential is captured by the non-local and spin-orbit coupling terms.
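As a rough illustration of how the local matrix elements described above could be assembled, the sketch below builds the local-potential block for a toy two-atom cell with a single atom type. The Gaussian form factor `v_of_g`, the cell volume, the atomic positions, and the four reciprocal vectors are placeholder assumptions standing in for the trained neural network and a real plane wave basis; the spin index is omitted because the local term is diagonal in spin.

```python
import cmath
import math

def v_of_g(g):
    """Hypothetical spherically symmetric form factor v(|G|).
    Stands in for the neural-network output; not the trained model."""
    return -1.0 * math.exp(-0.5 * g * g)

def local_matrix(g_vectors, positions, volume):
    """V[i][j] = (1/volume) * sum_alpha exp(i (G_i - G_j) . R_alpha) * v(|G_i - G_j|)."""
    n = len(g_vectors)
    mat = [[0j] * n for _ in range(n)]
    for i, gi in enumerate(g_vectors):
        for j, gj in enumerate(g_vectors):
            dg = [a - b for a, b in zip(gi, gj)]
            norm = math.sqrt(sum(c * c for c in dg))
            total = 0j
            for r in positions:
                # Structure-factor phase for the atom at position r
                phase = sum(c * x for c, x in zip(dg, r))
                total += cmath.exp(1j * phase) * v_of_g(norm)
            mat[i][j] = total / volume
    return mat

# Toy inputs (arbitrary units, chosen only for illustration)
g_vectors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
positions = [(0.0, 0.0, 0.0), (1.3, 1.3, 1.3)]
V = local_matrix(g_vectors, positions, volume=10.0)
```

Because the form factor is real and depends only on the magnitude of the difference vector, the resulting matrix is Hermitian by construction, which is what a Hermitian eigensolver requires.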
Fig. 1: a Reference data generation. Quasiparticle band structures and hydrostatic deformation potentials are computed using DFT+GW for multiple crystal structures. b Model setup. The atomistic machine learning model is initialized, with the local pseudopotential represented by a neural network and the non-local and spin-orbit coupling terms modeled by parameterized functional forms. c Hamiltonian construction and model training. The DeepPseudopot Hamiltonian is constructed from structure factors, wavevector data, and the model pseudopotentials, then diagonalized to obtain the predicted quasiparticle band structures and deformation potentials. The model is trained by minimizing the loss function based on these properties.
The non-local and SOC pseudopotentials correct the local term, capturing angular-momentum-dependent and relativistic effects. In our model, both terms are assumed to act only on the l = 1 angular momentum channel (we assume l = 0 is the local channel) and are represented using simple analytic forms inspired by earlier work25,27,40,71.
where \({\hat{P}}_{l = 1}^{\alpha }\) is the projector onto the l = 1 orbitals of atom type α, \({\hat{{\bf{L}}}}^{\alpha }\) is the orbital angular momentum operator, \(\hat{{\bf{S}}}\) is the spin operator, and \({\theta }_{{\rm{nl1}}}^{\alpha },{\theta }_{{\rm{nl2}}}^{\alpha },{\theta }_{{\rm{soc}}}^{\alpha }\) are the non-local and SOC parameters. We choose to use simple piece-wise Gaussian functions with adjustable prefactors instead of neural networks due to the otherwise prohibitively high computational cost of evaluating and converging the matrix elements in the plane wave spinor basis, given by Eq. (7). To further reduce computational overhead, the parameters ρ and w were fixed to 1.5 and 0.7 Bohr, respectively. As shown in this work, these approximations—when coupled with the flexibility of the neural network representation of the local pseudopotential—are sufficient to achieve high-quality fits to bulk band structure properties.
In the plane wave spinor basis, the matrix elements of the non-local and SOC pseudopotentials are expressed as
where the integrals are defined as \({I}_{{\rm{nl}}}=\mathop{\int}\nolimits_{0}^{\infty }dr\,{r}^{2}\,{j}_{1}\left({K}_{i}r\right)\left({\theta}_{{\rm{nl1}}}^{\alpha }{e}^{-{r}^{2}}+{\theta}_{{\rm{nl2}}}^{\alpha }{e}^{-{\left(r-\rho \right)}^{2}}\right){j}_{1}\left({K}_{j}r\right)\) and \({I}_{{\rm{soc}}}=\mathop{\int}\nolimits_{0}^{\infty }dr\,{r}^{2}\,{j}_{1}\left({K}_{i}r\right)\left({\theta}_{{\rm{soc}}}^{\alpha }{e}^{-\frac{{r}^{2}}{{w}^{2}}}\right){j}_{1}\left({K}_{j}r\right)\), \({j}_{1}\) is the spherical Bessel function of order 1, and \({{\bf{S}}}_{{s}_{i},{s}_{j}}\) are matrix elements of the spin operator.
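The two radial integrals can be evaluated numerically in a few lines. The following is a minimal sketch, assuming a composite Simpson rule on a truncated interval; the cutoff `r_max`, the number of intervals `n`, and the function names are illustrative choices rather than the authors' implementation. The values ρ = 1.5 and w = 0.7 Bohr are those quoted in the text.

```python
import math

RHO = 1.5  # Bohr, fixed as stated in the text
W = 0.7    # Bohr, fixed as stated in the text

def j1(x):
    """Spherical Bessel function of order 1, with a series form near x = 0."""
    if abs(x) < 1e-3:
        return x / 3.0 - x ** 3 / 30.0
    return math.sin(x) / (x * x) - math.cos(x) / x

def radial_integral(ki, kj, radial_fn, r_max=12.0, n=4000):
    """Composite Simpson estimate of int_0^r_max r^2 j1(ki r) f(r) j1(kj r) dr.
    n must be even; r_max is an assumed cutoff beyond which f(r) is negligible."""
    h = r_max / n
    total = 0.0
    for m in range(n + 1):
        r = m * h
        weight = 1 if m in (0, n) else (4 if m % 2 else 2)
        total += weight * r * r * j1(ki * r) * radial_fn(r) * j1(kj * r)
    return total * h / 3.0

def i_nl(ki, kj, theta1, theta2):
    """Non-local integral with the two-Gaussian radial form."""
    f = lambda r: theta1 * math.exp(-r * r) + theta2 * math.exp(-(r - RHO) ** 2)
    return radial_integral(ki, kj, f)

def i_soc(ki, kj, theta_soc):
    """Spin-orbit integral with a single Gaussian of width W."""
    f = lambda r: theta_soc * math.exp(-(r * r) / (W * W))
    return radial_integral(ki, kj, f)
```

Both integrals are symmetric under exchange of the two wavevector magnitudes and linear in the prefactors, so in a fit the prefactors can be pulled out and the Gaussian integrals tabulated once per basis.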
Highly accurate reference data from DFT+GW calculations were prepared to train the DeepPseudopot parameters, including the neural network weights and biases for the local pseudopotentials as well as the non-local and SOC parameters. Specifically, we generated band structure data along a densely sampled high-symmetry path in the Brillouin zone and extracted hydrostatic volume deformation potentials. The band structure data trains the pseudopotential model to capture electronic eigenenergies and their dispersion, while the deformation potentials quantify how the eigenenergies respond to local lattice strains, which is crucial for accurately modeling perturbative electron-phonon coupling. The MBPT correction within the GW approximation is essential in this workflow, as it addresses the self-interaction error in standard density functional approximations. Compared to conventional DFT, GW provides significantly more accurate quasiparticle band gaps and excited-state properties, yielding better agreement with experimental measurements in semiconducting materials. Hydrostatic volume deformation potentials describe the sensitivity of interband transition energies to isotropic strain:
where \({E}_{t}\) is the transition energy and V is the (deformed) cell volume. To avoid complications associated with the ambiguous absolute energy reference in periodic systems72,73, we only calculated deformation potentials for interband transitions. These quantities were evaluated using a finite difference approach by uniformly expanding and contracting the unit cell and extracting the deformation potential from the slope of the transition energy. This workflow of preparing data from a high level of theory can be easily extended to other bulk properties, including electron-phonon coupling tensors, dielectric constants, and charge densities.
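The finite difference procedure can be sketched as follows. This assumes the common convention in which the hydrostatic deformation potential is the derivative of the transition energy with respect to the logarithm of the volume; the defining equation is not reproduced in this text, so that convention, the 1% strain, and the toy energy model `toy_gap` with its numbers are all assumptions made for illustration.

```python
import math

def deformation_potential(transition_energy, v0, strain=0.01):
    """Central-difference estimate of dE_t / d ln V around the volume v0.
    `transition_energy` is any callable mapping a cell volume to an energy."""
    v_plus = v0 * (1.0 + strain)   # uniformly expanded cell
    v_minus = v0 * (1.0 - strain)  # uniformly contracted cell
    d_energy = transition_energy(v_plus) - transition_energy(v_minus)
    d_log_volume = math.log(v_plus) - math.log(v_minus)
    return d_energy / d_log_volume

def toy_gap(volume, e0=1.1, a=-8.0, v0=270.0):
    """Hypothetical transition energy, linear in ln V (values are placeholders)."""
    return e0 + a * math.log(volume / v0)

a_est = deformation_potential(toy_gap, v0=270.0)
```

In practice the callable would wrap a full band structure evaluation at each strained volume, so each deformation potential costs two additional diagonalizations per transition.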
The overall training workflow for the DeepPseudopot model is illustrated in Fig. 1. During training, we iterated over the wavevectors k and constructed the DeepPseudopot Hamiltonian, including the kinetic energy, local pseudopotential, non-local, and spin-orbit coupling terms, in a converged plane wave spinor basis. We computed the eigenvalues of the resulting Hamiltonian using the complex Hermitian eigenvalue solver (torch.linalg.eigvalsh()) from PyTorch, which allows for efficient backpropagation through the operation. Degenerate eigenvalues were distinguished by maximizing the overlap of eigenvectors between adjacent k-points, following a Wannier-like process74. We also calculated the deformation potential at specified interband transitions via the same finite difference approach. The loss function includes contributions from the band structure (BS) mean-squared error, the deformation potential (defPot) mean-squared error, and an optional decay penalty (decay) on the local pseudopotential
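To see why gradients can flow through an eigenvalue solver, it helps to recall the Hellmann-Feynman relation: for a non-degenerate eigenvalue E with normalized eigenvector ψ, the derivative of E with respect to a Hamiltonian parameter equals the expectation value of the derivative of H in ψ. The sketch below checks this on a hypothetical 2×2 real symmetric Hamiltonian using only the standard library; it illustrates the principle and is not the PyTorch code path used in the paper.

```python
import math

def lowest_eigpair(a, b, d):
    """Lowest eigenvalue and normalized eigenvector of [[a, b], [b, d]] (b != 0)."""
    mean = 0.5 * (a + d)
    half_gap = math.sqrt(0.25 * (a - d) ** 2 + b * b)
    lam = mean - half_gap
    vec = (b, lam - a)  # satisfies (H - lam) vec = 0 for b != 0
    norm = math.hypot(*vec)
    return lam, (vec[0] / norm, vec[1] / norm)

def hamiltonian(theta, t=0.3):
    """Toy model H(theta) = [[theta, t], [t, -theta]], returned as (a, b, d)."""
    return theta, t, -theta

def energy(theta):
    return lowest_eigpair(*hamiltonian(theta))[0]

def hellmann_feynman_grad(theta):
    """dE/dtheta = <psi| dH/dtheta |psi>, with dH/dtheta = diag(1, -1)."""
    _, (v0, v1) = lowest_eigpair(*hamiltonian(theta))
    return v0 * v0 - v1 * v1

theta = 0.4
analytic = hellmann_feynman_grad(theta)
numeric = (energy(theta + 1e-6) - energy(theta - 1e-6)) / 2e-6
```

The analytic and finite difference derivatives agree, which is the property that lets a loss defined on eigenvalues be differentiated with respect to every pseudopotential parameter in a single backward pass.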
where \({E}_{n{\bf{k}}}\) and \({\tilde{E}}_{n{\bf{k}}}\) are the predicted and reference eigenvalues for band n at wavevector k. \({a}_{i}^{V}\) and \({\tilde{a}}_{i}^{V}\) are the predicted and reference deformation potentials for transition i. \(f_c\,\left(G\right)=\frac{1}{1+{e}^{-k\left(G-{G}_{{\rm{cut}}}\right)}}\) is a shifted sigmoid function that smoothly penalizes non-decaying components of the local pseudopotential beyond a cutoff momentum. \({w}_{n{\bf{k}}}^{BS},{w}_{i}^{{\rm{defPot}}},{w}_{\alpha }^{{\rm{decay}}}\) are tunable hyperparameters for balancing the loss function. To emphasize accurate reproduction of band-edge physics, we used heavy weights on bands near the conduction band (CB) and valence band (VB) edges, and on k-points critical to the material’s electronic structure.
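A minimal sketch of how the three loss contributions might be combined is given below. The shifted sigmoid follows the expression quoted in the text; the normalization of the weighted errors by the sum of weights and the exact form of the decay penalty (sigmoid-weighted squared pseudopotential values) are assumptions, since the loss equation itself is not reproduced here. All function names and default values are illustrative.

```python
import math

def cutoff_sigmoid(g, g_cut, k):
    """Shifted sigmoid f_c(G) = 1 / (1 + exp(-k (G - G_cut)))."""
    return 1.0 / (1.0 + math.exp(-k * (g - g_cut)))

def weighted_mse(pred, ref, weights):
    """Weighted mean squared error (normalized by the sum of weights; an assumption)."""
    num = sum(w * (p - r) ** 2 for p, r, w in zip(pred, ref, weights))
    return num / sum(weights)

def decay_penalty(v_values, g_values, g_cut, k):
    """Assumed penalty: mean of f_c(G) * v(G)^2, large if v does not decay past G_cut."""
    terms = [cutoff_sigmoid(g, g_cut, k) * v * v for g, v in zip(g_values, v_values)]
    return sum(terms) / len(terms)

def total_loss(bands, bands_ref, w_bs, defpots, defpots_ref, w_dp,
               v_values, g_values, w_decay, g_cut=4.0, k=5.0):
    """Band structure error + deformation potential error + optional decay penalty."""
    return (weighted_mse(bands, bands_ref, w_bs)
            + weighted_mse(defpots, defpots_ref, w_dp)
            + w_decay * decay_penalty(v_values, g_values, g_cut, k))
```

Raising the weights on band-edge eigenvalues in `w_bs` is one simple way to realize the emphasis on band-edge physics described above.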
Application to Si allotropes
We first demonstrate the versatility and applicability of the DeepPseudopot model using silicon as a test case. The model was trained on DFT+GW reference data for cubic diamond phase silicon, with training and data generation procedures detailed in "Methods".
The trained silicon machine-learned pseudopotential achieves high accuracy in reproducing reference electronic structure data, as illustrated in Fig. 2. The band structure predicted by the trained DeepPseudopot model closely matches the GW reference along the entire high-symmetry path, accurately capturing both energies and dispersions. To enhance accuracy at the band edges, the training loss function applied double weights to k-points at Γ, X, and at the CBM along the Γ − X path. As shown in Fig. 2b, the resulting model reproduces interband transition energies at high-symmetry points with deviations of less than 0.050 eV. Notably, it predicts the fundamental band gap with exceptional precision: 1.136 eV from DeepPseudopot versus 1.137 eV from DFT+GW. In addition, the model accurately reproduces effective masses and deformation potentials (see Fig. 2d), outperforming fits based on simple analytical forms of the local pseudopotential. This high level of agreement indicates that the model can faithfully capture the local electronic potentials, many-body interactions, and perturbative properties in the prototypical Si system.
Fig. 2: The reference DFT+GW data (red), the DeepPseudopot model predictions (blue), and the simple functional form pseudopotential fitted using Monte Carlo sampling (yellow) are consistently color-coded across all panels. a Band structure of cubic diamond silicon. b Accuracy matrix for interband transition energies between high-symmetry points in the Brillouin zone. Grid colors and blue text show the absolute energy errors between reference values and the DeepPseudopot prediction. Yellow text shows the corresponding errors from the simple functional form pseudopotential for comparison. c Local pseudopotentials plotted in reciprocal space and real space. The simple functional form pseudopotential fitted using gradient descent is shown as grey dashed lines. d Effective masses (top) and deformation potentials (bottom). Insets show zoomed-in comparisons of effective masses. e Training loss evolution starting from a random initialization.
We also compared the training efficiency of the DeepPseudopot model against earlier semi-empirical pseudopotential frameworks27,35. Traditional pseudopotentials typically used simple functional forms with only a few tunable parameters, which were adjusted to reproduce band energies. Despite the small parameter space, prior work often relied on stochastic sampling techniques such as Monte Carlo sampling, due to the rugged parameter landscape and the complexity of the eigenvalue operator. To benchmark training performance, we implemented both Monte Carlo (MC) and gradient descent (GD) optimization for the simple functional form, with the GD gradients obtained by backpropagation in PyTorch. A common random initialization was selected: the DeepPseudopot model was first trained to reproduce the same initial pseudopotential function as the other two methods to ensure a fair comparison. As depicted in Fig. 2e, the DeepPseudopot model is more efficient, needing fewer band structure evaluations to reach a comparable fit. It achieves error levels comparable to MC at less than one-fifth of the computational cost, owing to its flexible representation, which enables better optimization with respect to the input data.
In contrast, GD on the simple functional form often becomes trapped in a suboptimal local minimum, while MC improves the fit but remains slower and less accurate than DeepPseudopot. Given the empirical nature of the training process and the complexity of the loss landscape, the advantage of DeepPseudopot is robust across runs, but the relative improvement varies with the random initialization. Although demonstrated here for the simple silicon system, the efficiency and flexibility of DeepPseudopot become increasingly important for more complex unit cells. The ability to reach a better fit, reflected in a lower minimum of the loss function, is critical for achieving the high transferability required of machine-learned pseudopotentials, as demonstrated below.
The trained DeepPseudopot model not only reproduces GW-level reference band properties and deformation potentials with high accuracy within the cubic diamond (cd) phase used for training, but also transfers well to other silicon allotropes, as shown in Fig. 3. To assess the model’s predictive performance on unseen structures, we applied it to two additional semiconducting phases of silicon: the hexagonal diamond (lonsdaleite) structure and the body-centered tetragonal (bct) structure, both of which have been studied theoretically or experimentally in the literature75,76. While these lattice structures preserve four-fold silicon atom coordination, they exhibit different bond lengths and local atomic environments compared to the cd structure. These structural variations present interesting cases for assessing the model’s transferability.
Fig. 3: Consistent with Fig. 2, reference DFT+GW data are shown in red, DeepPseudopot model predictions in blue, and simple functional form pseudopotentials in yellow. a The interband transition energies between the valence band maximum and various k-points of the conduction band edge in the lonsdaleite structure. The fundamental band gaps are highlighted. b The reference and predicted band structures of the lonsdaleite structure. c, d Same as a and b, but for the bct structure.
In both the lonsdaleite and bct structures, the DeepPseudopot model accurately reproduces the GW band dispersions and fundamental band gaps, as shown in Fig. 3. The predicted transition energies between the VBM and the CB edges at various k-points deviate by less than 0.150 eV from the GW reference. For comparison, we also computed the band structures using the traditional pseudopotential based on simple analytical functional forms, and trained using MC sampling as in Fig. 2e. Although both models achieved comparable training loss within the cd phase, the more flexible DeepPseudopot model consistently outperforms the simple functional form pseudopotential in prediction tasks on unseen allotropes. The latter severely underestimates the band gaps and misrepresents band dispersions and crossings in the lonsdaleite and bct phases (see Fig. 3b, d).
However, we note that this level of generalizability without additional retraining or explicit inclusion of atomic environment in the model is not guaranteed across more complex materials and most likely succeeds here due to the relative simplicity of the silicon phases and the similarity between the lonsdaleite, bct, and cd structures. One example where DeepPseudopot shows limited transferability is in the conduction band minimum (CBM) prediction for the bct structure. In the GW reference calculation, the CBM of the bct structure is located at the P point, with a competing minimum along the Γ − Z path only 0.044 eV higher in energy. In contrast, the DeepPseudopot prediction incorrectly identifies the CBM along the Γ − Z path (see Fig. 3c), showing the challenges of resolving very small energy differences without explicit retraining. Furthermore, the current model would not be expected to transfer well to amorphous silicon, where the local atomic environments differ significantly, including variations in silicon coordination numbers. To systematically improve transferability, one can expand the training set to include more phases65, enabling the flexible machine-learned local pseudopotential to efficiently extrapolate across diverse structural environments.
Application to III-V semiconductors and alloyed nanocrystals
The SEPM has been widely fitted across multiple crystal structures—such as wurtzite and zinc blende CdSe33,34—but its ability to generalize across alloyed systems is less explored. Here, we show how training a DeepPseudopot model on a set of four group III-V semiconductor compounds provides an accurate route to the electronic and vibronic properties of binary-compound and ternary-alloyed nanoscale crystal systems in comparison to experimental measurements.
The model was trained on DFT+GW quasiparticle band structures and deformation potentials of InAs, InP, GaAs, and GaP, prepared using the procedures described in "Methods". Importantly, spin-orbit coupling was explicitly included in the reference calculations, and each band structure was statically shifted to ensure consistent band alignment across materials. To enable transferability to alloyed systems, each elemental pseudopotential (cation or anion) was shared across the two compounds in which the element appears, without interpolation or compound-specific tuning.
As shown in Fig. 4, the DeepPseudopot model trained on group III-V semiconductors accurately reproduces the band-edge properties of all four materials. To quantify this accuracy, we measured the deviations in quasiparticle transition energies between the VBM and the CB at the Γ, X, and L points. This comparison is particularly relevant for predicting alloy behavior since GaP has its CBM at X, unlike the other three compounds, whose CBM lies at Γ. Moreover, the CB edges at Γ, X, and L in GaP are close in energy, enabling direct-to-indirect gap transitions in III-V alloys involving GaP. The DeepPseudopot model captures these CB energies and the spin-orbit splitting energies within 0.080 eV, laying the foundation for reliable electronic structure predictions in nanosystems. We note that band identity is not explicitly tracked in the training data or model output, as bands are ordered solely by energy at each k-point. As a result, true band crossings may appear visually as avoided crossings in the plotted results, particularly for certain high-energy conduction bands in Fig. 4a. Despite this, the predicted eigenvectors and corresponding charge densities qualitatively match those from mean-field calculations, further showing that the trained DeepPseudopot accurately captures the underlying physics. In Fig. 4b, we visualize the learned local pseudopotentials in real space, where clear and physically meaningful similarities emerge between the cation species and between the anion species.
Fig. 4: Reference DFT+GW data are shown in red throughout; predictions from the trained DeepPseudopot model are shown in blue. a Band structures of InAs, InP, GaAs, and GaP from the trained DeepPseudopot model (blue lines) compared to the GW reference (red dots). Insets show zoomed-in views around the VBM, highlighting the spin-orbit splitting energy. b Real-space local pseudopotentials for each element. c Predicted band structure of alloyed In0.5Ga0.5P around the Γ point. The inset shows the 8-atom zincblende conventional cell with cation substitutions used in the calculation. Atom colors: Pink-In, Green-Ga, Purple-P. d Fundamental band gaps of bulk In1−xGaxP alloy supercells.
To assess the transferability of the trained DeepPseudopot model beyond its training set, we tested its predictions on bulk III-V alloys. Specifically, we constructed 8-atom zincblende conventional cells of In1−xGaxP alloys at simple fractional compositions (x = 0.25, 0.5, 0.75), which represent the smallest special quasirandom structures commensurate with the chosen stoichiometry77,78. Each alloy supercell was relaxed prior to band structure calculations near Γ. Despite the absence of explicit In-Ga-P interactions in the training data, the predicted band structures from the DeepPseudopot model closely match the GW reference (see Fig. 4c), and the fundamental band gaps across In1−xGaxP compositions qualitatively reproduce ab initio trends with very small errors (see Fig. 4d). This test on bulk alloys demonstrates the model’s ability to generalize beyond the training set of binary semiconductor primitive cells and accurately capture electronic structures of bulk alloy supercells, thanks to its physically motivated Hamiltonian design and elemental sharing parametrization.
We then evaluated the predictive capabilities of our trained spinor, non-local DeepPseudopot model for group III-V semiconductors on a variety of nanoscale crystalline systems composed of the binary III-V compounds, focusing on their optoelectronic properties and electron-phonon coupling. Details of the calculations of electronic, optical, and electron-phonon properties in nanocrystals using the trained machine-learned pseudopotentials are provided in "Methods".
As shown in Fig. 5, the optoelectronic properties and electron-phonon couplings calculated using the DeepPseudopot model show good agreement with experimental measurements on binary semiconductor nanocrystals. Figure 5a illustrates the quantum confinement effect for InAs, InP, GaAs, and GaP NCs as a function of size. The optical gaps for all four materials correctly follow the trend of their bulk band gaps, with InAs exhibiting the smallest and GaP the largest gaps. The calculated optical gaps also agree quantitatively with experimental measurements, with mean absolute errors of 0.057 eV for InAs, 0.113 eV for InP, and 0.046 eV for GaAs nanocrystals of various sizes79,80,81,82. These results confirm that the DeepPseudopot model captures size-dependent quantum confinement trends and achieves high accuracy in predicting optoelectronic properties across binary III-V nanocrystals. As a representative example, Fig. 5b shows the computed absorption spectrum of a 4.0 nm InP NC, where the exciton state energies and their oscillator strengths (OS) are represented by bars below and above the axis, respectively. The model reproduces the experimental absorption features, capturing both the first absorption peak position and the spectral line shape.
a Size-dependent optical gaps of InAs, InP, GaAs, and GaP NCs compared with experiments (hollow squares)79,80,81,82. b Calculated (orange solid line) and experimental (black dash-dot) absorption spectra for a 4.0 nm InP NC. Exciton energies (bars below axis) and oscillator strengths (bars above axis) are shown. c Exciton-phonon coupling spectral density (green) and phonon density of states (black dashed line) of a 6.0 nm GaAs NC. d Calculated (solid dots) and experimental (hollow squares) Stokes shifts for GaAs NCs as a function of size. e Fundamental band gaps of In1−xGaxP and GaP1−xAsx ternary alloys. Bulk alloy gaps at the Γ, X, L valleys (dotted lines) were interpolated using a simple quadratic form with experimental bowing parameters87, with the lowest-energy branch at each composition highlighted as the solid line. Bulk direct-to-indirect crossover compositions are annotated in purple. The NC gaps are shown as dots colored by the dominant GaP valley character (inset). The grey line shows average NC gaps across three random alloy configurations per composition.
In Fig. 5c and d, we further validate the model’s exciton-phonon coupling calculations using GaAs NCs. The spectral density calculated using the DeepPseudopot model, shown together with the phonon density of states obtained from a force field (Fig. 5c), exhibits structured couplings primarily to a few acoustic phonon modes and strong coupling to discrete optical phonon modes, consistent with prior findings for other semiconductor NCs83,84,85. To quantitatively benchmark the overall exciton-phonon coupling strength, we computed the Stokes shift (a collective measure of exciton fine structure) and the reorganization energy, which reflects the overall electron-phonon coupling. As illustrated in Fig. 5d, the calculated Stokes shifts for GaAs NCs agree closely with experimental measurements across sizes82, underscoring the model’s capability to accurately predict exciton-phonon coupling strengths in nanoscale systems.
Inspired by the recent development of synthetic routes to alloyed III-V NCs in molten salt solvents, we also tested the transferability of the DeepPseudopot model to ternary alloyed nanoscale systems by predicting their electronic structures. Geometries of In1−xGaxP and GaP1−xAsx NCs were constructed via random ion exchange starting from pristine tetrahedral InP or GaP NCs, a procedure that mirrors the experimental synthesis pathway86. For each alloy composition, we generated three independent, randomly alloyed configurations. These alloy systems were chosen for their intriguing direct-to-indirect band gap transitions involving GaP, which are continuously tunable by adjusting the alloy composition.
Applying the DeepPseudopot model validated on binary III-V NCs, we predicted the fundamental gaps of ternary alloyed NCs and compared them to theoretical bulk trends. Figure 5e shows the bulk direct-to-indirect-gap crossover compositions, obtained via quadratic interpolations between binary compounds using GW quasiparticle interband transition energies at Γ, X, L. Bowing parameters from experimental data were used to account for deviations from linear behavior87. Experimental determination of fundamental gaps in the indirect-gap regime is challenging due to their weak optical emission, making theoretical validation especially valuable. As shown in Fig. 5e, the predicted NC fundamental gaps are consistently larger than the bulk values due to quantum confinement, and exhibit nonlinear composition dependence with inflection points closely aligning with bulk crossover compositions.
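The quadratic interpolation used for the bulk reference curves can be written compactly. The sketch below uses the standard bowing form E(x) = (1 − x) E_A + x E_B − b x(1 − x) and locates the composition at which the lowest valley stops being the direct one. The function names and the grid-scan approach to finding the crossover are assumptions made for illustration; endpoint energies and bowing parameters must be supplied by the user and are not reproduced here.

```python
import numpy as np

def alloy_gap(x, e_a, e_b, bowing):
    """Quadratic interpolation between binaries A (x = 0) and B (x = 1)."""
    x = np.asarray(x, dtype=float)
    return (1.0 - x) * e_a + x * e_b - bowing * x * (1.0 - x)

def lowest_valley(x, endpoints, bowings):
    """Lowest transition energy and its valley label at each composition.

    endpoints: dict valley -> (E_A, E_B); bowings: dict valley -> b.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    labels = list(endpoints)
    curves = np.array([alloy_gap(x, *endpoints[v], bowings[v]) for v in labels])
    idx = np.argmin(curves, axis=0)
    return curves.min(axis=0), [labels[i] for i in idx]

def crossover_composition(x, endpoints, bowings, direct="G"):
    """First grid composition at which the lowest valley is not `direct`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    _, which = lowest_valley(x, endpoints, bowings)
    for xi, valley in zip(x, which):
        if valley != direct:
            return float(xi)
    return None  # the gap stays direct over the whole grid
```

The resolution of the crossover estimate is set by the spacing of the composition grid, so a fine `np.linspace(0, 1, n)` grid is advisable.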
In addition, we evaluated the “majority representation” coefficients of the CBM states by projecting their quasiparticle wavefunctions onto the bulk GaP Bloch wavefunctions at the direct (Γ) and indirect (X, L) valleys (see Fig. 5e, inset). Each NC CBM state was classified as either “Γ-like” (blue) or “X-like” (red). The evolution of these CBM state characters closely tracks the observed linearity changes in the fundamental gap, clearly reflecting the direct-to-indirect-gap transition in In1−xGaxP and GaP1−xAsx alloyed NCs, despite the broken translational symmetry and the ill-defined nature of quasi-momentum in confined nanoscale systems.
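A projection analysis of this kind can be sketched as follows. The example is a toy version under stated assumptions: states are sampled on a common grid, the reference functions for each valley are orthonormal, and the valley weight is taken as the summed squared overlap. The function names are hypothetical, and the plane-wave helper merely stands in for the bulk GaP Bloch functions used in the actual analysis.

```python
import numpy as np

def valley_weights(psi, bloch_sets):
    """Summed squared overlaps of a state with each set of reference functions.

    psi: complex array of shape (n_grid,).
    bloch_sets: dict label -> array of shape (n_states, n_grid), rows
                orthonormal on the grid (assumed).
    """
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    weights = {}
    for label, phi in bloch_sets.items():
        overlaps = np.conj(phi) @ psi          # <phi_n | psi> for each n
        weights[label] = float(np.sum(np.abs(overlaps) ** 2))
    return weights

def dominant_valley(weights):
    """Label of the valley carrying the largest projected weight."""
    return max(weights, key=weights.get)

def toy_plane_waves(x, wavevectors):
    """Grid-normalized 1D plane waves; a stand-in for bulk Bloch functions."""
    return np.array([np.exp(1j * k * x) for k in wavevectors]) / np.sqrt(x.size)
```

In this toy picture, a smooth envelope projects mainly onto the zone-center function, while an envelope modulated at the zone-boundary wavevector projects mainly onto the zone-boundary functions, which is the distinction the "Γ-like" and "X-like" labels are meant to capture.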
Discussion
We developed DeepPseudopot, a machine-learned atomistic semi-empirical pseudopotential surrogate model capable of reproducing DFT+GW-level electronic structure properties with high precision across a diverse set of elemental and compound semiconductors. The model combines a flexible neural network representation of the local screened pseudopotentials with analytically tractable non-local and spin-orbit coupling terms to capture angular-momentum-dependent and relativistic effects. Physically motivated design choices, including species-specific potential sharing without interpolation, reciprocal-space decay regularization, and targeted loss function weighting at key bands and k-points, enable an accurate description of band-edge physics, including quasiparticle energies, deformation potentials, and effective masses. Applied to silicon and group III-V semiconductors (InAs, InP, GaAs, GaP), DeepPseudopot achieves quantitative agreement with GW reference data and significantly outperforms traditional analytic semi-empirical pseudopotentials in both accuracy and training efficiency. The model was shown to generalize well to unseen crystal phases, bulk alloys, large nanostructures, and alloyed nanostructures, capturing essential features of electronic, optical, and vibronic properties without retraining. This shows that the DeepPseudopot model can learn efficiently from a small dataset of bulk band structures and deformation potentials and can then be applied directly to perform transferable, accurate electronic structure calculations for large nanoscale systems at a fraction of the computational cost of the reference theory. DeepPseudopot is thus positioned as a broadly applicable tool for the data-driven design and discovery of complex nanomaterials with tailored properties.
Although DeepPseudopot accurately describes silicon and III-V alloys, several aspects of the model require further development. The non-local and SOC terms are currently confined to one angular momentum channel and are represented by simple analytical functions with few tunable parameters, a design choice necessitated by the high computational cost of evaluating these terms in the spinor plane-wave basis. Developing more efficient algorithms for constructing these matrices could enable higher angular momentum projections and allow for flexible neural network representations. Additionally, the local term assumes spherical symmetry; introducing symmetry-preserving descriptors of the local atomic environment, as suggested by Kim and Son65, could improve DeepPseudopot’s performance. Furthermore, the current local pseudopotential lacks an explicit long-range treatment56, which, while sufficient for capturing bulk deformation potentials and exciton-phonon coupling in nanocrystals, may not accurately model Fröhlich-type electron-phonon interactions41. These developments would broaden DeepPseudopot’s applicability to a wider range of materials and will be the subject of future work.
Methods
Machine learning
Our proposed machine-learned semi-empirical pseudopotential can be implemented conveniently in any common ML package. We built our model in PyTorch. Model parameters were optimized using the Adam algorithm88 with an initial learning rate β = 0.002 and an exponential decay scheduler. Training was parallelized over k-points for computational efficiency. Training took roughly 2000 epochs for silicon and 15,000 epochs for the III-V systems. A step-by-step model training workflow is illustrated in Fig. 1. Starting from atomic structures, the model generates the reciprocal-space basis and structure factors, constructs the kinetic, local, non-local, and SOC components of the Hamiltonian, and trains its parameters against reference GW band-structure and deformation-potential data.
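To make the Hamiltonian-construction step concrete, the following NumPy sketch assembles the kinetic and local parts of a plane-wave pseudopotential Hamiltonian at a single k-point. It is a minimal illustration under stated assumptions, not the DeepPseudopot implementation: the non-local and SOC terms are omitted, atomic units are used, the normalization of the form factor is absorbed into the user-supplied callable, and the function names, the integer search range `nmax`, and the `form_factor` dictionary are all hypothetical.

```python
import itertools
import numpy as np

def reciprocal_vectors(cell, ecut, nmax=4):
    """G vectors (rows) with kinetic energy |G|^2 / 2 <= ecut (Hartree).

    cell: 3x3 array whose rows are the lattice vectors (Bohr).
    nmax: integer search range; must be large enough for the chosen ecut.
    """
    b = 2.0 * np.pi * np.linalg.inv(cell).T   # rows are b1, b2, b3
    ints = np.array(list(itertools.product(range(-nmax, nmax + 1), repeat=3)))
    g = ints @ b
    return g[0.5 * np.sum(g**2, axis=1) <= ecut]

def local_hamiltonian(kpt, gvecs, positions, species, form_factor):
    """H_GG' = |k+G|^2/2 delta_GG' + sum_j exp(-i (G-G').tau_j) v_j(|G-G'|).

    positions: Cartesian atomic positions tau_j (Bohr).
    form_factor: dict species -> callable v(q) acting on arrays of |q|.
    """
    kinetic = 0.5 * np.sum((kpt + gvecs) ** 2, axis=1)
    dq = gvecs[:, None, :] - gvecs[None, :, :]       # all G - G' differences
    qnorm = np.linalg.norm(dq, axis=-1)
    h = np.diag(kinetic).astype(complex)
    for tau, s in zip(positions, species):
        phase = np.exp(-1j * (dq @ tau))             # per-atom structure factor
        h = h + phase * form_factor[s](qnorm)
    return h
```

Band energies at the chosen k-point then follow from `np.linalg.eigvalsh(h)`. In a trainable setting the callable `v(q)` would be replaced by the neural-network potential and the diagonalization performed with an automatic-differentiation library.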
For silicon, the local pseudopotential was parameterized by a fully connected neural network with a single hidden layer containing 20 neurons and the Continuously Differentiable Exponential Linear Unit (CELU) activation function89, except in the output layer as described in “Machine learning pseudopotential model and workflow”. He initialization90 was used to set initial weights. To promote smooth behavior in reciprocal space, a small regularization term was included in the loss function to penalize nonzero components of the local pseudopotential beyond the momentum cutoff of 4.5 Bohr−1. Spin-orbit coupling and non-local potentials were omitted during training, as SOC effects in silicon are negligible and an accurate band structure could be trained without including non-local angular momentum-dependent corrections.
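The silicon setup described above can be sketched in PyTorch as follows. Only the quantities quoted in the text are taken from the paper (one hidden layer of 20 neurons, CELU activation, He initialization, Adam with an initial learning rate of 0.002, an exponential decay scheduler, and the 4.5 Bohr−1 cutoff). Everything else is an assumption made for illustration: the scalar |q| input, the linear output layer, the decay factor `gamma`, the regularization weight `reg`, the tail range `q_max`, and the direct fit to a target curve, which stands in for the band-structure loss used in the actual training.

```python
import torch
from torch import nn

Q_CUT = 4.5  # Bohr^-1, momentum cutoff quoted in the text

class LocalPseudopotentialNet(nn.Module):
    """Maps |q| to one local form-factor value per species (assumed layout)."""

    def __init__(self, n_hidden=20, n_species=1):
        super().__init__()
        self.hidden = nn.Linear(1, n_hidden)
        self.act = nn.CELU()
        self.out = nn.Linear(n_hidden, n_species)  # linear output: assumption
        nn.init.kaiming_normal_(self.hidden.weight)  # He initialization
        nn.init.zeros_(self.hidden.bias)

    def forward(self, q):
        return self.out(self.act(self.hidden(q.reshape(-1, 1))))

def tail_penalty(model, q_max=6.0, n_points=32):
    """Mean squared value of the potential beyond the momentum cutoff."""
    q_tail = torch.linspace(Q_CUT, q_max, n_points)
    return model(q_tail).pow(2).mean()

def train(model, q, target, steps=200, lr=0.002, gamma=0.999, reg=0.01):
    """Adam with exponential learning-rate decay; returns loss history, last lr."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)
    history = []
    for _ in range(steps):
        optimizer.zero_grad()
        # Toy data term: direct fit to a target curve (stand-in for band loss).
        loss = (model(q) - target).pow(2).mean() + reg * tail_penalty(model)
        loss.backward()
        optimizer.step()
        scheduler.step()
        history.append(loss.item())
    return history, scheduler.get_last_lr()[0]
```

Setting `n_hidden=50` and `n_species=4` gives the layout described for the III-V model, with one output per element.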
For the III-V materials, the DeepPseudopot model was constructed with one hidden layer of 50 neurons and the CELU activation function, with four outputs corresponding to the local pseudopotentials of P, Ga, As, and In. Four accompanying SOC parameters were also included and initialized randomly. Each elemental pseudopotential (cation or anion) was shared between the two materials in which the element appears (e.g., In shared between InP and InAs), without any interpolation across systems. This design is particularly important for modeling alloyed systems. To prioritize accurate reproduction of band-edge physics, the loss function used to train the DeepPseudopot model was heavily weighted towards bands near the gap, the spin split-off bands at the Γ point, and the Γ, X, and L points in the Brillouin zone.
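The element sharing and the targeted loss weighting can be illustrated with a short NumPy sketch. The output ordering follows the text (P, Ga, As, In); the function names, the multiplicative weighting scheme, and the default weight values are assumptions, since the actual weights are not given in this section.

```python
import numpy as np

SPECIES = ("P", "Ga", "As", "In")  # network output order quoted in the text
MATERIALS = {"InAs": ("In", "As"), "InP": ("In", "P"),
             "GaAs": ("Ga", "As"), "GaP": ("Ga", "P")}

def output_indices(material):
    """Network outputs used by a material; each element serves two materials."""
    return tuple(SPECIES.index(el) for el in MATERIALS[material])

def band_weights(n_kpts, n_bands, edge_bands, key_kpts, w_edge=5.0, w_kpt=5.0):
    """Per-(k, band) weights emphasizing band-edge bands and key k-points."""
    weights = np.ones((n_kpts, n_bands))
    weights[:, list(edge_bands)] *= w_edge   # bands near the gap
    weights[list(key_kpts), :] *= w_kpt      # e.g. indices of Gamma, X, L
    return weights

def weighted_band_loss(e_pred, e_ref, weights):
    """Weighted mean squared error between predicted and reference bands."""
    diff = np.asarray(e_pred) - np.asarray(e_ref)
    return float(np.sum(weights * diff**2) / np.sum(weights))
```

Because InP and InAs both index the same In output, any update driven by one material's band structure immediately affects the other, which is what allows the trained potentials to be reused in alloys.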
Ab-initio bulk band property calculation
Reference DFT+GW data were generated for silicon and the III-V semiconductors InAs, InP, GaAs, and GaP. These data include quasiparticle band structures and hydrostatic volume deformation potentials for key interband transitions, which were used to train the DeepPseudopot model.
For the mean-field calculations of silicon, we used the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional91 with norm-conserving, scalar relativistic pseudopotentials as implemented in Quantum ESPRESSO92. Spin-orbit coupling effects were neglected, as they are known to be weak in Si. A kinetic energy cutoff of 120 Ry was used, and the Brillouin zone was sampled using a Monkhorst-Pack 8 × 8 × 8 k-point mesh in the self-consistent DFT calculation. The GW corrections were performed using the BerkeleyGW package within the single-shot G0W0 approximation93. The energy cutoff for the screened Coulomb interaction, as well as the number of bands used in the screened Coulomb and Coulomb-hole summations, were converged to ensure numerical accuracy. Additionally, we computed the deformation potentials for interband transitions Γ15v − Γ1c, Γ15v − X1c, Γ15v − L1c at the same level of DFT+GW theory using a unit cell with ± 1% isotropic deformation of the lattice constant, following Eq. (8).
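For orientation, a hydrostatic volume deformation potential can be estimated from the two strained calculations by a central finite difference. The sketch below uses the common definition a_V = dE/d ln V; Eq. (8) is not reproduced in this section, so the exact form used in the paper may differ, and the function name and argument layout are hypothetical.

```python
import math

def volume_deformation_potential(e_minus, e_plus, strain=0.01):
    """Central-difference estimate of a_V = dE / d ln V.

    e_minus, e_plus: transition energies (eV) computed at lattice constants
                     a0 * (1 - strain) and a0 * (1 + strain).
    strain: isotropic change in lattice constant (0.01 for +/- 1%).
    """
    # V scales as a^3, so ln V changes by 3 * ln(1 +/- strain).
    delta_ln_v = 3.0 * (math.log1p(strain) - math.log1p(-strain))
    return (e_plus - e_minus) / delta_ln_v
```

The same function applies to each of the interband transitions listed above, evaluated one at a time.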
For InAs, InP, GaAs, and GaP, reference GW calculations were carried out using similar procedures, with several important distinctions. Among these III-V semiconductors, heavier elements such as In and As are known to lead to significant SOC effects. To ensure consistency, fully relativistic pseudopotentials were employed for all elements in the DFT calculations. In the case of InAs, because of the small fundamental gap, a DFT calculation using the PBE functional yields a semi-metallic system. Thus, we performed a second iteration of the screened exchange summation using updated G0W0 quasiparticle energies to correctly account for state occupations94. Given that the model was trained across multiple compounds and will be applied to nano-heterostructures, consistent band alignment was essential. We statically shifted each band structure so that its VBM aligns with its experimental work function95. Deformation potentials were computed using the same procedure and included in the training dataset. Bulk alloy band properties were computed using the same procedures, applied to 8-atom zincblende conventional cells with appropriate elemental substitutions. Each alloy cell was first relaxed using DFT with the PBE functional and spin-orbit coupling included, followed by GW quasiparticle band structure calculations using the same approach described above.
Machine-learned pseudopotential calculation of nanocrystal properties
The nanocrystal structures were constructed by cutting desired geometries from bulk lattices, followed by structural relaxation using a previously parameterized Tersoff-type force field96 and surface passivation with ligand potentials33. The NC Hamiltonians were constructed using the trained real-space pseudopotentials on a finely spaced real-space spinor grid basis with 0.5 Bohr spacing, ensuring an accurate representation of the non-local and spin-orbit coupling terms via the projector formalism and the convergence of eigenvalues. The quasiparticle eigenstates near the band edges were efficiently computed using the filter diagonalization method.
Correlated electron-hole excitations (exciton states) were obtained by solving the Bethe-Salpeter equation within the static screening approximation, using the quasiparticle states obtained from the DeepPseudopot Hamiltonian as the electron-hole product basis. Size-dependent dielectric constants required for BSE calculations were estimated from bulk values using the generalized Penn model97. Oscillator strengths (OS) were calculated from the transition dipole moments between the ground and excitonic states. Phonon frequencies were obtained by diagonalizing the Hessian matrix constructed using a classical Tersoff-type force field for computational efficiency96. First-order exciton-phonon couplings were computed via numerical differentiation of the real-space pseudopotentials98. To simulate experimentally measured Stokes shifts, we subtracted the emission peak energy—calculated from exciton energies redshifted by twice the reorganization energy—from the first peak of the absorption spectrum82. More details on the methods for III-V NC construction, BSE, oscillator strength, and Stokes shift calculations can be found in previous work35,86.
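The Stokes-shift construction described above reduces to simple arithmetic once the ingredients are available. The sketch below is a literal transcription of that description; the function name is hypothetical, and which exciton energy enters the emission peak (for example, the lowest exciton state) is left to the caller because it is not specified here.

```python
def stokes_shift(absorption_peak, emitting_exciton_energy, reorganization_energy):
    """Absorption peak minus emission peak, all energies in eV.

    The emission peak is the emitting exciton energy redshifted by twice
    the reorganization energy, following the description in the text.
    """
    emission_peak = emitting_exciton_energy - 2.0 * reorganization_energy
    return absorption_peak - emission_peak
```

A larger reorganization energy or a larger splitting between the absorbing and emitting exciton states both increase the computed shift.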
Data availability
The machine-learned pseudopotential parameters, trained and reference band structure data, and nanocrystal simulation results are available on Figshare at https://doi.org/10.6084/m9.figshare.29321645.
Code availability
The DeepPseudopot package is publicly available at https://github.com/TommyLinkl/DeePseudopot.git.
References
Alivisatos, A. P. Perspectives on the physical chemistry of semiconductor nanocrystals. J. Phys. Chem. 100, 13226–13239 (1996).
Brus, L. E. A simple model for the ionization potential, electron affinity, and aqueous redox potentials of small semiconductor crystallites. J. Chem. Phys. 79, 5566 (1983).
Bawendi, M. G. et al. Electronic structure and photoexcited-carrier dynamics in nanometer-size CdSe clusters. Phys. Rev. Lett. 65, 1623 (1990).
Gómez, D. E., Califano, M. & Mulvaney, P. Optical properties of single semiconductor nanocrystals. Phys. Chem. Chem. Phys. 8, 4989 (2006).
Klimov, V. I., Mikhailovsky, A. A., McBranch, D. W., Leatherdale, C. A. & Bawendi, M. G. Quantization of Multiparticle Auger Rates in Semiconductor Quantum Dots. Science 287, 1011–1013 (2000).
Norris, D. J. & Bawendi, M. G. Measurement and assignment of the size-dependent optical spectrum in CdSe quantum dots. Phys. Rev. B 53, 16338–16346 (1996).
Kumar, M. et al. Hot exciton cooling and multiple exciton generation in PbSe quantum dots. Phys. Chem. Chem. Phys. 18, 31107–31114 (2016).
Reiss, P., Protiére, M. & Li, L. Core/Shell semiconductor nanocrystals. Small 5, 154–168 (2009).
Bailey, R. E. & Nie, S. Alloyed semiconductor quantum dots: tuning the optical properties without changing the particle size. J. Am. Chem. Soc. 125, 7100–7106 (2003).
Hensgens, T. et al. Quantum simulation of a Fermi-Hubbard model using a semiconductor quantum dot array. Nature 548, 70–73 (2017).
Babentsov, V. & Sizov, F. Defects in quantum dots of IIB-VI semiconductors. Opto-Electron. Rev. 16, 208–225 (2008).
Jain, A., Shin, Y. & Persson, K. A. Computational predictions of energy materials using density functional theory. Nat. Rev. Mater. 1, 1–13 (2016).
Lejaeghere, K. et al. Reproducibility in density functional theory calculations of solids. Science 351, aad3000 (2016).
Cohen, A. J., Mori-Sánchez, P. & Yang, W. Challenges for density functional theory. Chem. Rev. 112, 289–320 (2012).
Tran, F. & Blaha, P. Accurate band gaps of semiconductors and insulators with a semilocal exchange-correlation potential. Phys. Rev. Lett. 102, 226401 (2009).
Dudarev, S. L., Liechtenstein, A. I., Castell, M. R., Briggs, G. A. D. & Sutton, A. P. Surface states on NiO (100) and the origin of the contrast reversal in atomically resolved scanning tunneling microscope images. Phys. Rev. B 56, 4900–4908 (1997).
Anisimov, V. I., Aryasetiawan, F. & Lichtenstein, A. I. First-principles calculations of the electronic structure and spectra of strongly correlated systems: the LDA+ U method. J. Phys.: Condens. Matter 9, 767 (1997).
Hybertsen, M. S. & Louie, S. G. Electron correlation in semiconductors and insulators: Band gaps and quasiparticle energies. Phys. Rev. B 34, 5390–5413 (1986).
Govoni, M. & Galli, G. Large Scale GW Calculations. J. Chem. Theory Comput. 11, 2680–2696 (2015).
Scherpelz, P., Govoni, M., Hamada, I. & Galli, G. Implementation and validation of fully relativistic GW calculations: spin-orbit coupling in molecules, nanocrystals, and solids. J. Chem. Theory Comput. 12, 3523–3544 (2016).
Golze, D., Dvorak, M. & Rinke, P. The GW Compendium: A Practical Guide to Theoretical Photoemission Spectroscopy. Front. Chem. 7 (2019).
Rohlfing, M. & Louie, S. G. Electron-hole excitations and optical spectra from first principles. Phys. Rev. B 62, 4927–4944 (2000).
Blase, X., Duchemin, I., Jacquemin, D. & Loos, P.-F. The Bethe-Salpeter equation formalism: from physics to chemistry. J. Phys. Chem. Lett. 11, 7371–7382 (2020).
Makkar, P. & Nath-Ghosh, N. A review on the use of DFT for the prediction of the properties of nanomaterials. RSC Adv. 11, 27897–27924 (2021).
Chelikowsky, J. R. & Cohen, M. L. Nonlocal pseudopotential calculations for the electronic structure of eleven diamond and zinc-blende semiconductors. Phys. Rev. B 14, 556–582 (1976).
Cohen, M. L. Application of the pseudopotential model to solids. Annu. Rev. Mater. Res. 14, 119–144 (1984).
Wang, L.-W. & Zunger, A. Local-density-derived semiempirical pseudopotentials. Phys. Rev. B 51, 17398–17416 (1995).
Sutton, A. P., Finnis, M. W., Pettifor, D. G. & Ohta, Y. The tight-binding bond model. J. Phys. C: Solid State Phys. 21, 35 (1988).
Kwon, I., Biswas, R., Wang, C. Z., Ho, K. M. & Soukoulis, C. M. Transferable tight-binding models for silicon. Phys. Rev. B 49, 7242–7250 (1994).
Hamann, D. R. & Vanderbilt, D. Maximally localized Wannier functions for GW quasiparticles. Phys. Rev. B 79, 045109 (2009).
Gresch, D. et al. Automated construction of symmetrized Wannier-like tight-binding models from ab initio calculations. Phys. Rev. Mater. 2, 103805 (2018).
Wang, L.-W. & Zunger, A. Electronic structure pseudopotential calculations of large (≈1000 atoms) Si quantum dots. J. Phys. Chem. 98, 2158–2165 (1994).
Wang, L.-W. & Zunger, A. Pseudopotential calculations of nanoscale CdSe quantum dots. Phys. Rev. B 53, 9579–9582 (1996).
Rabani, E., Hetényi, B., Berne, B. J. & Brus, L. E. Electronic properties of CdSe nanocrystals in the absence and presence of a dielectric medium. J. Chem. Phys. 110, 5355–5369 (1999).
Jasrasaria, D., Weinberg, D., Philbin, J. P. & Rabani, E. Simulations of nonradiative processes in semiconductor nanocrystals. J. Chem. Phys. 157, 020901 (2022).
Wood, D. M. & Zunger, A. A new method for diagonalising large matrices. J. Phys. A: Math. Gen. 18, 1343 (1985).
Toledo, S. & Rabani, E. Very large electronic structure calculations using an out-of-core filter-diagonalization method. J. Comput. Phys. 180, 256–269 (2002).
Cullum, J. K. & Willoughby, R. A. Lanczos algorithms for large symmetric eigenvalue computations: Vol. I: Theory (SIAM, 2002).
Fu, H. & Zunger, A. Local-density-derived semiempirical nonlocal pseudopotentials for InP with applications to large quantum dots. Phys. Rev. B 55, 1642–1653 (1997).
Weinberg, D., Park, Y., Limmer, D. T. & Rabani, E. Size-dependent lattice symmetry breaking determines the exciton fine structure of perovskite nanocrystals. Nano Lett. 23, 4997–5003 (2023).
Coley-O’Rourke, M. J., Hou, B., Sherman, S. J., Dukovic, G. & Rabani, E. Intrinsically slow cooling of hot electrons in CdSe nanocrystals compared to CdS. Nano Lett. 25, 244–250 (2025).
Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O. & Walsh, A. Machine learning for molecular and materials science. Nature 559, 547–555 (2018).
Schmidt, J., Marques, M. R. G., Botti, S. & Marques, M. A. L. Recent advances and applications of machine learning in solid-state materials science. npj Comput. Mater. 5, 1–36 (2019).
Keith, J. A. et al. Combining machine learning and computational chemistry for predictive insights into chemical systems. Chem. Rev. 121, 9816–9872 (2021).
Choudhary, K. et al. Recent advances and applications of deep learning methods in materials science. npj Comput. Mater. 8, 1–26 (2022).
Bartók, A. P. et al. Machine learning unifies the modeling of materials and molecules. Sci. Adv. 3, e1701816 (2017).
Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A. & Müller, K.-R. SchNet - A deep learning architecture for molecules and materials. J. Chem. Phys. 148, 241722 (2018).
Musil, F. et al. Physics-inspired structural representations for molecules and materials. Chem. Rev. 121, 9759–9815 (2021).
Deringer, V. L. et al. Gaussian process regression for materials and molecules. Chem. Rev. 121, 10073–10141 (2021).
Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
Zhang, L., Han, J., Wang, H., Car, R. & E, W. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120, 143001 (2018).
Batzner, S. et al. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13, 2453 (2022).
Zeng, J. et al. DeePMD-kit v2: A software package for deep potential models. J. Chem. Phys. 159, 054801 (2023).
Batatia, I., Kovács, D. P., Simm, G. N. C., Ortner, C. & Csányi, G. MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields (2023).
Kim, D., King, D. S., Zhong, P. & Cheng, B. Learning charges and long-range interactions from energies and forces (2024).
Cheng, B. Latent Ewald summation for machine learning of long-range interactions. npj Comput. Mater. 11, 1–8 (2025).
Kulik, H. J. et al. Roadmap on Machine learning in electronic structure. Electron. Struct. 4, 023004 (2022).
Schütt, K. T., Gastegger, M., Tkatchenko, A., Müller, K.-R. & Maurer, R. J. Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions. Nat. Commun. 10, 5024 (2019).
Westermayr, J. & Maurer, R. J. Physically inspired deep learning of molecular excitations and photoemission spectra. Chem. Sci. 12, 10755–10764 (2021).
Li, H. et al. Deep-learning density functional theory Hamiltonian for efficient ab initio electronic-structure calculation. Nat. Comput Sci. 2, 367–377 (2022).
Grisafi, A. et al. Transferable machine-learning model of the electron density. ACS Cent. Sci. 5, 57–64 (2019).
Brockherde, F. et al. Bypassing the Kohn-Sham equations with machine learning. Nat. Commun. 8, 872 (2017).
Venturella, C. et al. Unified deep learning framework for many-body quantum chemistry via Green’s functions. Nat. Comput. Sci. 1–12 (2025).
Woo, J., Kim, H. & Kim, W. Y. Neural network-based pseudopotential: development of a transferable local pseudopotential. Phys. Chem. Chem. Phys. 24, 20094–20103 (2022).
Kim, R. & Son, Y.-W. Transferable empirical pseudopotentials from machine learning. Phys. Rev. B 109, 045153 (2024).
Kang, S., Kim, R., Han, S. & Son, Y.-W. Electronic structures of crystalline and amorphous GeSe and GeSbTe compounds using machine learning empirical pseudopotentials (2025).
Wang, Z. et al. Machine learning method for tight-binding Hamiltonian parameterization from ab-initio band structure. npj Comput. Mater. 7, 1–10 (2021).
Schattauer, C., Todorović, M., Ghosh, K., Rinke, P. & Libisch, F. Machine learning sparse tight-binding parameters for defects. npj Comput. Mater. 8, 1–11 (2022).
Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).
Hastie, T., Tibshirani, R. & Friedman, J. The elements of statistical learning. Springer series in statistics (Springer New York Inc., New York, NY, USA, 2001).
Weinberg, D. Expanding the Atomistic Study of the Optical and Electronic Properties of Nanomaterials. PhD Thesis, University of California, Berkeley (2023).
Wei, S.-H. & Zunger, A. Predicted band-gap pressure coefficients of all diamond and zinc-blende semiconductors: Chemical trends. Phys. Rev. B 60, 5404–5411 (1999).
Li, Y.-H., Gong, X. G. & Wei, S.-H. Ab initio all-electron calculation of absolute volume deformation potentials of IV-IV, III-V, and II-VI semiconductors: The chemical trends. Phys. Rev. B 73, 245206 (2006).
Souza, I., Marzari, N. & Vanderbilt, D. Maximally localized Wannier functions for entangled energy bands. Phys. Rev. B 65, 035109 (2001).
Fujimoto, Y., Koretsune, T., Saito, S., Miyake, T. & Oshiyama, A. A new crystalline phase of four-fold coordinated silicon and germanium. N. J. Phys. 10, 083001 (2008).
Wu, F., Jun, D., Kan, E. & Li, Z. Density functional predictions of new silicon allotropes: Electronic properties and potential applications to Li-battery anode materials. Solid State Commun. 151, 1228–1230 (2011).
Zunger, A., Wei, S.-H., Ferreira, L. G. & Bernard, J. E. Special quasirandom structures. Phys. Rev. Lett. 65, 353–356 (1990).
Wei, S.-H., Ferreira, L. G., Bernard, J. E. & Zunger, A. Electronic properties of random alloys: Special quasirandom structures. Phys. Rev. B 42, 9622–9649 (1990).
Micic, O. I., Curtis, C. J., Jones, K. M., Sprague, J. R. & Nozik, A. J. Synthesis and Characterization of InP quantum dots. J. Phys. Chem. 98, 4966–4969 (1994).
Mićić, O. I., Sprague, J., Lu, Z. & Nozik, A. J. Highly efficient band-edge emission from InP quantum dots. Appl. Phys. Lett. 68, 3150–3152 (1996).
Guzelian, A. A., Banin, U., Kadavanich, A. V., Peng, X. & Alivisatos, A. P. Colloidal chemical synthesis and characterization of InAs nanocrystal quantum dots. Appl. Phys. Lett. 69, 1432–1434 (1996).
Ondry, J. C. et al. Reductive pathways in molten inorganic salts enable colloidal synthesis of III-V semiconductor nanocrystals. Science 386, 401–407 (2024).
Nomura, S. & Kobayashi, T. Exciton-LO-phonon couplings in spherical semiconductor microcrystallites. Phys. Rev. B 45, 1305–1316 (1992).
Besombes, L., Kheng, K., Marsal, L. & Mariette, H. Acoustic phonon broadening mechanism in single quantum dot emission. Phys. Rev. B 63, 155307 (2001).
Lin, K. et al. Theory of Photoluminescence spectral line shapes of semiconductor nanocrystals. J. Phys. Chem. Lett. 14, 7241–7248 (2023).
Gupta, A. et al. Composition-defined optical properties and the direct-to-indirect transition in Core-Shell In1-xGaxP/ZnS colloidal quantum dots. J. Am. Chem. Soc. 145, 16429–16448 (2023).
Vurgaftman, I., Meyer, J. R. & Ram-Mohan, L. R. Band parameters for III-V compound semiconductors and their alloys. J. Appl. Phys. 89, 5815–5875 (2001).
Kingma, D. P. & Ba, J. Adam: A Method for Stochastic Optimization (2017).
Barron, J. T. Continuously Differentiable Exponential Linear Units (2017).
He, K., Zhang, X., Ren, S. & Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (2015).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Giannozzi, P. et al. QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys.: Condens. Matter 21, 395502 (2009).
Deslippe, J. et al. BerkeleyGW: A massively parallel computer package for the calculation of the quasiparticle and optical properties of materials and nanostructures. Comput. Phys. Commun. 183, 1269–1289 (2012).
Malone, B. D. & Cohen, M. L. Quasiparticle semiconductor band structures including spin-orbit interactions. J. Phys.: Condens. Matter 25, 105503 (2013).
Freeouf, J. L. & Woodall, J. M. Schottky barriers: An effective work function model. Appl. Phys. Lett. 39, 727–729 (1981).
Powell, D., Migliorato, M. A. & Cullis, A. G. Optimized Tersoff potential parameters for tetrahedrally bonded III-V semiconductors. Phys. Rev. B 75, 115202 (2007).
Williamson, A. J. & Zunger, A. Pseudopotential study of electron-hole excitations in colloidal free-standing InAs quantum dots. Phys. Rev. B 61, 1978–1991 (2000).
Jasrasaria, D. & Rabani, E. Interplay of surface and interior modes in exciton-phonon coupling at the nanoscale. Nano Lett. 21, 8741–8748 (2021).
Acknowledgements
We thank Professors Bingqing Cheng and David Limmer for valuable discussions. This work was supported by the National Science Foundation Division of Chemistry, under the Chemical Theory, Models and Computational Methods (CTMC) program, grant number CHE-2449564. Methods used to describe the vibronic properties of NCs were provided by the center on “Traversing the death valley separating short and long times in non-equilibrium quantum dynamical simulations of real materials”, which is funded by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research and Office of Basic Energy Sciences, Scientific Discovery through Advanced Computing (SciDAC) program, under Award No. DE-SC0022088. Measured optical properties of III-V NCs were supported by the National Science Foundation Science and Technology Center (STC) for Integration of Modern Optoelectronic Materials on Demand (IMOD) under Cooperative Agreement No. DMR-2019444. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 using NERSC award BES-ERCAP0032503.
Author information
Contributions
K.L. and E.R. conceived and designed the project and co-wrote the manuscript. K.L. developed the machine-learned semi-empirical pseudopotential model and associated codebase, performed ab initio GW calculations, trained and evaluated the model, and conducted data analyses. M.J.C. contributed to model implementation and supported manuscript writing. E.R. supervised the research and acquired funding. All authors discussed the results and contributed to manuscript revisions.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lin, K., Coley-O’Rourke, M.J. & Rabani, E. Deep-learning atomistic semi-empirical pseudopotential model for nanomaterials. npj Comput Mater 11, 381 (2025). https://doi.org/10.1038/s41524-025-01862-5
DOI: https://doi.org/10.1038/s41524-025-01862-5







