Fig. 1: General methodology.
From: Physics-informed transformers for electronic quantum states

a First, we choose an effective theory \(\hat{H}_{0}\) approximating the target Hamiltonian \(\hat{H}\), e.g., via a mean-field approximation or by taking the strong-coupling limit. We use the ground state \(\left\vert {\rm{RS}}\right\rangle\) and the excited states \(\left\vert {\boldsymbol{s}}\right\rangle\) of \(\hat{H}_{0}\) to define a physics-informed, interpretable basis for the Transformer (b) in Equation (4); as long as the dominant weight of the ground state of \(\hat{H}\) lies in the low-energy part of the spectrum \(E_{0}({\boldsymbol{s}})\) of \(\hat{H}_{0}\), this further improves the sampling efficiency and the expressivity of the ansatz. c We sample the states \({\boldsymbol{s}}\) using the batch-autoregressive sampler (refs. 57, 58, 63). The sampler is controlled by the batch size \(N_{s}\) and the number of partial unique strings \(n_{U}\), and directly produces the relative frequencies \(r({\boldsymbol{s}})\) associated with each state in a tree-structured format. Returning to b, the states \({\boldsymbol{s}}\) are then mapped to a high-dimensional representation of size \(d_{\rm{emb}}\) and passed through \(N_{\rm{dec}}\) decoder layers (ref. 26), each containing \(N_{h}\) attention heads, which produce the corresponding representations \({\boldsymbol{h}}({\boldsymbol{s}})\in {\mathbb{R}}^{d_{\rm{emb}}}\) in latent space. Supplementary Note B6 explains how these parameters are chosen. As discussed in the main text, the wavefunctions \(\psi_{\boldsymbol{\theta}}({\boldsymbol{s}})=\sqrt{q_{\boldsymbol{\theta}}({\boldsymbol{s}})}\,e^{i\phi_{\boldsymbol{\theta}}({\boldsymbol{s}})}\) can be obtained directly from these vectors. A new set of states \({\mathcal{C}}\) is then obtained according to the updated \(q_{\boldsymbol{\theta}}({\boldsymbol{s}})\), and the process is repeated until \(\{{\boldsymbol{\theta}},{\boldsymbol{\alpha}}\}\) converge according to Equation (5).
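
As a minimal illustration of two quantities named in the caption, the sketch below computes relative frequencies \(r({\boldsymbol{s}})\) from a batch of sampled states and assembles amplitudes of the form \(\sqrt{q}\,e^{i\phi}\). It is not the authors' implementation: the function names, the toy bit-string samples, and the phase values are hypothetical, and the sampled frequencies stand in for \(q_{\boldsymbol{\theta}}({\boldsymbol{s}})\), which in the paper is produced by the Transformer.

```python
import cmath
import math
from collections import Counter


def relative_frequencies(samples):
    """Return r(s) = count(s) / N_s for each unique sampled state s."""
    counts = Counter(samples)
    n_s = len(samples)  # batch size N_s
    return {s: c / n_s for s, c in counts.items()}


def amplitude(q, phi):
    """Return sqrt(q) * exp(i * phi) for a probability q and a phase phi."""
    return math.sqrt(q) * cmath.exp(1j * phi)


# Toy batch of sampled states (hypothetical bit strings, N_s = 8).
samples = ["00", "01", "00", "11", "00", "01", "00", "11"]
r = relative_frequencies(samples)

# Hypothetical phases; in the paper these come from the network.
phases = {"00": 0.0, "01": math.pi, "11": math.pi / 2}

# Assumption: use r(s) in place of q_theta(s) to build the amplitudes.
psi = {s: amplitude(r[s], phases[s]) for s in r}

# The squared moduli recover the probabilities, so they sum to 1.
norm = sum(abs(a) ** 2 for a in psi.values())
```

Because the phase factor has unit modulus, \(|\psi({\boldsymbol{s}})|^{2}\) equals the probability assigned to \({\boldsymbol{s}}\), so the amplitudes built this way are normalized whenever the probabilities are.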