Large language models (LLMs) can tackle complex mathematical and scientific reasoning tasks. The authors show that, guided by carefully designed prompts, LLMs can carry out analytical calculations in theoretical physics with high accuracy. For the derivation of Hartree-Fock equations, GPT-4 achieves an average score of 87.5 across calculation steps taken from recent research papers.
- Haining Pan
- Nayantara Mudur
- Eun-Ah Kim