A major bottleneck in applying machine learning to materials discovery is the need for high-quality materials datasets for training. Here, Itani and coauthors introduce NEMAD, the Northeast Materials Database, compiled from the scientific literature by using large language models to extract the relevant magnetic, crystallographic, and chemical data.
- Suman Itani
- Yibo Zhang
- Jiadong Zang