
The model is trained on a large corpus of peer-reviewed publications, crystallographic information files, a strategic subset of the open dataset RedPajama, and the Materials Science Community Discourse (MatSci comm.). Credit: Ahlawat, D. et al. Nature Machine Intelligence (2026).
Large language models (LLMs) are increasingly explored as tools to accelerate scientific discovery, but their effectiveness in specialised research domains remains uncertain. A new study introduces LLaMat, a family of language models designed specifically for materials science, showing that domain-adapted AI systems can outperform general-purpose models on key scientific tasks1.
Researchers at the Indian Institute of Technology Delhi and their collaborators developed LLaMat by continuing the pretraining of base language models on a curated corpus of roughly 30 billion tokens, drawn from about four million materials-science publications, crystallographic information files (CIFs), and community discussions. The models were then instruction- and task-tuned on more than 175,000 materials-science question–answer pairs along with multiple benchmark datasets.
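The two-stage recipe described above, continued pretraining on a domain corpus followed by instruction tuning, can be sketched with standard open-source tooling. The snippet below is an illustrative sketch, not the authors' code: the base checkpoint, the file `matsci_corpus.jsonl`, and all hyperparameters are assumptions chosen for clarity.

```python
# Minimal sketch of continued pretraining on a domain corpus (stage 1 of the
# recipe described in the article). Model name, data path, and hyperparameters
# are illustrative assumptions, not the authors' actual configuration.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

base = "meta-llama/Llama-2-7b-hf"  # assumed LLaMA-family base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Assumed local corpus of domain text (papers, CIFs, forum discussions),
# one JSON record per line with a "text" field.
corpus = load_dataset("json", data_files="matsci_corpus.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

corpus = corpus.map(tokenize, batched=True, remove_columns=corpus.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llamat-cpt",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=64,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=corpus,
    # Causal language modelling: predict the next token, no masking.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Stage 2 (instruction tuning) would follow the same pattern, with each
# question-answer pair formatted into a prompt/response string before tokenization.
```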
When evaluated across 42 tasks — including natural-language understanding, structured information extraction, and crystal structure generation — the models outperformed several widely used commercial LLMs (Claude, GPT and Gemini) while retaining strong general capabilities.
The researchers also uncovered an unexpected phenomenon they call “adaptation rigidity.” Models that had undergone extensive general-purpose pretraining were less able to absorb specialised domain knowledge through continued training. In other words, the most heavily trained models may become comparatively resistant to later specialisation.
The findings suggest that moderately sized, domain-specific LLMs may serve as more effective and computationally efficient AI copilots for scientific research than very large, general-purpose systems.