Large language models for reticular chemistry

Zheng, Zhiling; Rampal, Nakul; Inizan, Theo Jaffrelot; Borgs, Christian; Chayes, Jennifer T.; Yaghi, Omar M.

doi:10.1038/s41578-025-00772-8

Perspective
Published: 01 February 2025

Nature Reviews Materials volume 10, pages 369–381 (2025)

Abstract

Reticular chemistry is the science of connecting molecular building units into crystalline extended structures such as metal–organic frameworks and covalent organic frameworks. Large language models (LLMs), a type of generative artificial intelligence system, can augment laboratory research in reticular chemistry by helping scientists to extract knowledge from literature, design materials, and collect and interpret experimental data, ultimately accelerating scientific discovery. In this Perspective, we explore the concepts and methods used to apply LLMs in research, including prompt engineering, knowledge and tool augmentation, and fine-tuning. We discuss how ‘chemistry-aware’ models can be tailored to specific tasks and integrated into existing practices of reticular chemistry, transforming the traditional ‘make, characterize, use’ protocol driven by empirical knowledge into a discovery cycle based on finding synthesis–structure–property–performance relationships. Furthermore, we explore how modular LLM agents can be integrated into multi-agent laboratory systems, such as self-driving robotic laboratories, to streamline labour-intensive tasks and collaborate with chemists. We also explore how LLMs can lower the barriers to applying generative artificial intelligence and data-driven workflows to challenging research questions such as crystallization. This contribution equips both computational and experimental chemists with the insights necessary to harness LLMs for materials discovery in reticular chemistry and, more broadly, materials science.

**Fig. 1: Progress in reticular chemistry over the past three decades.**

**Fig. 2: Overview of key concepts in leveraging large language models for reticular chemistry.**

**Fig. 3: Overview of key steps in data mining from scientific literature.**

**Fig. 4: Examples of MOF synthesis parameters extracted from literature.**

**Fig. 5: Generating molecular building block structures using fine-tuned large language models.**

**Fig. 6: The roles LLMs can take in a data-driven selection process for reticular frameworks.**

Acknowledgements

The authors thank the Defense Advanced Research Projects Agency (DARPA) for the financial support under contract HR0011-21-C-0020. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.

Author information

Authors and Affiliations

Department of Chemistry, University of California — Berkeley, Berkeley, CA, USA
Zhiling Zheng (郑志凌), Nakul Rampal, Theo Jaffrelot Inizan & Omar M. Yaghi
Bakar Institute of Digital Materials for the Planet, Berkeley, CA, USA
Zhiling Zheng (郑志凌), Nakul Rampal, Theo Jaffrelot Inizan, Christian Borgs, Jennifer T. Chayes & Omar M. Yaghi
Department of Electrical Engineering and Computer Sciences, University of California — Berkeley, Berkeley, CA, USA
Zhiling Zheng (郑志凌), Nakul Rampal, Theo Jaffrelot Inizan, Christian Borgs & Jennifer T. Chayes
Department of Mathematics, University of California — Berkeley, Berkeley, CA, USA
Jennifer T. Chayes
Department of Statistics, University of California — Berkeley, Berkeley, CA, USA
Jennifer T. Chayes
School of Information, University of California — Berkeley, Berkeley, CA, USA
Jennifer T. Chayes
KACST–UC Berkeley Center of Excellence for Nanomaterials for Clean Energy Applications, King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia
Omar M. Yaghi

Contributions

O.M.Y. contributed to writing, review and editing, and supervision. Z.Z. contributed to writing, review and editing, and graphics. N.R., T.J.I., J.T.C. and C.B. contributed to review and editing. Z.Z. used ChatGPT-4 to check grammar and typos during the review and editing of this manuscript. All authors have read, corrected and verified all information presented in this manuscript.

Corresponding author

Correspondence to Omar M. Yaghi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Materials thanks Seyed Mohamad Moosavi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

API: (Application Programming Interface). A set of rules and protocols that allow different software applications to communicate and share data or commands.
CIFs: (Crystallographic Information Files). Standardized text files that record crystal structure data, including atomic coordinates and unit cell parameters, enabling consistent sharing of crystal structures.
JSON: (JavaScript Object Notation). A lightweight, text-based format used to structure, store and transfer data between systems in a human-readable manner.
LLM-in-the-loop: A workflow in which a large language model (LLM) continuously participates and provides input, just as a human expert would in a ‘human-in-the-loop’ scenario. The agent may propose actions, analyse data or suggest refinements, and then adapt its guidance based on feedback from experimental results, computational tools or human researchers.
Neural networks: Computational models inspired by the structure of the human brain, composed of layers of interconnected nodes (neurons) that process and learn patterns from data.
SMILES: (Simplified Molecular Input Line Entry System). A textual notation for representing chemical structures, allowing for easy storage, manipulation and computational handling of molecular information.
Tokens: The smallest units of text (such as words, parts of words or symbols) that a language model reads as input and generates as output.
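
As an illustration of the JSON and SMILES entries above, the following minimal Python sketch shows how a synthesis record extracted from literature might be stored as JSON and checked before reuse. The field names, the `validate_record` helper and all numeric values are hypothetical placeholders chosen for illustration; they are not taken from the article or from any published data set. The only chemical fact relied on is the SMILES string for terephthalic acid, the linker of MOF-5.

```python
import json

# Hypothetical schema: field names and types are assumptions for illustration.
REQUIRED_FIELDS = {
    "framework": str,
    "metal_source": str,
    "linker_smiles": str,
    "solvent": str,
    "temperature_c": (int, float),
    "time_h": (int, float),
}


def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"wrong type for field: {name}")
    return problems


# Placeholder record: the temperature and time are NOT reported values.
example = {
    "framework": "MOF-5",
    "metal_source": "zinc nitrate hydrate",
    "linker_smiles": "OC(=O)c1ccc(cc1)C(O)=O",  # terephthalic acid
    "solvent": "DEF",
    "temperature_c": 100,
    "time_h": 24,
}

# Round-trip through JSON text, as would happen when exchanging data
# between an LLM, a database and downstream tools.
text = json.dumps(example, indent=2)
parsed = json.loads(text)

print(validate_record(parsed))
```

A fixed schema of this kind is one simple way to catch incomplete or malformed records before they enter a data set; a real pipeline would likely add unit handling and chemical validation of the SMILES string.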

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, Z., Rampal, N., Inizan, T.J. et al. Large language models for reticular chemistry. Nat Rev Mater 10, 369–381 (2025). https://doi.org/10.1038/s41578-025-00772-8

Accepted: 31 December 2024
Published: 01 February 2025
Version of record: 01 February 2025
Issue date: May 2025
DOI: https://doi.org/10.1038/s41578-025-00772-8
