  • Perspective

Large language models for reticular chemistry

Abstract

Reticular chemistry is the science of connecting molecular building units into crystalline extended structures such as metal–organic frameworks and covalent organic frameworks. Large language models (LLMs), a type of generative artificial intelligence system, can augment laboratory research in reticular chemistry by helping scientists to extract knowledge from the literature, design materials, and collect and interpret experimental data, ultimately accelerating scientific discovery. In this Perspective, we explore the concepts and methods used to apply LLMs in research, including prompt engineering, knowledge and tool augmentation and fine-tuning. We discuss how ‘chemistry-aware’ models can be tailored to specific tasks and integrated into existing practices of reticular chemistry, transforming the traditional ‘make, characterize, use’ protocol driven by empirical knowledge into a discovery cycle based on finding synthesis–structure–property–performance relationships. Furthermore, we explore how modular LLM agents can be integrated into multi-agent laboratory systems, such as self-driving robotic laboratories, to streamline labour-intensive tasks and collaborate with chemists. We also consider how LLMs can lower the barriers to applying generative artificial intelligence and data-driven workflows to challenging research questions such as crystallization. This contribution equips both computational and experimental chemists with the insights necessary to harness LLMs for materials discovery in reticular chemistry and, more broadly, materials science.

Fig. 1: Progress in reticular chemistry over the past three decades.
Fig. 2: Overview of key concepts in leveraging large language models for reticular chemistry.
Fig. 3: Overview of key steps in data mining from scientific literature.
Fig. 4: Examples of MOF synthesis parameters extracted from literature.
Fig. 5: Generating molecular building block structures using fine-tuned large language models.
Fig. 6: The roles LLMs can take in a data-driven selection process for reticular frameworks.

Acknowledgements

The authors thank the Defense Advanced Research Projects Agency (DARPA) for the financial support under contract HR0011-21-C-0020. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA.

Author information

Contributions

O.M.Y. contributed to writing, review and editing, and supervision. Z.Z. contributed to writing, review and editing, and graphics. N.R., T.J.I., J.T.C. and C.B. contributed to review and editing. Z.Z. used ChatGPT-4 to check grammar and typos during the review and editing of this manuscript. All authors have read, corrected and verified all information presented in this manuscript.

Corresponding author

Correspondence to Omar M. Yaghi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Materials thanks Seyed Mohamad Moosavi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

API

(Application Programming Interface). A set of rules and protocols that allow different software applications to communicate and share data or commands.

CIFs

(Crystallographic Information Files). Text files in a standardized format that record crystal structure data, including atomic coordinates and unit cell parameters, enabling consistent sharing of crystal structures.
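
As a rough illustration of how such a file can be read by a program, the sketch below pulls the unit cell parameters out of a small CIF fragment. The fragment and its numbers are invented for this example and do not describe any real structure; `read_cell_parameters` is a hypothetical helper, not part of any crystallography library, and a full CIF parser would handle many more cases.

```python
import re

# Illustrative CIF fragment; the values are made up for this sketch.
EXAMPLE_CIF = """\
data_example_framework
_cell_length_a    12.5(2)
_cell_length_b    12.5(2)
_cell_length_c    20.25(3)
_cell_angle_alpha 90
_cell_angle_beta  90
_cell_angle_gamma 120
"""

def read_cell_parameters(cif_text):
    """Return the unit cell tags found in cif_text as a dict of floats."""
    cell = {}
    for line in cif_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("_cell_"):
            # Drop a trailing standard uncertainty such as "(2)".
            value = re.sub(r"\(\d+\)$", "", parts[1])
            cell[parts[0]] = float(value)
    return cell

cell = read_cell_parameters(EXAMPLE_CIF)
print(cell["_cell_length_a"], cell["_cell_angle_gamma"])
```

The tag names (`_cell_length_a` and so on) follow the usual CIF naming for cell lengths and angles; everything else here is an assumption made for illustration.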

JSON

(JavaScript Object Notation). A lightweight, text-based format used to structure, store and transfer data between systems in a human-readable manner.
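
A minimal sketch of what this looks like in practice: the snippet below stores a set of synthesis parameters as JSON text and reads it back. The record, its field names and its values are hypothetical and chosen only for illustration; they are not taken from the article or from any dataset.

```python
import json

# Hypothetical synthesis record; field names and values are illustrative only.
record = {
    "compound": "example-MOF",
    "metal_source": "zinc nitrate hexahydrate",
    "linker": "terephthalic acid",
    "solvent": "DMF",
    "temperature_C": 100,
    "time_h": 24,
}

text = json.dumps(record, indent=2)  # serialize the dict to human-readable text
restored = json.loads(text)          # parse the text back into a dict
print(text)
```

Because the format is plain text with named fields, the same record can be passed between an extraction step, a database and a downstream model without custom parsing.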

LLM-in-the-loop

A workflow in which a large language model (LLM) continuously participates and provides input, just as a human expert would in a ‘human-in-the-loop’ scenario. The LLM agent may propose actions, analyse data or suggest refinements, and then adapt its guidance based on feedback from experimental results, computational tools or human researchers.
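
The propose, measure and adapt cycle described here can be sketched as a simple loop. In the sketch below both the "LLM" and the "experiment" are toy stand-ins: `propose_temperature` and `run_experiment` are hypothetical functions with made-up rules, used only to show the control flow, and do not represent the authors' systems or any real model API.

```python
def propose_temperature(history):
    """Stand-in for an LLM call: suggest the next temperature to try."""
    if not history:
        return 80
    last_temp, last_score = history[-1]
    # Raise the temperature while the score is below target;
    # a real agent would reason over text and richer feedback.
    return last_temp + 20 if last_score < 0.9 else last_temp

def run_experiment(temperature):
    """Stand-in for a laboratory measurement (toy scoring function)."""
    return max(0.0, 1.0 - abs(temperature - 120) / 100)

def llm_in_the_loop(max_rounds=5):
    """Alternate proposals and feedback until the score is good enough."""
    history = []
    for _ in range(max_rounds):
        temp = propose_temperature(history)
        score = run_experiment(temp)
        history.append((temp, score))  # feedback the next proposal can use
        if score >= 0.9:
            break
    return history

history = llm_in_the_loop()
print(history)
```

The point of the design is that each proposal sees the full history of earlier results, which is where a human researcher or a computational tool could also inject feedback.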

Neural networks

Computational models inspired by the structure of the human brain, composed of layers of interconnected nodes (neurons) that process and learn patterns from data.

SMILES

(Simplified Molecular Input Line Entry System). A textual notation for representing chemical structures, allowing for easy storage, manipulation and computational handling of molecular information.

Tokens

The smallest units of text (such as words, parts of words or symbols) that a language model processes and generates during text analysis and processing.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, Z., Rampal, N., Inizan, T.J. et al. Large language models for reticular chemistry. Nat Rev Mater 10, 369–381 (2025). https://doi.org/10.1038/s41578-025-00772-8

  • DOI: https://doi.org/10.1038/s41578-025-00772-8
