Abstract
Generative molecular design for drug discovery has recently achieved a wave of experimental validation. Language models operating on string-based representations of molecules are amongst the most successful architectures. The most important factor for downstream success is whether an in silico oracle (computational predictor of a molecule property) is well correlated with the desired end point (such as binding affinity). To this end, current methods use cheaper proxy oracles with a higher throughput before evaluating the most promising subset with high-fidelity oracles. The ability to directly generate molecules with optimal properties as predicted by high-fidelity oracles (computationally expensive simulations with greater predictive accuracy) could greatly enhance generative design and improve hit rates. However, current models are not efficient enough to consider such a prospect, exemplifying the sample efficiency problem. Recently, the Mamba architecture has been proposed as an alternative to transformers, which are widely used in large language models. Existing works have validated Mamba’s performance on tasks spanning natural language completion to biology foundation models. In this work, we introduce a framework called Saturn, which demonstrates the application of the Mamba architecture for generative molecular design. Here we elucidate how experience replay with data augmentation improves the sample efficiency and how Mamba intensifies the effect of this mechanism. Next, we show that Mamba with experience replay outperforms 16 models on multiparameter optimization tasks relevant to drug discovery and possesses sufficient sample efficiency to directly optimize density functional theory simulations as a high-fidelity oracle.
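The experience replay with data augmentation mentioned above can be illustrated with a minimal sketch. This is not the Saturn implementation: the class ReplayBuffer, its methods and the toy_augment function are hypothetical names introduced only for illustration. The sketch assumes that the buffer retains the highest-reward SMILES strings and that each replay round re-expresses those molecules as alternative strings while reusing their stored rewards, so that no additional oracle calls are spent. In practice the augmentation would be SMILES randomization (for example, RDKit's Chem.MolToSmiles with doRandom=True); here a stand-in callable keeps the example dependency-free.

```python
from typing import Callable, Dict, List, Tuple


class ReplayBuffer:
    """Hypothetical buffer that keeps the highest-reward SMILES seen so far."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._rewards: Dict[str, float] = {}

    def add(self, smiles: str, reward: float) -> None:
        # Keep the best reward observed for each unique string.
        if reward > self._rewards.get(smiles, float("-inf")):
            self._rewards[smiles] = reward
        # Evict the lowest-reward entry once capacity is exceeded.
        if len(self._rewards) > self.capacity:
            worst = min(self._rewards, key=self._rewards.get)
            del self._rewards[worst]

    def top(self) -> List[Tuple[str, float]]:
        # Buffered molecules sorted from highest to lowest reward.
        return sorted(self._rewards.items(), key=lambda kv: kv[1], reverse=True)

    def replay(
        self, augment: Callable[[str], str], rounds: int
    ) -> List[List[Tuple[str, float]]]:
        # Each round pairs an augmented string with the stored reward,
        # so the oracle is not queried again for replayed molecules.
        return [[(augment(s), r) for s, r in self.top()] for _ in range(rounds)]


def toy_augment(smiles: str) -> str:
    # Assumption for illustration only: reversing the string gives an
    # equivalent SMILES solely for unbranched, ring-free chains of
    # single-letter atoms (e.g. "CCO" and "OCC" both denote ethanol).
    return smiles[::-1]


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=2)
    for smiles, reward in [("CCO", 0.9), ("CCN", 0.5), ("CCC", 0.1)]:
        buffer.add(smiles, reward)
    for batch in buffer.replay(toy_augment, rounds=2):
        print(batch)
```

Each replayed batch would then feed an additional model update. How those updates are weighted, and how Mamba interacts with them, is described in the full article rather than in this sketch.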
Data availability
The datasets used for pretraining are ChEMBL 33 (ref. 50) and ZINC 250k (ref. 52), which can be freely downloaded from https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_33/ and https://github.com/SeulLee05/GEAM/tree/main/data, respectively. The instructions for processing these datasets are provided in Supplementary Section 1.6. Additionally, the pretrained models are available via GitHub at https://github.com/schwallergroup/saturn/tree/master/experimental_reproduction/checkpoint_models. The archived codebase version used in this work is available via Figshare at https://doi.org/10.6084/m9.figshare.30968380 (refs. 76,77,78,79). Source data are provided with this paper.
Code availability
The prepared files and instructions to reproduce the experiments are available via GitHub at https://github.com/schwallergroup/saturn/tree/master/experimental_reproduction. The archived codebase version used in this work is available via Figshare at https://doi.org/10.6084/m9.figshare.30968380 (ref. 79).
References
Du, Y. et al. Machine learning-aided generative molecular design. Nat. Mach. Intell. 6, 589–604 (2024).
Guo, J. et al. DockStream: a docking wrapper to enhance de novo molecular design. J. Cheminform. 13, 73 (2021).
Yang, S. et al. Hit and lead discovery with explorative RL and fragment-based molecule generation. In Proc. 35th Conference on Neural Information Processing Systems (eds. Ranzato, M. et al.) 7924–7936 (Curran Associates, 2021).
Lee, S. et al. Exploring chemical space with score-based out-of-distribution generation. In Proc. International Conference on Machine Learning (eds. Krause, A. et al.) 17603–17617 (PMLR, 2023).
Lee, S. et al. Drug discovery with dynamic goal-aware fragments. In Proc. International Conference on Machine Learning (eds. Salakhutdinov, R. et al.) 23269–23282 (PMLR, 2024).
Crivelli-Decker, J. E. et al. Machine learning guided AQFEP: a fast and efficient absolute free energy perturbation solution for virtual screening. J. Chem. Theory Comput. 20, 7188–7198 (2024).
Wang, L. et al. in Biomolecular Simulations: Methods and Protocols 201–232 (Springer, 2019).
Eckmann, P. et al. MF-LAL: drug compound generation using multi-fidelity latent space active learning. In Proc. International Conference on Machine Learning (eds. Singh, A. et al.) 14972–14988 (PMLR, 2025).
Neves, B. J. et al. QSAR-based virtual screening: advances and applications in drug discovery. Front. Pharmacol. 9, 1088 (2018).
Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Bjerrum, E. J. & Threlfall, R. Molecular generation with recurrent neural networks (RNNs). Preprint at http://arxiv.org/abs/1705.04612 (2017).
Olivecrona, M. et al. Molecular de novo design through deep reinforcement learning. J. Cheminform. 9, 48 (2017).
Segler, M. H. et al. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4, 120–131 (2017).
Popova, M. et al. Deep reinforcement learning for de novo drug design. Sci. Adv. 4, eaap7885 (2018).
Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems (eds. Guyon, I. et al.) 5998–6008 (Curran Associates, 2017).
Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Technical Report (2019).
Bagal, V. et al. MolGPT: molecular generation using a transformer-decoder model. J. Chem. Inf. Model. 62, 2064–2076 (2021).
He, J. et al. Evaluation of reinforcement learning in transformer-based molecular design. J. Cheminform. 16, 18 (2024).
Born, J. & Manica, M. Regression transformer enables concurrent sequence regression and generation for molecular language modelling. Nat. Mach. Intell. 5, 432–444 (2023).
Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. Preprint at http://arxiv.org/abs/1312.6114 (2013).
Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
Jin, W. et al. Junction tree variational autoencoder for molecular graph generation. In Proc. International Conference on Machine Learning (eds. Dy, J. & Krause, A.) 2323–2332 (PMLR, 2018).
Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37, 1038–1040 (2019).
Goodfellow, I. et al. Generative adversarial nets. In Proc. Advances in Neural Information Processing Systems (eds. Ghahramani, Z. et al.) 2672–2680 (Curran Associates, 2014).
De Cao, N. & Kipf, T. MolGAN: an implicit generative model for small molecular graphs. In Proc. International Conference on Machine Learning Workshop on Theoretical Foundations and Applications of Deep Generative Models (PMLR, 2018).
Jin, W. et al. Multi-objective molecule generation using interpretable substructures. In Proc. International Conference on Machine Learning (eds. Daumé III, H. & Singh, A.) 4849–4859 (PMLR, 2020).
Mercado, R. et al. Graph networks for molecular design. Mach. Learn. Sci. Technol. 2, 025023 (2021).
Bengio, E. et al. Flow network based generative models for non-iterative diverse candidate generation. In Proc. Advances in Neural Information Processing Systems (eds. Ranzato, M. et al.) 27304–27317 (Curran Associates, 2021).
Jensen, J. H. A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space. Chem. Sci. 10, 3567–3572 (2019).
Igashov, I. et al. Equivariant 3D-conditional diffusion model for molecular linker design. Nat. Mach. Intell. 6, 417–427 (2024).
Schneuing, A. et al. Structure-based drug design with equivariant diffusion models. Nat. Comput. Sci. 4, 899–909 (2024).
Jiang, Y. et al. PocketFlow is a data-and-knowledge-driven structure-based molecular generative model. Nat. Mach. Intell. 6, 326–337 (2024).
Njirjak, M. et al. Reshaping the discovery of self-assembling peptides with generative AI guided by hybrid deep learning. Nat. Mach. Intell. 6, 1487–1500 (2024).
Gao, W. et al. Sample efficiency matters: a benchmark for practical molecular optimization. In Proc. Advances in Neural Information Processing Systems Datasets and Benchmarks Track (eds. Koyejo, S. et al.) 21342–21357 (Curran Associates, 2022).
Polykovskiy, D. et al. Molecular sets (MOSES): a benchmarking platform for molecular generation models. Front. Pharmacol. 11, 565644 (2020).
Brown, N. et al. GuacaMol: benchmarking models for de novo molecular design. J. Chem. Inf. Model. 59, 1096–1108 (2019).
Krenn, M. et al. Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach. Learn. Sci. Technol. 1, 045024 (2020).
Özçelik, R. et al. Chemical language modeling with structured state space sequence models. Nat. Commun. 15, 2193 (2024).
Bjerrum, E. J. SMILES enumeration as data augmentation for neural network modeling of molecules. Preprint at http://arxiv.org/abs/1703.07076 (2017).
Arús-Pous, J. et al. Randomized SMILES strings improve the quality of molecular generative models. J. Cheminform. 11, 71 (2019).
Moret, M. et al. Generative molecular design in low data regimes. Nat. Mach. Intell. 2, 171–180 (2020).
Skinnider, M. A. et al. Chemical language models enable navigation in sparsely populated chemical space. Nat. Mach. Intell. 3, 759–770 (2021).
Bjerrum, E. J. et al. Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES. J. Comput.-Aided Mol. Des. 37, 373–394 (2023).
Guo, J. & Schwaller, P. Augmented Memory: sample-efficient generative molecular design with reinforcement learning. JACS Au 4, 2160–2172 (2024).
Ballarotto, M. et al. De novo design of Nurr1 agonists via fragment-augmented generative deep learning in low-data regime. J. Med. Chem. 66, 8170–8177 (2023).
Guo, J. & Schwaller, P. Beam enumeration: probabilistic explainability for sample efficient self-conditioned molecular design. In Proc. International Conference on Learning Representations (2024).
Lin, L.-J. Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach. Learn. 8, 293–321 (1992).
Gu, A. & Dao, T. Mamba: linear-time sequence modeling with selective state spaces. In Proc. Conference on Language Modeling (2024).
Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012).
McInnes, L. et al. UMAP: uniform manifold approximation and projection for dimension reduction. J. Open Source Softw. 3, 861 (2018).
Sterling, T. & Irwin, J. J. ZINC 15—ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Alhossary, A. et al. Fast, accurate, and reliable molecular docking with QuickVina 2. Bioinformatics 31, 2214–2216 (2015).
Ertl, P. & Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 1, 8 (2009).
Bickerton, G. R. et al. Quantifying the chemical beauty of drugs. Nat. Chem. 4, 90–98 (2012).
Benhenda, M. ChemGAN challenge for drug discovery: can AI reproduce natural chemical diversity? Preprint at http://arxiv.org/abs/1708.08227 (2017).
Xie, Y. et al. How much space has been explored? Measuring the chemical space covered by databases and machine-generated molecules. In Proc. International Conference on Learning Representations (2023).
Guo, J. et al. Improving de novo molecular design with curriculum learning. Nat. Mach. Intell. 4, 555–563 (2022).
Yuan, Q. et al. Molecular generation targeting desired electronic properties via deep generative models. Nanoscale 12, 6744–6758 (2020).
Zhang, Z. et al. An equivariant generative framework for molecular graph-structure co-design. Chem. Sci. 14, 8380–8392 (2023).
Wang, C. et al. Recent developments and applications of the MMPBSA method. Front. Mol. Biosci. 4, 87 (2017).
Loeffler, H. et al. Optimal molecular design: generative active learning combining REINVENT with absolute binding free energy simulations. J. Chem. Theory Comput. 20, 308–328 (2024).
Medcalf, M. et al. Overcoming DMTA cycle challenges: a unified AI-driven system for efficient drug design. Preprint at ChemRxiv https://doi.org/10.26434/chemrxiv-2024-0z7g6-v2 (2025).
Guo, J. & Schwaller, P. Directly optimizing for synthesizability in generative molecular design using retrosynthesis models. Chem. Sci. 16, 6943–6956 (2025).
Guo, J. et al. Generative molecular design with steerable and granular synthesizability control. Preprint at http://arxiv.org/abs/2505.08774 (2025).
Van Tilborg, D. et al. Exposing the limitations of molecular machine learning with activity cliffs. J. Chem. Inf. Model. 62, 5938–5951 (2022).
Nigam, A., Pollice, R. & Aspuru-Guzik, A. Parallel tempered genetic algorithm guided by deep neural networks for inverse molecular design. Digit. Discov. 1, 390–404 (2022).
Thomas, M. et al. Test-time training scaling laws for chemical exploration in drug design. J. Chem. Inf. Model. 65, 13178–13186 (2025).
Chen, J. Mamba no. 5 (A Little Bit Of…). Sparse Notes https://jameschen.io/jekyll/update/2024/02/12/mamba.html
Gu, A. et al. Combining recurrent, convolutional, and continuous-time models with linear state space layers. In Proc. Advances in Neural Information Processing Systems (eds. Ranzato, M. et al.) 572–585 (Curran Associates, 2021).
Gu, A. et al. Efficiently modeling long sequences with structured state spaces. In Proc. International Conference on Learning Representations (2022).
Fialková, V. et al. LibINVENT: reaction-based generative scaffold decoration for in silico library design. J. Chem. Inf. Model. 62, 2046–2063 (2021).
Blaschke, T. et al. Memory-assisted reinforcement learning for diverse molecular de novo design. J. Cheminform. 12, 68 (2020).
Bemis, G. W. & Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 39, 2887–2893 (1996).
RDKit: open-source cheminformatics; https://www.rdkit.org
Nigam, A. et al. Augmenting genetic algorithms with deep neural networks for exploring the chemical space. In Proc. International Conference on Learning Representations (2020).
Xie, Y. et al. Mars: Markov molecular sampling for multi-objective drug discovery. In Proc. International Conference on Learning Representations (2021).
Kong, X. et al. Molecule generation by principal subgraph mining and assembling. In Proc. Advances in Neural Information Processing Systems (eds. Koyejo, S. et al.) 2550–2563 (Curran Associates, 2022).
Guo, J. Saturn codebase (as used in paper). Figshare https://doi.org/10.6084/m9.figshare.30968380 (2025).
Shi, C. et al. GraphAF: a flow-based autoregressive model for molecular graph generation. In Proc. International Conference on Learning Representations (2020).
Jeon, W. & Kim, D. Autonomous molecule generation using reinforcement learning and docking to develop potential novel inhibitors. Sci. Rep. 10, 22146 (2020).
Jin, W. et al. Hierarchical generation of molecular graphs using structural motifs. In Proc. International Conference on Machine Learning (eds. Daumé III, H. & Singh, A.) 4833–4842 (PMLR, 2020).
Luo, Y. et al. GraphDF: a discrete flow model for molecular graph generation. In Proc. International Conference on Machine Learning (eds. Meila, M. & Zhang, T.) 7192–7202 (PMLR, 2021).
Eckmann, P. et al. LIMO: latent inceptionism for targeted molecule generation. In Proc. International Conference on Machine Learning (eds. Chaudhuri, K. et al.) 2863–2873 (PMLR, 2022).
Jo, J. et al. Score-based generative modeling of graphs via the system of stochastic differential equations. In Proc. International Conference on Machine Learning (eds. Chaudhuri, K. et al.) 10360–10372 (PMLR, 2022).
Kim, H. et al. Genetic-guided GFlowNets for sample efficient molecular optimization. In Proc. Advances in Neural Information Processing Systems (eds. Globerson, A. et al.) 42618–42648 (Curran Associates, 2024).
Acknowledgements
J.G. (PGSD-521528389) and A.G.X.-C. (PGSD3-559278-2021) are supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). This publication was created as part of NCCR Catalysis (grant number 225147), a National Centre of Competence in Research funded by the Swiss National Science Foundation.
Author information
Contributions
J.G. and P.S. proposed the project. J.G. wrote the Saturn framework code, performed and analysed the experiments, and wrote the manuscript. J.C. wrote the code to run DFT and helped design the DFT experiment. A.G.X.-C. helped with the technical design and interpretation of the experiments elucidating the mechanism of Augmented Memory. P.S. provided feedback and supervised the project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Karl Grantham, Ramil Nugmanov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Sections 1–6, Figs. 1–6 and Tables 1–31.
Source data
Source Data Fig. 2
Values shown in Fig. 2a,b.
Source Data Fig. 3
Raw reward values for the three methods shown in Fig. 3: Saturn (framework), GEAM and ZINC 250k sampling. Values provided for all five docking protein targets.
Source Data Fig. 4
Raw data for the three fractions used in Fig. 4.
Source Data Table 1
Raw hit ratios for the models we ran: Augmented Memory, GEAM, Saturn (framework) and genetic generative flow networks.
Source Data Fig. 6
Raw GEAM and Saturn (framework) values for all metrics in the table.
Source Data Extended Data Fig./Table 1
Values in the table.
Source Data Extended Data Fig./Table 2
Raw novel hit ratio values for the models we ran: GEAM, Saturn (framework) and Saturn–Tanimoto (framework).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Guo, J., Chen, J., GX-Chen, A. et al. Sample-efficient generative molecular design using memory manipulation. Nat Mach Intell 8, 449–460 (2026). https://doi.org/10.1038/s42256-026-01200-4
DOI: https://doi.org/10.1038/s42256-026-01200-4


