Abstract
The pre-trained knowledge compressed in large language models is being applied to diverse scientific challenges and is catalysing the progression of autonomous laboratory systems operated in concert with liquid-handling robots. Here we introduce PrimeGen, an orchestrated multi-agent system powered by large language models and designed to streamline labour-intensive primer design tasks for targeted next-generation sequencing. PrimeGen uses GPT-4o as a central controller that engages with experimentalists for task planning and decomposition and coordinates specialized agents to execute distinct subtasks. These include an interactive search agent for retrieving gene targets from databases, a primer agent for designing primer sequences across multiple scenarios, a protocol agent for generating executable robot scripts through retrieval-augmented generation and prompt engineering, and an experiment agent equipped with a vision-language model for detecting and reporting anomalies. We experimentally demonstrate the effectiveness of PrimeGen across a variety of applications. PrimeGen can accommodate up to 955 amplicons while ensuring high amplification uniformity and minimizing dimer formation. Our development underscores the potential of collaborative agents, coordinated by generalist foundation models, as intelligent tools for advancing biomedical research.
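For readers unfamiliar with this orchestration pattern, the minimal sketch below illustrates how a central controller might decompose a user request and route subtasks to specialized agents. It is an illustration only: the class names, the hard-coded plan and the dispatch logic are assumptions made for clarity, not PrimeGen's actual implementation.

```python
# Illustrative sketch of a controller-agent loop in the spirit of the architecture
# described above. All names and the hard-coded plan are hypothetical.
from dataclasses import dataclass


@dataclass
class Agent:
    """A specialized agent responsible for one class of subtask."""
    name: str

    def run(self, subtask: str) -> str:
        # A real agent would call an LLM, a database API or a robot driver here.
        return f"[{self.name}] completed: {subtask}"


@dataclass
class Controller:
    """Central controller that plans a request and dispatches subtasks to agents."""
    agents: dict

    def plan(self, request: str) -> list:
        # A real controller would prompt an LLM (for example, GPT-4o) to decompose
        # the request; the plan below is hard-coded purely for illustration.
        return [
            ("search", f"retrieve gene targets for: {request}"),
            ("primer", "design a multiplex primer panel for the retrieved targets"),
            ("protocol", "generate a liquid-handling robot script for the panel"),
            ("experiment", "monitor the run and report anomalies"),
        ]

    def execute(self, request: str) -> list:
        return [self.agents[name].run(subtask) for name, subtask in self.plan(request)]


if __name__ == "__main__":
    controller = Controller(agents={n: Agent(n) for n in ("search", "primer", "protocol", "experiment")})
    for report in controller.execute("drug-resistance mutations in Mycobacterium tuberculosis"):
        print(report)
```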
Data availability
Publicly available datasets were used in this study. Global health data were obtained from the World Health Organization (https://apps.who.int/iris/bitstream/handle/10665/341906/WHO-UCN-GTB-PCI-2021.7-eng.xlsx). The dataset comprises clinically annotated genetic variants in VCF format, and was systematically retrieved from the ClinVar database (https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/) and curated (ref. 95). UniProt datasets were dynamically accessed via the UniProt RESTful API based on search criteria such as gene or protein names (query example, http://rest.uniprot.org/uniprotkb/search?query=(gene:{Gene_Names})&format=json). Genome sequences were searched and screened through the NCBI Genome database (https://ftp.ncbi.nlm.nih.gov/genomes/). Species classification information was sourced from the NCBI Taxonomy database (https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/) and curated (ref. 96). Certain datasets, such as those from the Comprehensive Antibiotic Resistance Database (CARD), were obtained upon request (https://card.mcmaster.ca/download). Data from OMIM (https://www.omim.org) and COSMIC (https://cancer.sanger.ac.uk/cosmic) require independent access requests and are not publicly redistributable. The GRCh38 (hg38) reference genome was used for expanded carrier screening (ECS) panel design, with exon regions annotated based on GENCODE v46. Nucleotide sequences for SARS-CoV-2 were retrieved from the NCBI Virus SARS-CoV-2 Data Hub (https://www.ncbi.nlm.nih.gov/activ). Multiple sequence alignment of 20,000 SARS-CoV-2 genomes was performed using MAFFT, with NC_045512.2 (Wuhan-Hu-1 isolate) as the reference genome. Source data supporting the findings of this study are provided with the paper (ref. 97).
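As a concrete illustration of the dynamic UniProt access described above, the short sketch below queries the UniProt RESTful API using the documented query pattern. The function name, the example gene and the use of the requests library are assumptions for illustration; this is not code taken from PrimeGen.

```python
# Minimal sketch of the UniProt REST query pattern described above.
# The endpoint and query syntax follow the example URL given in the text.
import requests


def fetch_uniprot_entries(gene_name: str, timeout: float = 30.0) -> dict:
    """Query the UniProt REST API for entries matching a gene name and return parsed JSON."""
    url = "https://rest.uniprot.org/uniprotkb/search"
    params = {"query": f"(gene:{gene_name})", "format": "json"}
    response = requests.get(url, params=params, timeout=timeout)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    entries = fetch_uniprot_entries("TP53")  # example gene name, chosen for illustration
    print(f"Retrieved {len(entries.get('results', []))} UniProt entries")
```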
Owing to regulatory and ethical considerations, access to certain restricted datasets may require specific approvals. The main findings of this study can be replicated using the publicly available datasets listed above. Reviewers were provided controlled access to restricted datasets for validation purposes. For further information regarding restricted data access, readers are advised to contact the relevant data repositories directly. Source data are provided with this paper.
Code availability
PrimeGen is written in Python and packaged as a Docker container. The source code is available at https://github.com/melobio/PrimeGen under the GPLv3 license, and the GitHub repository is archived with a DOI on Zenodo (ref. 98).
References
Merchant, A. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023).
Bennett, J. A. et al. Autonomous reaction Pareto-front mapping with a self-driving catalysis laboratory. Nat. Chem. Eng. 1, 240–250 (2024).
Slattery, A. et al. Automated self-optimization, intensification, and scale-up of photocatalysis in flow. Science 383, eadj1817 (2024).
Bryant, J. A. Jr, Kellinger, M., Longmire, C., Miller, R. & Wright, R. C. AssemblyTron: flexible automation of DNA assembly with Opentrons OT-2 lab robots. Synth. Biol. 8, ysac032 (2023).
Volk, A. A. et al. AlphaFlow: autonomous discovery and optimization of multi-step chemistry using a self-driven fluidic lab guided by reinforcement learning. Nat. Commun. 14, 1403 (2023).
Wierenga, R. P., Golas, S. M., Ho, W., Coley, C. W. & Esvelt, K. M. PyLabRobot: an open-source, hardware-agnostic interface for liquid-handling robots and accessories. Device 1, 100111 (2023).
Liu, L., Huang, Y. & Wang, H. H. Fast and efficient template-mediated synthesis of genetic variants. Nat. Methods 20, 841–848 (2023).
Huang, Y. et al. High-throughput microbial culturomics using automation and machine learning. Nat. Biotechnol. 41, 1424–1433 (2023).
Dama, A. C. et al. BacterAI maps microbial metabolism without prior knowledge. Nat. Microbiol. 8, 1018–1025 (2023).
Vemprala, S. H., Bonatti, R., Bucker, A. & Kapoor, A. ChatGPT for robotics: design principles and model abilities. IEEE Access 12, 55682–55696 (2024).
Achiam, J. et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).
Yao, S. et al. ReAct: synergizing reasoning and acting in language models. In Eleventh International Conference on Learning Representations (ICLR, 2023).
Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst. 35, 24824–24837 (2022).
Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).
Romera-Paredes, B. et al. Mathematical discoveries from program search with large language models. Nature 625, 468–475 (2024).
Yang, C. et al. Large language models as optimizers. In Twelfth International Conference on Learning Representations (ICLR, 2024).
Lewis, P. et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. Adv. Neural Inf. Process. Syst. 33, 9459–9474 (2020).
Xi, Z. et al. The rise and potential of large language model based agents: a survey. Sci. China Inf. Sci. 68, 121101 (2025).
Zhou, W. et al. Agents: an open-source framework for autonomous language agents. In Twelfth International Conference on Learning Representations (ICLR, 2024).
Shen, Y. et al. HuggingGPT: solving AI tasks with ChatGPT and its friends in Hugging Face. Adv. Neural Inf. Process. Syst. 36, 38154–38180 (2023).
Qian, C. et al. Communicative agents for software development. Preprint at https://arxiv.org/abs/2307.07924 (2023).
Wu, Q. et al. AutoGen: enabling next-gen LLM applications via multi-agent conversations. In First Conference on Language Modeling (COLM, 2024).
Ghafarollahi, A. & Buehler, M. J. SciAgents: automating scientific discovery through bioinspired multi-agent intelligent graph reasoning. Adv. Mater. 37, 2413523 (2025).
Yang, Z. et al. MM-REACT: prompting ChatGPT for multimodal reasoning and action. Preprint at https://arxiv.org/abs/2303.11381 (2023).
Patil, S. G., Zhang, T., Wang, X. & Gonzalez, J. E. Gorilla: large language model connected with massive APIs. Adv. Neural Inf. Process. Syst. 37, 126544–126565 (2024).
Messeri, L. & Crockett, M. J. Artificial intelligence and illusions of understanding in scientific research. Nature 627, 49–58 (2024).
Krenn, M. et al. On scientific understanding with artificial intelligence. Nat. Rev. Phys. 4, 761–769 (2022).
Boiko, D. A., MacKnight, R., Kline, B. & Gomes, G. Autonomous chemical research with large language models. Nature 624, 570–578 (2023).
Bran, A. M. et al. Augmenting large language models with chemistry tools. Nat. Mach. Intell. 6, 525–535 (2024).
Darvish, K. et al. ORGANA: a robotic assistant for automated chemistry experimentation and characterization. Matter 8, 101897 (2025).
Dai, T. et al. Autonomous mobile robots for exploratory synthetic chemistry. Nature 635, 890–897 (2024).
Gao, S. et al. Empowering biomedical discovery with AI agents. Cell 187, 6125–6151 (2024).
Xiao, M. et al. Multiple approaches for massively parallel sequencing of SARS-CoV-2 genomes directly from clinical samples. Genome Med. 12, 57 (2020).
Kunasol, C. et al. Comparative analysis of targeted next-generation sequencing for Plasmodium falciparum drug resistance markers. Sci. Rep. 12, 5563 (2022).
Nozawa, A. et al. Comprehensive targeted next-generation sequencing in patients with slow-flow vascular malformations. J. Hum. Genet. 67, 721–728 (2022).
Rawat, A. et al. Utility of targeted next generation sequencing for inborn errors of immunity at a tertiary care centre in North India. Sci. Rep. 12, 10416 (2022).
Jan, Y.-H. et al. Comprehensive assessment of actionable genomic alterations in primary colorectal carcinoma using targeted next-generation sequencing. Br. J. Cancer 127, 1304–1311 (2022).
Xie, N. G. et al. Designing highly multiplex PCR primer sets with simulated annealing design using dimer likelihood estimation (SADDLE). Nat. Commun. 13, 1881 (2022).
Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K. & Yao, S. Reflexion: language agents with verbal reinforcement learning. Adv. Neural Inf. Process. Syst. 36, 8634–8652 (2023).
Madaan, A. et al. Self-refine: iterative refinement with self-feedback. Adv. Neural Inf. Process. Syst. 36, 46534–46594 (2023).
Szymanski, N. J. et al. An autonomous laboratory for the accelerated synthesis of novel materials. Nature 624, 86–91 (2023).
Amberger, J. S., Bocchini, C. A., Schiettecatte, F., Scott, A. F. & Hamosh, A. OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 43, D789–D798 (2015).
Sondka, Z. et al. COSMIC: a curated database of somatic variants and clinical data for cancer. Nucleic Acids Res. 52, D1210–D1217 (2024).
Landrum, M. J. et al. ClinVar: improvements to accessing data. Nucleic Acids Res. 48, D835–D844 (2020).
UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 43, D204–D212 (2015).
Federhen, S. The NCBI taxonomy database. Nucleic Acids Res. 40, D136–D143 (2012).
The WHO Global Tuberculosis Report 2022 (WHO, 2022); https://www.who.int/teams/global-tuberculosis-programme/tb-reports/global-tuberculosis-report-2022
McArthur, A. G. et al. The comprehensive antibiotic resistance database. Antimicrob. Agents Chemother. 57, 3348–3357 (2013).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 1–9 (2009).
Wang, M. X. et al. Olivar: towards automated variant aware primer design for multiplex tiled amplicon sequencing of pathogens. Nat. Commun. 15, 6306 (2024).
Xia, H. et al. MultiPrime: a reliable and efficient tool for targeted next-generation sequencing. iMeta 2, e143 (2023).
Wang, K. et al. MFEprimer-3.0: quality control for PCR primers. Nucleic Acids Res. 47, W610–W613 (2019).
Dreier, M., Berthoud, H., Shani, N., Wechsler, D. & Junier, P. SpeciesPrimer: a bioinformatics pipeline dedicated to the design of qPCR primers for the quantification of bacterial species. PeerJ 8, e8544 (2020).
Yang, L. et al. A tool to automatically design multiplex PCR primer pairs for specific targets using diverse templates. Sci. Rep. 13, 16451 (2023).
Yuan, J. et al. The web-based multiplex PCR primer design software Ultiplex and the associated experimental workflow: up to 100-plex multiplicity. BMC Genom. 22, 835 (2021).
Ghezzi, H. et al. PUPpy: a primer design pipeline for substrain-level microbial detection and absolute quantification. mSphere 9, e00360-24 (2024).
DNA Software; https://www.dnasoftware.com/ (2025).
SantaLucia, J. Jr & Hicks, D. The thermodynamics of DNA structural motifs. Annu. Rev. Biophys. Biomol. Struct. 33, 415–440 (2004).
Untergasser, A. et al. Primer3—new capabilities and interfaces. Nucleic Acids Res. 40, e115 (2012).
Sinai, S. et al. AdaLead: a simple and robust adaptive greedy search algorithm for sequence design. Preprint at https://arxiv.org/abs/2010.02141 (2020).
Goldberg, D. E. Genetic Algorithms in Search, Optimization and Machine Learning (Addison Wesley Publishing Company, 1989).
hCoV-2019/nCoV-2019 Version 3 Amplicon Set (ARTIC, 2020); https://artic.network/resources/ncov/ncov-amplicon-v3.pdf
ARTIC v5.3.2 (ARTIC); https://github.com/quick-lab/SARS-CoV-2/blob/main/400/v5.3.2_400/pooling
Quick, J. et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat. Protoc. 12, 1261–1276 (2017).
Edwards, J. G. et al. Expanded carrier screening in reproductive medicine—points to consider: a joint statement of the American College of Medical Genetics and Genomics, American College of Obstetricians and Gynecologists, National Society of Genetic Counselors, Perinatal Quality Foundation, and Society for Maternal-Fetal Medicine. Obstet. Gynecol. 125, 653–662 (2015).
Goldberg, J. D., Pierson, S. & Johansen Taber, K. Expanded carrier screening: what conditions should we screen for? Prenat. Diagn. 43, 496–505 (2023).
Cabibbe, A. M. et al. Application of targeted next-generation sequencing assay on a portable sequencing platform for culture-free detection of drug-resistant tuberculosis from clinical samples. J. Clin. Microbiol. 58, 10–1128 (2020).
Dookie, N., Khan, A., Padayatchi, N. & Naidoo, K. Application of next generation sequencing for diagnosis and clinical management of drug-resistant tuberculosis: updates on recent developments in the field. Front. Microbiol. 13, 775030 (2022).
Catalogue of Mutations in Mycobacterium tuberculosis Complex and Their Association with Drug Resistance (WHO, 2021); https://www.who.int/publications/i/item/9789240028173
Butler, W. R. & Guthertz, L. S. Mycolic acid analysis by high-performance liquid chromatography for identification of Mycobacterium species. Clin. Microbiol. Rev. 14, 704–726 (2001).
Ni, G. et al. Novel multiplexed amplicon-based sequencing to quantify SARS-CoV-2 RNA from wastewater. Environ. Sci. Technol. Lett. 8, 683–690 (2021).
Vanella, R., Kovacevic, G., Doffini, V., de Santaella, J. F. & Nash, M. A. High-throughput screening, next generation sequencing and machine learning: advanced methods in enzyme engineering. Chem. Commun. 58, 2455–2467 (2022).
Nakatsu, T. et al. Structural basis for the spectral difference in luciferase bioluminescence. Nature 440, 372–376 (2006).
Hashimoto, H. et al. Crystal structure of DNA polymerase from hyperthermophilic archaeon Pyrococcus kodakaraensis KOD1. J. Mol. Biol. 306, 469–477 (2001).
Lunde, B. M., Magler, I. & Meinhart, A. Crystal structures of the Cid1 poly(U) polymerase reveal the mechanism for UTP selectivity. Nucleic Acids Res. 40, 9815–9824 (2012).
Lu, X. et al. Enzymatic DNA synthesis by engineering terminal deoxynucleotidyl transferase. ACS Catal. 12, 2988–2997 (2022).
MGI AlphaTool (MGI); https://www.mgi-tech.com/647 (2024).
Khot, T. et al. Decomposed prompting: a modular approach for solving complex tasks. In Eleventh International Conference on Learning Representations (ICLR, 2023).
Li, C. et al. LLaVA-Med: training a large language-and-vision assistant for biomedicine in one day. Adv. Neural Inf. Process. Syst. 36, 28541–28564 (2023).
Jetson Nano (NVIDIA); https://developer.nvidia.com/embedded/jetson-nano (2019).
Taymans, W., Baker, S., Wingo, A., Bultje, R. S. & Kost, S. GStreamer Application Development Manual (1.2.3); https://gstreamer.freedesktop.org/ (2013).
Hong, Y. et al. 3D-LLM: injecting the 3D world into large language models. Adv. Neural Inf. Process. Syst. 36, 20482–20494 (2023).
Wang, P. et al. Qwen2-VL: enhancing vision-language model's perception of the world at any resolution. Preprint at https://arxiv.org/abs/2409.12191 (2024).
Hu, E. J. et al. LoRA: low-rank adaptation of large language models. In Tenth International Conference on Learning Representations (ICLR, 2022).
Yao, S. et al. Tree of thoughts: deliberate problem solving with large language models. Adv. Neural Inf. Process. Syst. 36, 11809–11822 (2023).
Zelikman, E. et al. Quiet-STaR: language models can teach themselves to think before speaking. Preprint at https://arxiv.org/abs/2403.09629 (2024).
Liu, Z. et al. Inference-time scaling for generalist reward modeling. Preprint at https://arxiv.org/abs/2504.02495 (2025).
Frankish, A. et al. GENCODE: reference annotation for the human and mouse genomes in 2023. Nucleic Acids Res. 51, D942–D949 (2023).
Wright, C. F., FitzPatrick, D. R., Ware, J. S., Rehm, H. L. & Firth, H. V. Importance of adopting standardized MANE transcripts in clinical reporting. Genet. Med. 25, 100331 (2023).
Morales, J. et al. A joint NCBI and EMBL-EBI transcript set for clinical genomics and research. Nature 604, 310–315 (2022).
SARS-CoV-2 Variants Overview (NCBI Virus, 2004–2024); https://www.ncbi.nlm.nih.gov/activ
Katoh, K., Misawa, K., Kuma, K. I. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).
Abdin, M. et al. Phi-3 technical report: a highly capable language model locally on your phone. Preprint at https://arxiv.org/abs/2404.14219 (2024).
Yao, Y. et al. MiniCPM-V: a GPT-4V level MLLM on your phone. Preprint at https://arxiv.org/abs/2408.01800 (2024).
Hui, T. Gene data from Clinvar. figshare https://doi.org/10.6084/m9.figshare.28876808.v4 (2025).
Hui, T. Species identification data from NCBI. figshare https://doi.org/10.6084/m9.figshare.28877087.v1 (2025).
Hui, T. PrimeGen Figs. 2–4 Source Data. figshare https://doi.org/10.6084/m9.figshare.28876844.v1 (2025).
melobio. melobio/PrimeGen: V1.0.1 (V1.0.1). Zenodo https://doi.org/10.5281/zenodo.15279353 (2025).
Acknowledgements
This research was supported by the National Key Research and Development Program of China (2022YFF1202200) of the Ministry of Science and Technology of the People's Republic of China.
Author information
Authors and Affiliations
Contributions
M.Y. conceived the problem and designed all studies. Y.W. assisted with and oversaw the computational pipeline. Y. Hou and H.T. developed the 'search' and 'experiment' agents. H.T., Y. Hou, W.T. and Y.W. developed the 'protocol' agent. L.Y., W.T., H.Z. and X.L. planned and executed the library construction experiments. Y.W. and H.T. developed the 'primer' agent. S. Li, Y. Hou and S.C. developed the LLM 'controller'. Y. Huang set up the cameras for video collection in VLM supervision. L.K. and Y. Hou implemented the anomaly detection module in the 'experiment' agent. Q.H. assisted in developing the 'search' and 'primer' agents. J.W., H.Y., D.Y. and F.M. provided strategic guidance. N.H. provided suggestions for biomedical applications. S. Lin provided suggestions for automation systems. Y.Z. provided suggestions for PCR experiment design. M.Y. and Y.W. wrote the paper, and the other authors contributed revisions.
Corresponding authors
Ethics declarations
Competing interests
J.W., D.Y. and F.M. declare stock holdings in MGI. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Biomedical Engineering thanks Wei Chen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 The dialogue example for pathogen detection.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Extended Data Fig. 2 The dialogue example for protein mutation analysis.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Extended Data Fig. 3 The dialogue example for genetic disorders.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Extended Data Fig. 4 The dialogue example for SNP detection with a reference.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Extended Data Fig. 5 The dialogue example for cancer drug targets.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Extended Data Fig. 6 The dialogue example for whole-genome detection.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Extended Data Fig. 7 The dialogue example for primer redesign.
Profile pictures: pink, the controller; blue, the search agent; white, the primer agent. The user is shown on the right.
Supplementary information
Supplementary Information
Supplementary Notes, Figs. 1–5, Tables 1–8 and references.
Supplementary Data 1
Examples of typical queries and the corresponding retrieved links for the five primer design scenarios.
Supplementary Data 2
The primer pool files and the sequencing analysis data for the four benchmarking approaches for the SARS-CoV-2 task.
Supplementary Data 3
The primer pool file, the 35 associated genes and the sequencing results data for the ECS panel design task.
Supplementary Data 4
The primer pool files for the task of detecting drug resistance mutations in MTB.
Supplementary Data 5
The primer pool files for Gluc, KOD, Cid1 and TdT in the protein mutation detection task.
Supplementary Data 6
The expert-provided seed data and LLM prompts for generating training data for the two-stage fine-tuning of the VLM for detecting abnormal events.
Source data
Source Data Fig. 2
The loss values for each iteration of the panel optimization process across the four benchmarking approaches (LLM, GA, AdaLead and Greedy) in Fig. 2e(i) and Fig. 2e(ii).
Source Data Fig. 3
Statistical source data in Fig. 3.
Source Data Fig. 3
Uncropped gel images of PCR products from SARS-CoV-2 panels for ARTIC, PrimalScheme, PrimeGen and PrimeGen-Primer3; uncropped gel image of PCR products from PrimeGen ECS panel; uncropped gel images of PCR products from PrimeGen MTB panels for round 1 and round 2 design; uncropped gel image of PCR products from PrimeGen mixing (Gluc KODm Cid1 TdT) plasmid panel for round 1 design; uncropped gel image of PCR products from PrimeGen TdT plasmid panel for round 2 design.
Source Data Fig. 4
Benchmarking of the LLM base model for three scenarios: target sequence retrieval, LLM panel optimization and protocol code modification (Fig. 4e).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Hou, Y., Yang, L. et al. Accelerating primer design for amplicon sequencing using large language model-powered agents. Nat. Biomed. Eng. (2025). https://doi.org/10.1038/s41551-025-01455-z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41551-025-01455-z