Abstract
Efficient methods for generating specific mutations enable the study of functional variation in natural populations and drive advances in genetic engineering applications. Here, we present a new approach, mutagenesis by template-guided amplicon assembly (MEGAA), for the rapid construction of kilobase-sized DNA variants. With this method, many mutations can be introduced into a DNA template simultaneously and predictably, at more than 90% efficiency per target. We devised a robust, iterative protocol for an open-source laboratory automation robot that enables desktop production and long-read sequencing validation of variants. Using this system, we demonstrated the construction of 31 natural SARS-CoV-2 spike gene variants and 10 recoded Escherichia coli genome fragments, with each 4-kb region containing up to 150 mutations. Furthermore, 125 defined combinatorial adeno-associated virus-2 cap gene variants were readily built using the system; these variants exhibited viral packaging enhancements of up to 10-fold compared with wild type. Thus, the MEGAA platform enables quick, inexpensive, and scalable generation of multi-site sequence variants for diverse applications in biotechnology.
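To illustrate why per-target efficiency and an iterative protocol matter at the scale of up to 150 mutations per region, the following minimal sketch uses a simplified model that is an assumption for illustration and is not taken from the paper: each target site converts independently with the same per-round efficiency, and each additional round acts only on sites that remain unconverted. The function name `fraction_fully_mutated` and its parameters are hypothetical.

```python
def fraction_fully_mutated(per_site_efficiency: float, n_sites: int, rounds: int = 1) -> float:
    """Estimate the fraction of molecules carrying all intended mutations.

    Illustrative model only (assumed, not the authors' analysis):
    sites are independent, and every round converts each still-unconverted
    site with probability `per_site_efficiency`.
    """
    if not 0.0 <= per_site_efficiency <= 1.0:
        raise ValueError("per_site_efficiency must lie in [0, 1]")
    if n_sites < 0 or rounds < 1:
        raise ValueError("n_sites must be >= 0 and rounds must be >= 1")
    # Probability that a single site is converted after the given number of rounds.
    per_site_after_rounds = 1.0 - (1.0 - per_site_efficiency) ** rounds
    # All sites must be converted on the same molecule.
    return per_site_after_rounds ** n_sites


if __name__ == "__main__":
    # 0.9 mirrors the ">90% per target" figure in the abstract; 150 is the
    # stated upper bound on mutations per 4-kb region.
    for n_sites in (1, 10, 50, 150):
        for rounds in (1, 2, 3):
            frac = fraction_fully_mutated(0.9, n_sites, rounds)
            print(f"sites={n_sites:3d} rounds={rounds} fully mutated fraction={frac:.3g}")
```

Under these assumptions, a single round at 90% per-site efficiency leaves very few molecules carrying all 150 mutations, whereas repeated rounds raise that fraction sharply. Real efficiencies may vary by site and by round, so the numbers are indicative only.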
Data availability
Processed packaging efficiency data from a previous study were obtained from GitHub (https://github.com/churchlab/AAV_fitness_landscape) to identify insertion sites with potentially enhanced packaging efficiency for the design of AAV variants, and to correlate with the packaging efficiencies obtained in this study. The sequencing data generated in this study have been submitted to the NCBI BioProject database under accession number PRJNA834093. Source data are provided with this paper.
Code availability
Scripts used for Oxford Nanopore sequencing data analysis can be accessed at https://github.com/wanglabcumc/MEGAAdt.
Acknowledgements
The authors thank G. Urtecho, J. Qian and L. Huang for technical support, and K. Beiswenger and other members of the Wang laboratory for advice and comments on the manuscript. H.H.W. acknowledges funding support from the NSF (MCB-2032259), DOE (47879/SCW1710), NIH (1R01DK118044, 1R01EB031935, 2R01AI132403, 75N93021C00014), ONR (N00014-17-1-2353), Burroughs Wellcome Fund (1016691), Irma T. Hirschl Trust and Schaefer Research Award.
Author information
Contributions
L.L. and H.H.W. developed the initial concept; L.L. and Y.H. performed the experiments and analyzed the data with input from H.H.W.; Y.H. developed the software and genomic data analysis pipeline. L.L. and H.H.W. wrote the manuscript. All other authors discussed the results and approved the manuscript.
Ethics declarations
Competing interests
H.H.W. is a scientific advisor of SNIPR Biome, Kingdom Supercultures, Fitbiomics, Arranta Bio, VecX Biomedicines and Genus PLC, and a scientific co-founder of Aclid, none of which was involved in the study. A patent application on methods described in this paper has been filed by Columbia University. All other authors have no competing interests.
Peer review
Peer review information
Nature Methods thanks Kaihang Wang, Hongzhou Gu, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Lei Tang and Madhura Mukhopadhyay, in collaboration with the Nature Methods team.
Supplementary information
Supplementary Information
Supplementary Figs. 1–18 and Supplementary Methods.
Supplementary Tables
Supplementary Tables 1–7.
Source data
Source Data Fig. 1
Unprocessed gel images of Fig. 1b and Supplementary Fig. 3b.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, L., Huang, Y. & Wang, H.H. Fast and efficient template-mediated synthesis of genetic variants. Nat Methods 20, 841–848 (2023). https://doi.org/10.1038/s41592-023-01868-1
DOI: https://doi.org/10.1038/s41592-023-01868-1