Error-corrected flow-based sequencing at whole-genome scale and its application to circulating cell-free DNA profiling

Cheng, Alexandre Pellan; Widman, Adam J.; Arora, Anushri; Rusinek, Itai; Sossin, Aaron; Rajagopalan, Srinivas; Midler, Nicholas; Hooper, William F.; Murray, Rebecca M.; Halmos, Daniel; Langanay, Theophile; Chu, Hoyin; Inghirami, Giorgio; Potenski, Catherine; Germer, Soren; Marton, Melissa; Manaa, Dina; Helland, Adrienne; Furatero, Rob; McClintock, Jaime; Winterkorn, Lara; Steinsnyder, Zoe; Wang, Yohyoh; Alimohamed, Asrar I.; Malbari, Murtaza S.; Saxena, Ashish; Callahan, Margaret K.; Frederick, Dennie T.; Spain, Lavinia; Sigouros, Michael; Manohar, Jyothi; King, Abigail; Wilkes, David; Otilano, John; Elemento, Olivier; Mosquera, Juan Miguel; Jaimovich, Ariel; Lipson, Doron; Turajlic, Samra; Zody, Michael C.; Altorki, Nasser K.; Wolchok, Jedd D.; Postow, Michael A.; Robine, Nicolas; Faltas, Bishoy M.; Boland, Genevieve; Landau, Dan A.

doi:10.1038/s41592-025-02648-9

Article
Published: 11 April 2025

Nature Methods volume 22, pages 973–981 (2025)

Abstract

Differentiating sequencing errors from true variants is a central genomics challenge, calling for error suppression strategies that balance costs and sensitivity. For example, circulating cell-free DNA (ccfDNA) sequencing for cancer monitoring is limited by sparsity of circulating tumor DNA, abundance of genomic material in samples and preanalytical error rates. Whole-genome sequencing (WGS) can overcome the low abundance of ccfDNA by integrating signals across the mutation landscape, but higher costs limit its wide adoption. Here, we applied deep (~120×) lower-cost WGS (Ultima Genomics) for tumor-informed circulating tumor DNA detection within the part-per-million range. We further leveraged lower-cost sequencing by developing duplex error-corrected WGS of ccfDNA, achieving 7.7 × 10⁻⁷ error rates, allowing us to assess disease burden in individuals with melanoma and urothelial cancer without matched tumor sequencing. This error-corrected WGS approach will have broad applicability across genomics, allowing for accurate calling of low-abundance variants at efficient cost and enabling deeper mapping of somatic mosaicism as an emerging central aspect of aging and disease.

**Fig. 1: Ultralow ctDNA detection requires deep sequencing coverage and low error rates.**

**Fig. 2: Duplex correction allows ctDNA identification without tumor sequencing.**

**Fig. 3: Mutational signature analysis of cell-free DNA from individuals with urothelial cancer.**

Data availability

The raw genomic sequencing data generated are available from the European Genome–Phenome Archive under dataset accession code EGAD50000001234. Datasets obtained from the PCAWGC (Supplementary Table 11) are available at https://www.icgc-argo.org/. Urothelial cancer tumor/normal alignment files were obtained from Nguyen et al.⁵¹ and were deposited to dbGaP under accession number phs001087.v4.p1.

Code availability

Code and custom scripts are available at https://github.com/alexpcheng/WGSDuplex.

References

Cohen, J. D. et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 359, 926–930 (2018).
Sanz-Garcia, E., Zhao, E., Bratman, S. V. & Siu, L. L. Monitoring and adapting cancer treatment using circulating tumor DNA kinetics: current research, opportunities, and challenges. Sci. Adv. 8, eabi8618 (2022).
Snyder, M. W., Kircher, M., Hill, A. J., Daza, R. M. & Shendure, J. Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164, 57–68 (2016).
Wan, J. C. M. et al. Liquid biopsies come of age: towards implementation of circulating tumour DNA. Nat. Rev. Cancer 17, 223–238 (2017).
Wang, S. et al. Potential clinical significance of a plasma-based KRAS mutation analysis in patients with advanced non-small cell lung cancer. Clin. Cancer Res. 16, 1324–1330 (2010).
Murtaza, M. et al. Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497, 108–112 (2013).
Diehl, F. et al. Circulating mutant DNA to assess tumor dynamics. Nat. Med. 14, 985–990 (2008).
Agarwal, R. et al. Dynamic molecular monitoring reveals that SWI–SNF mutations mediate resistance to ibrutinib plus venetoclax in mantle cell lymphoma. Nat. Med. 25, 119–129 (2019).
Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20, 548–554 (2014).
Newman, A. M. et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat. Biotechnol. 34, 547–555 (2016).
Kurtz, D. M. et al. Enhanced detection of minimal residual disease by targeted sequencing of phased variants in circulating tumor DNA. Nat. Biotechnol. 39, 1537–1547 (2021).
Cohen, J. D. et al. Detection of low-frequency DNA variants by targeted sequencing of the Watson and Crick strands. Nat. Biotechnol. 39, 1220–1227 (2021).
Chaudhuri, A. A. et al. Early detection of molecular residual disease in localized lung cancer by circulating tumor DNA profiling. Cancer Discov. 7, 1394–1403 (2017).
Zviran, A. et al. Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring. Nat. Med. 26, 1114–1124 (2020).
Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early stage lung cancer evolution. Nature 545, 446–451 (2017).
Bettegowda, C. et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci. Transl. Med. 6, 224ra24 (2014).
Gale, D. et al. Residual ctDNA after treatment predicts early relapse in patients with early-stage non-small cell lung cancer. Ann. Oncol. 33, 500–510 (2022).
Tie, J. et al. Circulating tumor DNA analysis guiding adjuvant therapy in stage II colon cancer. N. Engl. J. Med. 386, 2261–2272 (2022).
Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).
Hoang, M. L. et al. Genome-wide quantification of rare somatic mutations in normal human tissues using massively parallel sequencing. Proc. Natl Acad. Sci. USA 113, 9846–9851 (2016).
Abascal, F. et al. Somatic mutation landscapes at single-molecule resolution. Nature 593, 405–410 (2021).
Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012).
Meddeb, R. et al. Quantifying circulating cell-free DNA in humans. Sci. Rep. 9, 5220 (2019).
Widman, A. J. et al. Ultrasensitive plasma-based monitoring of tumor burden using machine-learning-guided signal enrichment. Nat. Med. 30, 1655–1666 (2024).
National Human Genome Research Institute. DNA sequencing costs: data. https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data (2023).
Almogy, G. et al. Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform. Preprint at bioRxiv https://doi.org/10.1101/2022.05.29.493900 (2022).
Simmons, S. K. et al. Mostly natural sequencing-by-synthesis for scRNA-seq using Ultima sequencing. Nat. Biotechnol. 41, 204–211 (2023).
Hasenleithner, S. O. & Speicher, M. R. A clinician’s handbook for using ctDNA throughout the patient journey. Mol. Cancer 21, 81 (2022).
Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
Adalsteinsson, V. A. et al. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 8, 1324 (2017).
Rose Brannon, A. et al. Enhanced specificity of clinical high-sensitivity tumor mutation profiling in cell-free DNA via paired normal sequencing using MSK-ACCESS. Nat. Commun. 12, 3770 (2021).
Costello, M. et al. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res. 41, e67 (2013).
Bratman, S. V. et al. Personalized circulating tumor DNA analysis as a predictive biomarker in solid tumor patients treated with pembrolizumab. Nat. Cancer 1, 873–881 (2020).
Cindy Yang, S. Y. et al. Pan-cancer analysis of longitudinal metastatic tumors reveals genomic alterations and immune landscape dynamics associated with pembrolizumab sensitivity. Nat. Commun. 12, 5137 (2021).
Liu, M. H. et al. DNA mismatch and damage patterns revealed by single-molecule sequencing. Nature 630, 752–761 (2024).
Bae, J. H. et al. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nat. Genet. 55, 871–879 (2023).
Thompson, J. C. et al. Detection of therapeutically targetable driver and resistance mutations in lung cancer patients by next generation sequencing of cell-free circulating tumor DNA. Clin. Cancer Res. 22, 5772–5782 (2016).
Hu, Y. et al. False-positive plasma genotyping due to clonal hematopoiesis. Clin. Cancer Res. 24, 4437–4443 (2018).
Abbosh, C., Swanton, C. & Birkbak, N. J. Clonal haematopoiesis: a source of biological noise in cell-free DNA analyses. Ann. Oncol. 30, 358–359 (2019).
Shaw, J. A. et al. Serial postoperative circulating tumor DNA assessment has strong prognostic value during long-term follow-up in patients with breast cancer. JCO Precis. Oncol. 8, e2300456 (2024).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Osorio, F. G. et al. Somatic mutations reveal lineage relationships and age-related mutagenesis in human hematopoiesis. Cell Rep. 25, 2308–2316 (2018).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).
Jin, H. et al. Accurate and sensitive mutational signature analysis with MuSiCal. Nat. Genet. 56, 541–552 (2024).
Tan, L. et al. Prediction and monitoring of relapse in stage III melanoma using circulating tumor DNA. Ann. Oncol. 30, 804–814 (2019).
Lee, J. H. et al. Pre-operative ctDNA predicts survival in high-risk stage III cutaneous melanoma patients. Ann. Oncol. 30, 815–822 (2019).
Petljak, M. et al. Mechanisms of APOBEC3 mutagenesis in human cancer cells. Nature 607, 799–807 (2022).
Findlay, J. M. et al. Differential clonal evolution in oesophageal cancers in response to neo-adjuvant chemotherapy. Nat. Commun. 7, 11111 (2016).
Boot, A. et al. In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors. Genome Res. 28, 654–665 (2018).
Nguyen, D. D. et al. The interplay of mutagenesis and ecDNA shapes urothelial cancer evolution. Nature 635, 219–228 (2024).
Jiang, H., Lei, R., Ding, S.-W. & Zhu, S. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15, 182 (2014).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Novocraft. NovoSort. A multi-threaded sort/merge for BAM files. https://www.novocraft.com/documentation/novosort-2/
Pedersen, B. S. & Quinlan, A. R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34, 867–868 (2018).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12 (2011).
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Lai, D., Ha, G. & Shah, S. HMMcopy: copy number prediction with correction for GC and mappability bias for HTS data. Bioconductor version: release (3.15). https://doi.org/10.18129/B9.bioc.HMMcopy (2022).
Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE Blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).
Kent, W. J. et al. The Human Genome Browser at UCSC. Genome Res. 12, 996–1006 (2002).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Rosenthal, R., McGranahan, N., Herrero, J., Taylor, B. S. & Swanton, C. deconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 17, 31 (2016).

Acknowledgements

We thank the participants and their families for contributing plasma and tissue for this study. We also thank H. R. He at Weill Cornell, J. Park and all members of the laboratory of D.A.L., the New York Genome Center computational biology team, especially M. Shah, and the New York Genome Center research sequencing laboratory for thoughtful discussions throughout this work. This work was supported by the Mark Foundation Emerging Leader Award, the Vallee Scholar Award, the Burroughs Wellcome Fund Career Award for Medical Scientists, a National Cancer Institute R01 grant (R01-CA266619-01) and the Melanoma Research Alliance Established Investigator Award (D.A.L.). A.P.C. received support from the American Cancer Society Postdoctoral Fellowship program. Memorial Sloan Kettering Cancer Center investigators are supported by Cancer Center Support Grant P30 CA08748 from the National Institutes of Health/National Cancer Institute. A.J.W. received support from the Conquer Cancer Foundation Young Investigator Award, the Melanoma Research Alliance Young Investigator Award and the NCI K08 Mentored Career Scientist Award (K08 CA263301-03). D.A.L. is a Scholar of the Leukemia and Lymphoma Society. This work was made possible by the MacMillan Family Foundation and the MacMillan Center for the Study of the Non-Coding Cancer Genome at the New York Genome Center. The opinions, results and conclusions reported in this paper are those of the authors and are independent from these funding sources.

Author information

These authors contributed equally: Alexandre Pellan Cheng, Adam J. Widman, Anushri Arora.
These authors jointly supervised this work: Bishoy M. Faltas, Genevieve Boland, Dan A. Landau.

Authors and Affiliations

New York Genome Center, New York, NY, USA
Alexandre Pellan Cheng, Adam J. Widman, Anushri Arora, Aaron Sossin, Srinivas Rajagopalan, Nicholas Midler, William F. Hooper, Rebecca M. Murray, Daniel Halmos, Theophile Langanay, Catherine Potenski, Soren Germer, Melissa Marton, Dina Manaa, Adrienne Helland, Rob Furatero, Jaime McClintock, Lara Winterkorn, Zoe Steinsnyder, Yohyoh Wang, Michael C. Zody, Nicolas Robine & Dan A. Landau
Sandra and Edward Meyer Cancer Center, Weill Cornell Medicine, Weill Cornell Medical College, New York, NY, USA
Alexandre Pellan Cheng, Anushri Arora, Aaron Sossin, Srinivas Rajagopalan, Nicholas Midler, Rebecca M. Murray, Daniel Halmos, Theophile Langanay, Hoyin Chu, Giorgio Inghirami, Catherine Potenski, Yohyoh Wang, Murtaza S. Malbari, Ashish Saxena, Nasser K. Altorki, Jedd D. Wolchok, Michael A. Postow, Bishoy M. Faltas & Dan A. Landau
Département de Génie des Systèmes, École de Technologie Supérieure, Montréal, Québec, Canada
Alexandre Pellan Cheng
Axe Cancer, Centre de Recherche du Centre Hospitalier de l’Université de Montréal (CRCHUM), Montréal, Québec, Canada
Alexandre Pellan Cheng
Memorial Sloan Kettering Cancer Center, New York, NY, USA
Adam J. Widman, Margaret K. Callahan & Michael A. Postow
Ultima Genomics, Fremont, CA, USA
Itai Rusinek, Ariel Jaimovich & Doron Lipson
Mass General Cancer Center, Massachusetts General Hospital, Boston, MA, USA
Asrar I. Alimohamed, Dennie T. Frederick & Genevieve Boland
Cancer Dynamics Laboratory, The Francis Crick Institute, London, UK
Lavinia Spain & Samra Turajlic
Renal and Skin Unit, The Royal Marsden NHS Foundation Trust, London, UK
Lavinia Spain & Samra Turajlic
Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA
Michael Sigouros, Jyothi Manohar, Abigail King, David Wilkes, John Otilano, Olivier Elemento, Juan Miguel Mosquera & Bishoy M. Faltas
Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
Olivier Elemento & Juan Miguel Mosquera
Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
Juan Miguel Mosquera
Parker Institute for Cancer Immunotherapy, San Francisco, CA, USA
Jedd D. Wolchok
Ludwig Institute for Cancer Research, New York, NY, USA
Jedd D. Wolchok
Department of Cell and Developmental Biology, Weill Cornell Medicine, New York, NY, USA
Bishoy M. Faltas

Contributions

D.A.L., A.P.C., A.J.W., G.B. and B.M.F. conceived and designed the project. D.A.L., G.B. and B.M.F. served as lead principal investigators. A.J.W., A.I.A., M.S.M., A. Saxena, M.K.C., D.T.F., L.S., M.S., J.M., A.K., S.T., D.W., J.O., O.E., J.M.M., N.K.A., J.D.W., M.A.P., G.B. and B.M.F. performed participant selection, curated participant data and prepared samples for sequencing. G.I. provided mouse PDX samples. M.M., D.M., A.H., R.F., J.M., Z.S. and L.W. performed library preparation and sequencing. A.P.C., A.A., I.R., A. Sossin, S.R., N.M., W.F.H., R.M.M., D.H., T.L., H.C., S.G., M.C.Z., N.R., Y.W., A.J. and D.L. performed data analysis. A.P.C., C.P. and D.A.L. wrote the manuscript with comments and contributions from all authors.

Corresponding authors

Correspondence to Alexandre Pellan Cheng or Dan A. Landau.

Ethics declarations

Competing interests

A.P.C. and D.A.L. have filed a provisional patent regarding certain aspects of this manuscript. D.A.L. and A.J.W. have also filed two additional patent applications regarding work presented in this manuscript. A.P.C. is listed as an inventor on submitted patents pertaining to cell-free DNA (US patent applications 63/237,367, 63/056,249, 63/015,095 and 16/500,929) and receives consulting fees from Eurofins Viracor and has received conference travel support from Ultima Genomics. I.R. and A.J. are employees and shareholders of Ultima Genomics. D.L. is a shareholder of Ultima Genomics. G.I. has received consulting fees from Daiichi Sankyo. J.D.W. is a consultant for Apricity, Ascentage Pharma, Bicara Therapeutics, Bristol Myers Squibb, Daiichi Sankyo, Dragonfly, Imvaq, Larkspur, Psioxus, Takeda, Tizona, Trishula Therapeutics, Immunocore – Data Safety board and Scancell; reports grant and research support from Bristol Myers Squibb and Enterome; has equity in Apricity, Arsenal IO/Cell Carta, Ascentage, Imvaq, Linneaus, Georgiamune, Takeda, Tizona Pharmaceuticals and Xenimmune; and is an inventor on the following patents: Xenogeneic DNA Vaccines; Newcastle Disease viruses for Cancer Therapy; Myeloid-derived suppressor cell (MDSC) assay; Prediction of Responsiveness to Treatment with Immunomodulatory Therapeutics and Method of Monitoring Abscopal Effects during such Treatment; Anti-PD1 Antibody; Anti-CTLA4 antibodies; Anti-GITR antibodies and methods of use thereof; CD40 binding molecules and uses thereof. A. Saxena receives research funding from AstraZeneca, has served on Advisory Boards for G1 Therapeutics, Boehringer Ingelheim, Novocure, InxMed, Bristol Myers Squibb and Galvanize Therapeutics, and as a consultant for Galvanize Therapeutics. M.A.P. has received consulting fees from Bristol Myers Squibb, Merck, Novartis, Eisai, Pfizer, Lyvgen and Chugai and has received institutional support from RGenix, Merck Infinity, Bristol Myers Squibb, Merck and Novartis. M.K.C. has received consulting fees from Bristol Myers Squibb, Merck, InCyte, Moderna, ImmunoCore and AstraZeneca and receives institutional support from Bristol Myers Squibb. S.T. is funded by Cancer Research UK (grant reference number A29911); the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC10988), the UK Medical Research Council (FC10988) and the Wellcome Trust (FC10988); the National Institute for Health Research Biomedical Research Centre at the Royal Marsden Hospital and Institute of Cancer Research (grant reference number A109), the Royal Marsden Cancer Charity, The Rosetrees Trust (grant reference number A2204), Ventana Medical Systems (grant reference numbers 10467 and 10530), the National Institute of Health (U01 CA247439) and Melanoma Research Alliance (686061). S.T. has received speaking fees from Roche, AstraZeneca, Novartis and Ipsen. S.T. has the following patents filed: Indel mutations as a therapeutic target and predictive biomarker PCTGB2018/051892 and PCTGB2018/051893. G.B. has sponsored research agreements through her institution with Olink Proteomics, Teiko Bio, InterVenn Biosciences and Palleon Pharmaceuticals; served on advisory boards for Iovance, Merck, Nektar Therapeutics, Novartis and Ankyra Therapeutics; consulted for Merck, InterVenn Biosciences and Ankyra Therapeutics and holds equity in Ankyra Therapeutics. B.M.F. is on the advisory boards for Astrin Bioscience, Natera, Guardant, Janssen, Gilead, Merck, Immunomedics and QED Therapeutics, is a consultant for QED Therapeutics, Astra Biosciences and BostonGene and obtains patent royalties from Immunomedics and Gilead, honoraria from Urotoday and Axiom Healthcare Strategies and research support from Eli Lilly. B.M.F. reports support from the NIH, DoD-CDMRP, Starr Cancer Consortium and the P-1000 Consortium. D.A.L. is on the Scientific Advisory Board of Mission Bio, Pangea, Alethiomics and Veracyte, and has received prior research funding support from Illumina, Ultima Genomics, Celgene, 10x Genomics and Oxford Nanopore Technologies. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Andrew Lawson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Lei Tang and Hui Hua, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Ultima and Illumina sequencing datasets of human-mapped reads in mouse PDX datasets (n = 3).

A Homopolymer size estimation of bases between two PCR duplicates (all samples combined) in Ultima datasets. B Homopolymer size estimation of bases between a read and the aligned reference (all samples combined) in Ultima datasets. C Homopolymer size estimation of bases between two PCR duplicates (all samples combined) in Illumina datasets. D Homopolymer size estimation of bases between a read and the aligned reference (all samples combined) in Illumina datasets. E Indel calling accuracy by PCR duplicate family sizes in Ultima datasets (n = 3 in each boxplot). F Indel calling accuracy of Illumina sequencing reads (for single family reads, n = 3 in each boxplot). G Frequency of homopolymer sizes across the human genome. For boxplots in (E) and (F), the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR. Accuracy in (E) and (F) is defined as the number of correct homopolymer assignments in individual sequencing reads divided by the occurrences of that homopolymer size in the human genome in all sequenced reads.

Extended Data Fig. 2 Flow-based sequencing provides predictable error-robust motifs.

A Single-nucleotide variant analysis of matched Ultima and Illumina sequencing datasets across 96 trinucleotide contexts. Cycle shift motifs (described in B) are indicated by plus signs. B Left: Example sequencing of a TGC trinucleotide in flowspace. Given a flow order of T > G > C > A, one full flow cycle of each nucleotide should provide a 1 > 1 > 1 > 0 signal. Top, right: Example of how a T[G > A]C alt disrupts the cycles in flow space basecalling. Two sequencing cycles are required to fully resolve a TAC sequencing motif. We refer to these types of motifs as cycle shift motifs. Bottom, right: Example of how a T[G > C]C variant does not affect the cycles of flow space basecalling. C Error rates in Ultima and Illumina sequencing datasets for trinucleotide variants that alter the flowspace sequencing cycle (n = 120 in the cycle shift motif boxplots (blue), corresponding to the 40 trinucleotide variants that are classified as cycle shift motifs across 3 mouse PDX plasma samples. n = 168 in the non cycle shift motif boxplots (red), corresponding to the 56 trinucleotide variants that are not classified as cycle shift motifs across 3 mouse PDX plasma samples). P-values were measured using a two-sided Wilcoxon test. Error bars in (A) represent the standard error of the mean. For boxplots in (C), the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR.
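
The cycle shift logic described in panel B can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' code: the names `flow_signal`, `is_cycle_shift` and `count_cycle_shift_motifs` are hypothetical, the flow order T > G > C > A is taken from the caption, and a variant is assumed to be a cycle shift when it changes the number of flow cycles needed to read the trinucleotide.

```python
from itertools import product

FLOW_ORDER = "TGCA"  # flow order T > G > C > A, as stated in the caption


def flow_signal(seq, flow_order=FLOW_ORDER):
    """Return per-flow incorporation counts, padded to whole flow cycles."""
    if any(base not in flow_order for base in seq):
        raise ValueError(f"sequence {seq!r} has bases outside {flow_order!r}")
    signal, pos = [], 0
    while pos < len(seq):
        for base in flow_order:  # one full cycle of flows
            run = 0
            while pos < len(seq) and seq[pos] == base:
                run += 1  # a homopolymer run is read within a single flow
                pos += 1
            signal.append(run)
    return signal


def is_cycle_shift(ref, alt):
    """Assumed definition: the variant changes the number of flow cycles."""
    return len(flow_signal(ref)) != len(flow_signal(alt))


def count_cycle_shift_motifs():
    """Count cycle shifts among the 96 pyrimidine-centered substitutions."""
    count = 0
    for left, ref, right in product("ACGT", "CT", "ACGT"):
        for alt in "ACGT":
            if alt != ref and is_cycle_shift(left + ref + right, left + alt + right):
                count += 1
    return count


print(flow_signal("TGC"))            # [1, 1, 1, 0]
print(flow_signal("TAC"))            # [1, 0, 0, 1, 0, 0, 1, 0]
print(is_cycle_shift("TGC", "TAC"))  # True
print(is_cycle_shift("TGC", "TCC"))  # False
```

Under these assumptions, enumerating the 96 pyrimidine-centered substitutions with `count_cycle_shift_motifs()` gives 40 cycle shift motifs, the same count reported for panel C.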

Extended Data Fig. 3 Tradeoffs between deep-targeted sequencing and modest whole-genome sequencing for ctDNA detection.

A Mutational burden (number of SNVs) of 22 cancer types retrieved from the Pan Cancer Analysis of Whole Genomes consortium. The numbers along the x-axis represent the number of tumors analyzed per cancer type. B Median ctDNA detection opportunities using a whole-genome approach with 10x sequencing coverage, a 10-target panel at 10,000x coverage and a 1-target panel at 10,000x coverage. The pink shaded area represents tumor types for which targeting only a few sites may offer benefit over whole-genome sequencing. The blue shaded area represents tumor types for which a whole-genome approach will offer more opportunities to detect ctDNA over targeted panels. The lower and upper ends of the boxplots in (A) represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR.
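
The tradeoff in panel B can be approximated with simple arithmetic. The sketch below assumes that detection opportunities scale as the number of interrogated tumor SNV sites multiplied by sequencing coverage. This is a simplification introduced here for illustration and may not match the exact model behind the figure. The function names are hypothetical.

```python
def wgs_opportunities(n_snvs, coverage=10):
    """Whole-genome approach: every tumor SNV is covered at modest depth."""
    return n_snvs * coverage


def panel_opportunities(n_snvs, n_targets, coverage=10_000):
    """Targeted panel: at most `n_targets` SNVs are covered at high depth."""
    return min(n_snvs, n_targets) * coverage


# Low-burden tumor (1,000 SNVs): the 10-target panel offers more opportunities.
low_wgs = wgs_opportunities(1_000)             # 10,000
low_panel = panel_opportunities(1_000, 10)     # 100,000

# High-burden tumor (50,000 SNVs): the whole-genome approach offers more.
high_wgs = wgs_opportunities(50_000)           # 500,000
high_panel = panel_opportunities(50_000, 10)   # 100,000
```

Under these assumptions, 10x whole-genome coverage matches a 10-target panel at 10,000x when the tumor carries 10,000 SNVs, and matches a 1-target panel at 1,000 SNVs.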

Extended Data Fig. 4 Circulating tumor DNA cost and coverage analysis between Illumina and Ultima sequencing in a matched sample.

Areas under the curve (AUCs) are measured by calculating the area under a receiver operating characteristic curve comparing a given group (for example, Illumina 20x at 10⁻⁶ expected tumor fraction) to its platform- and coverage-matched cancer-free control (for example, Illumina 20x, expected tumor fraction of 0). All AUCs at expected tumor fractions of 10⁻⁴ and greater were 1.00. The Z-score of a given sample is calculated against its coverage- and platform-matched cancer-free control (expected tumor fraction of 0).
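
As a minimal sketch of these two summary statistics, the code below computes an AUC through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control) and a Z-score against matched controls. The per-replicate score being compared, and the use of the sample standard deviation, are assumptions made for illustration. The values shown are invented toy numbers, not study data.

```python
from statistics import mean, stdev


def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def z_score(value, control_values):
    """Standardize `value` against the matched cancer-free control values."""
    return (value - mean(control_values)) / stdev(control_values)


controls = [2.0, 4.0, 6.0]   # toy scores at expected tumor fraction 0
cases = [5.0, 7.0, 9.0]      # toy scores at a nonzero expected tumor fraction

auc = roc_auc(cases, controls)   # 8 of 9 case-control pairs favor the case
z = z_score(10.0, controls)      # (10 - 4) / 2 = 3
```

An AUC of 1.00, as reported for expected tumor fractions of 10⁻⁴ and greater, corresponds to every case replicate scoring above every matched control replicate.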

Extended Data Fig. 5 Variant allele frequencies for variants across denoising approaches.

Variant allele frequencies (calculated using unfiltered sequencing reads) at positions where a variant was found using UMI-agnostic denoised reads, single-strand corrected reads and duplex-corrected reads. Allele frequencies of 0.2 and below are colored in red.

Extended Data Fig. 6 Comparison of detected UV-derived mutations using duplex, single-strand and UMI-agnostic denoising methods.

A Cosine similarities by cancer stage at baseline timepoints (pre-treatment or pre-surgery) for UV and CH-associated signatures. B Comparison of duplex, single-strand and UMI-agnostic denoising methods to detect melanoma-associated variants using a single-read variant calling pipeline for pre-treatment plasma samples from melanoma patients (top) and cancer-free controls (bottom). P-values were measured using a two-sided Wilcoxon test. For all boxplots, the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR.
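
The cosine similarity in panel A compares a sample's mutation profile with a reference signature. The sketch below is a generic implementation of that measure. It uses 4-channel toy vectors for brevity, whereas SNV signature profiles are typically defined over the 96 trinucleotide contexts. The vectors are invented for illustration and are not study data.

```python
import math


def cosine_similarity(profile, signature):
    """Cosine of the angle between two equal-length, non-negative vectors."""
    if len(profile) != len(signature):
        raise ValueError("profile and signature must have the same length")
    dot = sum(p * s for p, s in zip(profile, signature))
    norm_p = math.sqrt(sum(p * p for p in profile))
    norm_s = math.sqrt(sum(s * s for s in signature))
    if norm_p == 0 or norm_s == 0:
        return 0.0  # an empty profile carries no signature information
    return dot / (norm_p * norm_s)


sample_counts = [3, 0, 4, 0]          # toy mutation counts per channel
matching_sig = [0.6, 0.0, 0.8, 0.0]   # same shape as the sample, rescaled
unrelated_sig = [0.0, 1.0, 0.0, 0.0]  # no channels in common

same = cosine_similarity(sample_counts, matching_sig)       # close to 1
different = cosine_similarity(sample_counts, unrelated_sig)  # exactly 0
```

Because the measure depends only on the shape of the profile, raw counts can be compared with a normalized signature without rescaling.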

Extended Data Fig. 7 Tumor-agnostic copy-number based tumor fraction estimation in stage III and IV melanoma and cancer-free control samples.

Samples include cancer-free controls (n = 10); stage III melanoma (pre-surgery; n = 10) and stage IV melanoma (pre-treatment; n = 4). Dotted line at 0.03 represents the limit of detection of ichorCNA. For boxplots, the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median.

Extended Data Fig. 8 ctDNA dynamics throughout treatment in melanoma patients.

A Changes in circulating tumor DNA (increase or decrease) relative to the earliest sampled timepoint. Solid lines represent patients with recurrence or progressive disease, and dashed lines represent patients with either partial response or who are recurrence-free following treatment. Closed and open circles represent samples with and without detected ctDNA, respectively. B Difference in ctDNA relative to the pre-treatment timepoint stratified by clinical outcome. One sample did not have a pre-treatment timepoint available (MEL-15; progressive disease) and so a day 9 post-treatment timepoint was used as baseline. For boxplots in (B), the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR. P-values were calculated using a two-sided Wilcoxon test.

Extended Data Fig. 9 Major signature contributions from urothelial cancer patients’ tumors measured through whole-genome sequencing.

Top: total mutation counts per sequenced tumor. Bottom: signature contributions. Trinucleotide frequencies were fit to the entire COSMIC database (v.3.3). When a patient had two or more tumors (B01, B04, B15, B16, B17, B18, B19), we measured signature contributions of mutations that were present in two or more tumors and thereby likely reflect mutations that arise earlier in tumor evolution.
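
The selection of mutations present in two or more tumors from the same patient can be sketched as a simple counting step. The example assumes each tumor's calls are represented as a set of hypothetical `(chromosome, position, ref, alt)` tuples. The tumor labels and variants are invented for illustration, and the function name is not from the study's code.

```python
from collections import Counter


def shared_mutations(tumor_calls, min_tumors=2):
    """Return variants observed in at least `min_tumors` tumors of a patient."""
    counts = Counter()
    for variants in tumor_calls.values():
        counts.update(set(variants))  # count each variant once per tumor
    return {variant for variant, n in counts.items() if n >= min_tumors}


# Toy calls for one patient with three sequenced tumors.
patient_calls = {
    "tumor_1": {("chr1", 100, "C", "T"), ("chr2", 200, "G", "A")},
    "tumor_2": {("chr1", 100, "C", "T"), ("chr3", 300, "T", "C")},
    "tumor_3": {("chr1", 100, "C", "T"), ("chr3", 300, "T", "C")},
}
early = shared_mutations(patient_calls)
```

Here `early` contains the chr1 and chr3 variants, while the chr2 variant, seen in a single tumor, is excluded before any signature fitting.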

Supplementary information

Supplementary Information

Supplementary Note, Supplementary Figs. 1–12 and references.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–11.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cheng, A.P., Widman, A.J., Arora, A. et al. Error-corrected flow-based sequencing at whole-genome scale and its application to circulating cell-free DNA profiling. Nat Methods 22, 973–981 (2025). https://doi.org/10.1038/s41592-025-02648-9

Received: 21 November 2022
Accepted: 04 March 2025
Published: 11 April 2025
Issue date: May 2025
DOI: https://doi.org/10.1038/s41592-025-02648-9
