  • Article

Error-corrected flow-based sequencing at whole-genome scale and its application to circulating cell-free DNA profiling

Abstract

Differentiating sequencing errors from true variants is a central genomics challenge, calling for error suppression strategies that balance costs and sensitivity. For example, circulating cell-free DNA (ccfDNA) sequencing for cancer monitoring is limited by sparsity of circulating tumor DNA, abundance of genomic material in samples and preanalytical error rates. Whole-genome sequencing (WGS) can overcome the low abundance of ccfDNA by integrating signals across the mutation landscape, but higher costs limit its wide adoption. Here, we applied deep (~120×) lower-cost WGS (Ultima Genomics) for tumor-informed circulating tumor DNA detection within the part-per-million range. We further leveraged lower-cost sequencing by developing duplex error-corrected WGS of ccfDNA, achieving 7.7 × 10^−7 error rates, allowing us to assess disease burden in individuals with melanoma and urothelial cancer without matched tumor sequencing. This error-corrected WGS approach will have broad applicability across genomics, allowing for accurate calling of low-abundance variants at efficient cost and enabling deeper mapping of somatic mosaicism as an emerging central aspect of aging and disease.

Fig. 1: Ultralow ctDNA detection requires deep sequencing coverage and low error rates.
Fig. 2: Duplex correction allows ctDNA identification without tumor sequencing.
Fig. 3: Mutational signature analysis of cell-free DNA from individuals with urothelial cancer.

Data availability

The raw genomic sequencing data generated are available from the European Genome–Phenome Archive under dataset accession code EGAD50000001234. Datasets obtained from the PCAWGC (Supplementary Table 11) are available at https://www.icgc-argo.org/. Urothelial cancer tumor/normal alignment files were obtained from Nguyen et al. (ref. 51) and were deposited to dbGaP under accession number phs001087.v4.p1.

Code availability

Code and custom scripts are available at https://github.com/alexpcheng/WGSDuplex.

References

  1. Cohen, J. D. et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 359, 926–930 (2018).

  2. Sanz-Garcia, E., Zhao, E., Bratman, S. V. & Siu, L. L. Monitoring and adapting cancer treatment using circulating tumor DNA kinetics: current research, opportunities, and challenges. Sci. Adv. 8, eabi8618 (2022).

  3. Snyder, M. W., Kircher, M., Hill, A. J., Daza, R. M. & Shendure, J. Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164, 57–68 (2016).

  4. Wan, J. C. M. et al. Liquid biopsies come of age: towards implementation of circulating tumour DNA. Nat. Rev. Cancer 17, 223–238 (2017).

  5. Wang, S. et al. Potential clinical significance of a plasma-based KRAS mutation analysis in patients with advanced non-small cell lung cancer. Clin. Cancer Res. 16, 1324–1330 (2010).

  6. Murtaza, M. et al. Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497, 108–112 (2013).

  7. Diehl, F. et al. Circulating mutant DNA to assess tumor dynamics. Nat. Med. 14, 985–990 (2008).

  8. Agarwal, R. et al. Dynamic molecular monitoring reveals that SWI–SNF mutations mediate resistance to ibrutinib plus venetoclax in mantle cell lymphoma. Nat. Med. 25, 119–129 (2019).

  9. Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20, 548–554 (2014).

  10. Newman, A. M. et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat. Biotechnol. 34, 547–555 (2016).

  11. Kurtz, D. M. et al. Enhanced detection of minimal residual disease by targeted sequencing of phased variants in circulating tumor DNA. Nat. Biotechnol. 39, 1537–1547 (2021).

  12. Cohen, J. D. et al. Detection of low-frequency DNA variants by targeted sequencing of the Watson and Crick strands. Nat. Biotechnol. 39, 1220–1227 (2021).

  13. Chaudhuri, A. A. et al. Early detection of molecular residual disease in localized lung cancer by circulating tumor DNA profiling. Cancer Discov. 7, 1394–1403 (2017).

  14. Zviran, A. et al. Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring. Nat. Med. 26, 1114–1124 (2020).

  15. Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early stage lung cancer evolution. Nature 545, 446–451 (2017).

  16. Bettegowda, C. et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci. Transl. Med. 6, 224ra24 (2014).

  17. Gale, D. et al. Residual ctDNA after treatment predicts early relapse in patients with early-stage non-small cell lung cancer. Ann. Oncol. 33, 500–510 (2022).

  18. Tie, J. et al. Circulating tumor DNA analysis guiding adjuvant therapy in stage II colon cancer. N. Engl. J. Med. 386, 2261–2272 (2022).

  19. Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).

  20. Hoang, M. L. et al. Genome-wide quantification of rare somatic mutations in normal human tissues using massively parallel sequencing. Proc. Natl Acad. Sci. USA 113, 9846–9851 (2016).

  21. Abascal, F. et al. Somatic mutation landscapes at single-molecule resolution. Nature 593, 405–410 (2021).

  22. Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012).

  23. Meddeb, R. et al. Quantifying circulating cell-free DNA in humans. Sci. Rep. 9, 5220 (2019).

  24. Widman, A. J. et al. Ultrasensitive plasma-based monitoring of tumor burden using machine-learning-guided signal enrichment. Nat. Med. 30, 1655–1666 (2024).

  25. National Human Genome Research Institute. DNA sequencing costs: data. https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Costs-Data (2023).

  26. Almogy, G. et al. Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform. Preprint at bioRxiv https://doi.org/10.1101/2022.05.29.493900 (2022).

  27. Simmons, S. K. et al. Mostly natural sequencing-by-synthesis for scRNA-seq using Ultima sequencing. Nat. Biotechnol. 41, 204–211 (2023).

  28. Hasenleithner, S. O. & Speicher, M. R. A clinician’s handbook for using ctDNA throughout the patient journey. Mol. Cancer 21, 81 (2022).

  29. Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).

  30. Adalsteinsson, V. A. et al. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 8, 1324 (2017).

  31. Rose Brannon, A. et al. Enhanced specificity of clinical high-sensitivity tumor mutation profiling in cell-free DNA via paired normal sequencing using MSK-ACCESS. Nat. Commun. 12, 3770 (2021).

  32. Costello, M. et al. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res. 41, e67 (2013).

  33. Bratman, S. V. et al. Personalized circulating tumor DNA analysis as a predictive biomarker in solid tumor patients treated with pembrolizumab. Nat. Cancer 1, 873–881 (2020).

  34. Cindy Yang, S. Y. et al. Pan-cancer analysis of longitudinal metastatic tumors reveals genomic alterations and immune landscape dynamics associated with pembrolizumab sensitivity. Nat. Commun. 12, 5137 (2021).

  35. Liu, M. H. et al. DNA mismatch and damage patterns revealed by single-molecule sequencing. Nature 630, 752–761 (2024).

  36. Bae, J. H. et al. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nat. Genet. 55, 871–879 (2023).

  37. Thompson, J. C. et al. Detection of therapeutically targetable driver and resistance mutations in lung cancer patients by next generation sequencing of cell-free circulating tumor DNA. Clin. Cancer Res. 22, 5772–5782 (2016).

  38. Hu, Y. et al. False-positive plasma genotyping due to clonal hematopoiesis. Clin. Cancer Res. 24, 4437–4443 (2018).

  39. Abbosh, C., Swanton, C. & Birkbak, N. J. Clonal haematopoiesis: a source of biological noise in cell-free DNA analyses. Ann. Oncol. 30, 358–359 (2019).

  40. Shaw, J. A. et al. Serial postoperative circulating tumor DNA assessment has strong prognostic value during long-term follow-up in patients with breast cancer. JCO Precis. Oncol. 8, e2300456 (2024).

  41. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).

  42. Osorio, F. G. et al. Somatic mutations reveal lineage relationships and age-related mutagenesis in human hematopoiesis. Cell Rep. 25, 2308–2316 (2018).

  43. Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).

  44. Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).

  45. Jin, H. et al. Accurate and sensitive mutational signature analysis with MuSiCal. Nat. Genet. 56, 541–552 (2024).

  46. Tan, L. et al. Prediction and monitoring of relapse in stage III melanoma using circulating tumor DNA. Ann. Oncol. 30, 804–814 (2019).

  47. Lee, J. H. et al. Pre-operative ctDNA predicts survival in high-risk stage III cutaneous melanoma patients. Ann. Oncol. 30, 815–822 (2019).

  48. Petljak, M. et al. Mechanisms of APOBEC3 mutagenesis in human cancer cells. Nature 607, 799–807 (2022).

  49. Findlay, J. M. et al. Differential clonal evolution in oesophageal cancers in response to neo-adjuvant chemotherapy. Nat. Commun. 7, 11111 (2016).

  50. Boot, A. et al. In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors. Genome Res. 28, 654–665 (2018).

  51. Nguyen, D. D. et al. The interplay of mutagenesis and ecDNA shapes urothelial cancer evolution. Nature 635, 219–228 (2024).

  52. Jiang, H., Lei, R., Ding, S.-W. & Zhu, S. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15, 182 (2014).

  53. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  54. Novocraft. NovoSort. A multi-threaded sort/merge for BAM files. https://www.novocraft.com/documentation/novosort-2/

  55. Pedersen, B. S. & Quinlan, A. R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34, 867–868 (2018).

  56. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10–12 (2011).

  57. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  58. Lai, D., Ha, G. & Shah, S. HMMcopy: copy number prediction with correction for GC and mappability bias for HTS data. Bioconductor version: release (3.15). https://doi.org/10.18129/B9.bioc.HMMcopy (2022).

  59. Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE Blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).

  60. Kent, W. J. et al. The Human Genome Browser at UCSC. Genome Res. 12, 996–1006 (2002).

  61. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  62. Rosenthal, R., McGranahan, N., Herrero, J., Taylor, B. S. & Swanton, C. deconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 17, 31 (2016).

Acknowledgements

We thank the participants and their families for contributing plasma and tissue for this study. We also thank H. R. He at Weill Cornell, J. Park and all members of the laboratory of D.A.L., the New York Genome Center computational biology team, especially M. Shah, and the New York Genome Center research sequencing laboratory for thoughtful discussions throughout this work. This work was supported by the Mark Foundation Emerging Leader Award, the Vallee Scholar Award, the Burroughs Wellcome Fund Career Award for Medical Scientists, a National Cancer Institute R01 grant (R01-CA266619-01) and the Melanoma Research Alliance Established Investigator Award (D.A.L.). A.P.C. received support from the American Cancer Society Postdoctoral Fellowship program. Memorial Sloan Kettering Cancer Center investigators are supported by Cancer Center Support Grant P30 CA08748 from the National Institutes of Health/National Cancer Institute. A.J.W. received support from the Conquer Cancer Foundation Young Investigator Award, the Melanoma Research Alliance Young Investigator Award and the NCI K08 Mentored Career Scientist Award (K08 CA263301-03). D.A.L. is a Scholar of the Leukemia and Lymphoma Society. This work was made possible by the MacMillan Family Foundation and the MacMillan Center for the Study of the Non-Coding Cancer Genome at the New York Genome Center. The opinions, results and conclusions reported in this paper are those of the authors and are independent from these funding sources.

Author information

Contributions

D.A.L., A.P.C., A.J.W., G.B. and B.M.F. conceived and designed the project. D.A.L., G.B. and B.M.F. served as lead principal investigators. A.J.W., A.I.A., M.S.M., A. Saxena, M.K.C., D.T.F., L.S., M.S., J.M., A.K., S.T., D.W., J.O., O.E., J.M.M., N.K.A., J.D.W., M.A.P., G.B. and B.M.F. performed participant selection, curated participant data and prepared samples for sequencing. G.I. provided mouse PDX samples. M.M., D.M., A.H., R.F., J.M., Z.S. and L.W. performed library preparation and sequencing. A.P.C., A.A., I.R., A. Sossin, S.R., N.M., W.F.H., R.M.M., D.H., T.L., H.C., S.G., M.C.Z., N.R., Y.W., A.J. and D.L. performed data analysis. A.P.C., C.P. and D.A.L. wrote the manuscript with comments and contributions from all authors.

Corresponding authors

Correspondence to Alexandre Pellan Cheng or Dan A. Landau.

Ethics declarations

Competing interests

A.P.C. and D.A.L. have filed a provisional patent regarding certain aspects of this manuscript. D.A.L. and A.J.W. have also filed two additional patent applications regarding work presented in this manuscript. A.P.C. is listed as an inventor on submitted patents pertaining to cell-free DNA (US patent applications 63/237,367, 63/056,249, 63/015,095 and 16/500,929) and receives consulting fees from Eurofins Viracor and has received conference travel support from Ultima Genomics. I.R. and A.J. are employees and shareholders of Ultima Genomics. D.L. is a shareholder of Ultima Genomics. G.I. has received consulting fees from Daiichi Sankyo. J.D.W. is a consultant for Apricity, Ascentage Pharma, Bicara Therapeutics, Bristol Myers Squibb, Daiichi Sankyo, Dragonfly, Imvaq, Larkspur, Psioxus, Takeda, Tizona, Trishula Therapeutics, Immunocore – Data Safety board and Scancell; reports grant and research support from Bristol Myers Squibb and Enterome; has equity in Apricity, Arsenal IO/Cell Carta, Ascentage, Imvaq, Linneaus, Georgiamune, Takeda, Tizona Pharmaceuticals and Xenimmune; and is an inventor on the following patents: Xenogeneic DNA Vaccines; Newcastle Disease viruses for Cancer Therapy; Myeloid-derived suppressor cell (MDSC) assay; Prediction of Responsiveness to Treatment with Immunomodulatory Therapeutics and Method of Monitoring Abscopal Effects during such Treatment; Anti-PD1 Antibody; Anti-CTLA4 antibodies; Anti-GITR antibodies and methods of use thereof; CD40 binding molecules and uses thereof. A. Saxena receives research funding from AstraZeneca, has served on Advisory Boards for G1 Therapeutics, Boehringer Ingelheim, Novocure, InxMed, Bristol Myers Squibb and Galvanize Therapeutics, and as a consultant for Galvanize Therapeutics. M.A.P. has received consulting fees from Bristol Myers Squibb, Merck, Novartis, Eisai, Pfizer, Lyvgen and Chugai and has received institutional support from RGenix, Merck Infinity, Bristol Myers Squibb, Merck and Novartis. M.K.C. has received consulting fees from Bristol Myers Squibb, Merck, InCyte, Moderna, ImmunoCore and AstraZeneca and receives institutional support from Bristol Myers Squibb. S.T. is funded by Cancer Research UK (grant reference number A29911); the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC10988), the UK Medical Research Council (FC10988) and the Wellcome Trust (FC10988); the National Institute for Health Research Biomedical Research Centre at the Royal Marsden Hospital and Institute of Cancer Research (grant reference number A109), the Royal Marsden Cancer Charity, The Rosetrees Trust (grant reference number A2204), Ventana Medical Systems (grant reference numbers 10467 and 10530), the National Institute of Health (U01 CA247439) and Melanoma Research Alliance (686061). S.T. has received speaking fees from Roche, AstraZeneca, Novartis and Ipsen. S.T. has the following patents filed: Indel mutations as a therapeutic target and predictive biomarker PCTGB2018/051892 and PCTGB2018/051893. G.B. has sponsored research agreements through her institution with Olink Proteomics, Teiko Bio, InterVenn Biosciences and Palleon Pharmaceuticals; served on advisory boards for Iovance, Merck, Nektar Therapeutics, Novartis and Ankyra Therapeutics; consulted for Merck, InterVenn Biosciences and Ankyra Therapeutics and holds equity in Ankyra Therapeutics. B.M.F. is on the advisory boards for Astrin Bioscience, Natera, Guardant, Janssen, Gilead, Merck, Immunomedics and QED Therapeutics, is a consultant for QED Therapeutics, Astra Biosciences and BostonGene and obtains patent royalties from Immunomedics and Gilead, honoraria from Urotoday and Axiom Healthcare Strategies and research support from Eli Lilly. B.M.F. reports support from the NIH, DoD-CDMRP, Starr Cancer Consortium and the P-1000 Consortium. D.A.L. is on the Scientific Advisory Board of Mission Bio, Pangea, Alethiomics and Veracyte, and has received prior research funding support from Illumina, Ultima Genomics, Celgene, 10x Genomics and Oxford Nanopore Technologies. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Andrew Lawson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Lei Tang and Hui Hua, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Ultima and Illumina sequencing datasets of human-mapped reads in mouse PDX datasets (n = 3).

A Homopolymer size estimation of bases between two PCR duplicates (all samples combined) in Ultima datasets. B Homopolymer size estimation of bases between a read and the aligned reference (all samples combined) in Ultima datasets. C Homopolymer size estimation of bases between two PCR duplicates (all samples combined) in Illumina datasets. D Homopolymer size estimation of bases between a read and the aligned reference (all samples combined) in Illumina datasets. E Indel calling accuracy by PCR duplicate family sizes in Ultima datasets (n = 3 in each boxplot). F Indel calling accuracy of Illumina sequencing reads (for single family reads, n = 3 in each boxplot). G Frequency of homopolymer sizes across the human genome. For boxplots in (E) and (F), the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR. Accuracy in (E) and (F) is defined as the number of correct homopolymer assignments in individual sequencing reads divided by the occurrences of that homopolymer size in the human genome in all sequenced reads.
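The accuracy definition at the end of this caption can be made concrete with a short sketch. The function name, the input format (one pair of true and called homopolymer lengths per observation in a read) and the toy numbers are illustrative assumptions, not the authors' pipeline.

```python
from collections import defaultdict

def homopolymer_accuracy(calls):
    """Per-size accuracy from (true_length, called_length) pairs.

    Hypothetical input: one pair per homopolymer observed in a sequencing read.
    Accuracy for size k = correct calls of size k / all observations of size k.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for true_len, called_len in calls:
        total[true_len] += 1
        if called_len == true_len:
            correct[true_len] += 1
    return {size: correct[size] / total[size] for size in sorted(total)}

# Toy observations, made up purely for illustration
toy_calls = [(1, 1), (1, 1), (3, 3), (3, 2), (8, 8), (8, 7), (8, 9), (8, 8)]
accuracy = homopolymer_accuracy(toy_calls)
```

In this toy input, every 1-mer is called correctly, while half of the 3-mers and half of the 8-mers are miscalled.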

Extended Data Fig. 2 Flow-based sequencing provides predictable error-robust motifs.

A Single-nucleotide variant analysis of matched Ultima and Illumina sequencing datasets across 96 trinucleotide contexts. Cycle shift motifs (described in B) are indicated by plus signs. B Left: Example sequencing of a TGC trinucleotide in flowspace. Given a flow order of T > G > C > A, one full flow cycle of each nucleotide should provide a 1 > 1 > 1 > 0 signal. Top, right: Example of how a T[G > A]C alt disrupts the cycles in flow space basecalling. Two sequencing cycles are required to fully resolve a TAC sequencing motif. We refer to these types of motifs as cycle shift motifs. Bottom, right: Example of how a T[G > C]C variant does not affect the cycles of flow space basecalling. C Error rates in Ultima and Illumina sequencing datasets for trinucleotide variants that alter the flowspace sequencing cycle (n = 120 in the cycle shift motif boxplots (blue), corresponding to the 40 trinucleotide variants that are classified as cycle shift motifs across 3 mouse PDX plasma samples. n = 168 in the non cycle shift motif boxplots (red), corresponding to the 56 trinucleotide variants that are not classified as cycle shift motifs across 3 mouse PDX plasma samples). P-values were measured using a two-sided Wilcoxon test. Error bars in (A) represent the standard error of the mean. For boxplots in (C), the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR.
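The flowspace examples in panel B can be reproduced with a minimal sketch. It assumes the T > G > C > A flow order given in the caption, always starts at a T flow, and considers one strand only; the function names are hypothetical and the classification rule (a change in the number of flow cycles) is a simplified reading of the caption, so it may not match the authors' full motif classification.

```python
FLOW_ORDER = "TGCA"  # flow order stated in the caption: T > G > C > A

def flow_signal(seq, flow_order=FLOW_ORDER):
    """Per-flow incorporation counts for seq, padded to whole flow cycles."""
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base outside the flow order")
    signal, i, flow = [], 0, 0
    while i < len(seq):
        base = flow_order[flow % len(flow_order)]
        run = 0
        # A flow incorporates the whole homopolymer run of its nucleotide
        while i < len(seq) and seq[i] == base:
            run += 1
            i += 1
        signal.append(run)
        flow += 1
    # Pad with empty flows so the signal covers complete cycles
    while len(signal) % len(flow_order):
        signal.append(0)
    return signal

def is_cycle_shift(trinucleotide, alt):
    """True if substituting the middle base changes the number of flow cycles."""
    ref_signal = flow_signal(trinucleotide)
    alt_signal = flow_signal(trinucleotide[0] + alt + trinucleotide[2])
    return len(ref_signal) != len(alt_signal)

tgc = flow_signal("TGC")  # one cycle: [1, 1, 1, 0]
tac = flow_signal("TAC")  # two cycles: [1, 0, 0, 1, 0, 0, 1, 0]
tcc = flow_signal("TCC")  # one cycle: [1, 0, 2, 0]
```

Under these assumptions, T[G > A]C needs a second cycle and is flagged as a cycle shift, whereas T[G > C]C stays within one cycle.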

Extended Data Fig. 3 Tradeoffs between deep-targeted sequencing and modest whole-genome sequencing for ctDNA detection.

A Mutational burden (number of SNVs) of 22 cancer types retrieved from the Pan Cancer Analysis of Whole Genomes consortium. The numbers along the x-axis represent the number of tumors analyzed per cancer type. B Median ctDNA detection opportunities using a whole-genome approach with 10x sequencing coverage, a 10-target panel at 10,000x coverage and a 1-target panel at 10,000x coverage. The pink shaded area represents tumor types for which targeting only a few sites may offer benefit over whole-genome sequencing. The blue shaded area represents tumor types for which a whole-genome approach will offer more opportunities to detect ctDNA over targeted panels. The lower and upper ends of the boxplots in (A) represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR.
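The tradeoff in panel B can be illustrated with simple arithmetic. The sketch below assumes that "detection opportunities" scale as the number of interrogated mutated sites multiplied by sequencing coverage; this is one plausible reading of the caption rather than the authors' stated formula, and the function names are hypothetical.

```python
def detection_opportunities(n_sites, coverage):
    """Assumed model: each mutated site contributes `coverage` chances to observe ctDNA."""
    return n_sites * coverage

def breakeven_mutations(panel_sites, panel_coverage, wgs_coverage):
    """Tumor mutation count at which WGS matches the panel under the assumed model."""
    return panel_sites * panel_coverage / wgs_coverage

# Scenarios named in the caption: 10x WGS versus 10-target and 1-target panels at 10,000x
ten_target = breakeven_mutations(panel_sites=10, panel_coverage=10_000, wgs_coverage=10)
one_target = breakeven_mutations(panel_sites=1, panel_coverage=10_000, wgs_coverage=10)
```

Under this simplified model, 10x WGS matches a 10-target panel once a tumor carries about 10,000 SNVs and matches a 1-target panel at about 1,000 SNVs; tumors above those burdens would favor the whole-genome approach.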

Extended Data Fig. 4 Circulating tumor DNA cost and coverage analysis between Illumina and Ultima sequencing in a matched sample.

Areas under the curve (AUCs) are measured by calculating the area under a receiver operating characteristic curve comparing a given group (for example, Illumina 20x at 10^−6 expected tumor fraction) to its platform and coverage-matched cancer-free control (for example, Illumina 20x, expected tumor fraction of 0). All AUCs at expected tumor fractions of 10^−4 and greater were 1.00. Z-scores of a given sample are calculated against their coverage and platform matched cancer-free control (expected tumor fraction of 0).
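The two statistics described in this caption can be sketched as follows. The per-replicate "scores" are hypothetical detection values, the toy numbers are made up, and the use of the sample standard deviation is an assumption; the caption does not specify these details.

```python
from statistics import mean, stdev

def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def z_scores(sample_scores, control_scores):
    """Z-score of each sample against the matched cancer-free control distribution."""
    mu = mean(control_scores)
    sigma = stdev(control_scores)  # sample standard deviation (assumption)
    return [(score - mu) / sigma for score in sample_scores]

# Toy values for one platform/coverage group and its matched control
controls = [1.0, 2.0, 3.0]
cases = [4.0, 5.0, 2.5]
auc = roc_auc(cases, controls)
zs = z_scores(cases, controls)
```

The pairwise formulation is equivalent to the area under the empirical ROC curve and avoids any external dependency; for large replicate counts a rank-based implementation would be faster.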

Extended Data Fig. 5 Variant allele frequencies for variants across denoising approaches.

Variant allele frequencies (calculated using unfiltered sequencing reads) in positions where a variant was found using UMI-agnostic denoised reads, single-strand corrected reads and duplex corrected reads. Allele frequencies of 0.2 and below are colored in red.

Extended Data Fig. 6 Comparison of detected UV-derived mutations using duplex, single-strand and UMI-agnostic denoising methods.

A Cosine similarities by cancer stage at baseline timepoints (pre-treatment or pre-surgery) for UV and CH-associated signatures. B Comparison of duplex, single-strand and UMI-agnostic denoising methods to detect melanoma-associated variants using a single-read variant calling pipeline for pre-treatment plasma samples from melanoma patients (top) and cancer-free controls (bottom). P-values were measured using a two-sided Wilcoxon test. For all boxplots, the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR.
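Panel A reports cosine similarities between observed mutation profiles and reference signatures. A minimal sketch of that measure is given below; the four-channel toy vectors stand in for 96-channel trinucleotide spectra and are made up for illustration, and the function name is hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length non-negative count vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of channels")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty profiles
    return dot / (norm_u * norm_v)

# Toy profiles: same shape at different depth, and a profile with no shared channels
observed = [3, 0, 4, 0]
same_shape = [6, 0, 8, 0]
disjoint = [0, 1, 0, 0]
sim_same = cosine_similarity(observed, same_shape)
sim_disjoint = cosine_similarity(observed, disjoint)
```

Because the measure depends only on the shape of the profile, a sample with few detected mutations can still score close to 1 if their trinucleotide distribution matches the signature.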

Extended Data Fig. 7 Tumor-agnostic copy-number based tumor fraction estimation in stage III and IV melanoma and cancer-free control samples.

Samples include cancer-free controls (n = 10); stage III melanoma (pre-surgery; n = 10) and stage IV melanoma (pre-treatment; n = 4). Dotted line at 0.03 represents the limit of detection of ichorCNA. For boxplots, the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median.

Extended Data Fig. 8 ctDNA dynamics throughout treatment in melanoma patients.

A Changes in circulating tumor DNA (increase or decrease) relative to the earliest sampled timepoint. Solid lines represent patients with recurrence or progressive disease, and dashed lines represent patients with either partial response or who are recurrence-free following treatment. Closed and open circles represent samples with and without detected ctDNA, respectively. B Difference in ctDNA relative to the pre-treatment timepoint stratified by clinical outcome. One sample did not have a pre-treatment timepoint available (MEL-15; progressive disease) and so a day 9 post-treatment time point was used as baseline. For boxplots in (B), the lower and upper ends of boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers represent at most 1.5 times the IQR. P-values were calculated using a two-sided Wilcoxon test.

Extended Data Fig. 9 Major signature contributions from urothelial cancer patients’ tumors measured through whole-genome sequencing.

Top: total mutation counts per sequenced tumor. Bottom: signature contributions. Trinucleotide frequencies were fit to the entire COSMIC database (v.3.3). When a patient had two or more tumors (B01, B04, B15, B16, B17, B18, B19), we measured signature contributions of mutations that were present in two or more tumors and thereby likely reflect mutations that arise earlier in tumor evolution.

Supplementary information

Supplementary Information

Supplementary Note, Supplementary Figs. 1–12 and references.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–11.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cheng, A.P., Widman, A.J., Arora, A. et al. Error-corrected flow-based sequencing at whole-genome scale and its application to circulating cell-free DNA profiling. Nat Methods 22, 973–981 (2025). https://doi.org/10.1038/s41592-025-02648-9

  • DOI: https://doi.org/10.1038/s41592-025-02648-9
