Extended Data Fig. 1: Sequencing metrics and tumor fraction estimation.

a. DNA yield, median coverage, and tumor fraction (TF) estimates derived from targeted sequencing. Numbers of independent samples per category are annotated in brackets. b. Hierarchical approach for TF estimation. Mutations deemed ineligible for TF estimation in Group 1 (that is, mutations not shared across most same-patient samples and therefore potentially subclonal) can underestimate TF compared to the heterozygous single-nucleotide polymorphism (SNP) allele frequency applied in Group 2 (middle scatter plot). c. Comparison of TF by targeted (left) and whole-exome sequencing (WES; right) versus pathology tumor cellularity (TC; n = 521). Bars show the fraction of samples whose pathology-derived estimate is higher than the sequencing-derived TF estimate. d. Sequencing-derived TF estimates versus median targeted sequencing coverage (all tissue samples; n = 523). e. Median coverage is high in all ISUP (International Society of Urological Pathology) grade 5 samples with a sequencing TF of 0%, indicating that sequencing insufficiencies do not explain low TF results. f. Sequencing-derived TF estimates for each tissue sample (n values are annotated), grouped by patient along the x-axis. ISUP high and low refer to primary grade ≥4 and <4, respectively. Asterisks denote patients who received preoperative androgen-deprivation therapy (ADT). g. Variant allele frequencies of somatic mutations by targeted sequencing and WES are highly correlated. Each dot represents a mutation detected via either sequencing method with jointly sufficient locus-specific depth (n = 234). Mutations with <30x coverage by either method are excluded. h. Coverage log ratio (LR) correlation between the 73-gene targeted panel and WES (in samples sequenced with both modalities; n = 152). Each dot (n = 11271) represents a gene shared across panels. i. Few somatic mutations detected solely by WES are within established cancer genes.
Of 7757 mutations detected in genes not covered by the targeted sequencing panel, only six were pathogenic coding mutations in key cancer driver genes (moving left to right, each subsequent bar is a subset of the previous bar). cfDNA: cell-free DNA, FFPE: formalin-fixed paraffin-embedded, MLN: metastatic lymph node, PB: prostate biopsy, RP: radical prostatectomy.
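
The depth filter and correlation described for panel g can be illustrated with a minimal sketch. It assumes each mutation is a record carrying a variant allele frequency and a locus-specific depth from both assays. The field names, the helper functions, and the toy values are hypothetical and are not taken from the study; only the 30x threshold comes from the legend, and the use of a Pearson coefficient is an assumption, since the legend does not name the correlation statistic.

```python
from math import sqrt

MIN_DEPTH = 30  # panel g: mutations with <30x coverage by either method are excluded


def jointly_covered(mutations, min_depth=MIN_DEPTH):
    """Keep mutations whose depth reaches min_depth in BOTH assays."""
    return [
        m for m in mutations
        if m["panel_depth"] >= min_depth and m["wes_depth"] >= min_depth
    ]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient (assumed statistic)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def vaf_correlation(mutations):
    """Return (correlation, number of mutations retained after filtering)."""
    kept = jointly_covered(mutations)
    panel_vaf = [m["panel_vaf"] for m in kept]
    wes_vaf = [m["wes_vaf"] for m in kept]
    return pearson(panel_vaf, wes_vaf), len(kept)


# Toy records, invented purely for illustration (not study data).
toy_mutations = [
    {"panel_vaf": 0.10, "wes_vaf": 0.12, "panel_depth": 500, "wes_depth": 80},
    {"panel_vaf": 0.30, "wes_vaf": 0.32, "panel_depth": 800, "wes_depth": 120},
    {"panel_vaf": 0.50, "wes_vaf": 0.52, "panel_depth": 650, "wes_depth": 95},
    {"panel_vaf": 0.40, "wes_vaf": 0.05, "panel_depth": 700, "wes_depth": 12},   # dropped: WES depth < 30
    {"panel_vaf": 0.02, "wes_vaf": 0.45, "panel_depth": 25, "wes_depth": 150},   # dropped: panel depth < 30
]

r, n_kept = vaf_correlation(toy_mutations)
print(f"retained {n_kept} of {len(toy_mutations)} mutations, r = {r:.3f}")
```

Filtering on both depths before computing the correlation keeps low-coverage loci, where allele frequencies are noisy, from dominating the comparison between assays.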