Extended Data Fig. 1: Ultima and Illumina sequencing datasets of human-mapped reads in mouse PDX datasets (n = 3).

A Homopolymer size estimates compared between two PCR duplicates (all samples combined) in Ultima datasets. B Homopolymer size estimates compared between a read and the aligned reference (all samples combined) in Ultima datasets. C Homopolymer size estimates compared between two PCR duplicates (all samples combined) in Illumina datasets. D Homopolymer size estimates compared between a read and the aligned reference (all samples combined) in Illumina datasets. E Indel calling accuracy by PCR duplicate family size in Ultima datasets (n = 3 in each boxplot). F Indel calling accuracy of Illumina sequencing reads (for single-family reads, n = 3 in each boxplot). G Frequency of homopolymer sizes across the human genome. For boxplots in (E) and (F), the lower and upper ends of the boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers extend to at most 1.5 times the interquartile range (IQR). Accuracy in (E) and (F) is defined, for each homopolymer size, as the number of correct homopolymer assignments in individual sequencing reads divided by the number of occurrences of that homopolymer size in the human genome across all sequenced reads.
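
The accuracy definition used in (E) and (F) and the homopolymer size frequency shown in (G) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names and the input format, a list of (reference size, called size) pairs with one pair per homopolymer observed in a read, are assumptions made for illustration, and the denominator is read here as the number of times a reference homopolymer of a given size is covered by a sequenced read.

```python
from collections import Counter
from itertools import groupby


def homopolymer_runs(seq):
    """Yield (base, run_length) for each maximal run of identical bases."""
    for base, group in groupby(seq.upper()):
        yield base, sum(1 for _ in group)


def homopolymer_size_frequency(seq):
    """Count how often each homopolymer size occurs in a sequence (cf. panel G)."""
    return Counter(length for _, length in homopolymer_runs(seq))


def accuracy_by_size(calls):
    """Per-size accuracy (cf. panels E and F).

    `calls` is a hypothetical input: an iterable of (reference_size, called_size)
    pairs, one pair per reference homopolymer covered by a read.
    Returns {reference_size: correct_calls / total_observations}.
    """
    total = Counter()
    correct = Counter()
    for reference_size, called_size in calls:
        total[reference_size] += 1
        if called_size == reference_size:
            correct[reference_size] += 1
    return {size: correct[size] / total[size] for size in total}
```

For example, `homopolymer_size_frequency("AAACGGT")` counts two runs of size 1, one of size 2, and one of size 3, and `accuracy_by_size([(3, 3), (3, 2), (1, 1), (3, 3)])` gives an accuracy of 2/3 for size 3 and 1.0 for size 1.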