Extended Data Fig. 2: Flow-based sequencing provides predictable error-robust motifs.

A Single-nucleotide variant analysis of matched Ultima and Illumina sequencing datasets across 96 trinucleotide contexts. Cycle shift motifs (described in B) are indicated by plus signs. B Left: Example of sequencing a TGC trinucleotide in flow space. Given a flow order of T > G > C > A, one full flow cycle should yield a 1 > 1 > 1 > 0 signal. Top right: Example of how a T[G > A]C variant disrupts the cycles in flow-space base calling: two sequencing cycles are required to fully resolve the resulting TAC motif. We refer to motifs of this type as cycle shift motifs. Bottom right: Example of a T[G > C]C variant, which does not affect the cycles in flow-space base calling. C Error rates in Ultima and Illumina sequencing datasets for trinucleotide variants that do (blue) or do not (red) alter the flow-space sequencing cycle (n = 120 for the cycle shift motif boxplots, corresponding to the 40 trinucleotide variants classified as cycle shift motifs across 3 mouse PDX plasma samples; n = 168 for the non-cycle shift motif boxplots, corresponding to the 56 trinucleotide variants not classified as cycle shift motifs across 3 mouse PDX plasma samples). P-values were calculated using a two-sided Wilcoxon test. Error bars in (A) represent the standard error of the mean. For boxplots in (C), the lower and upper ends of the boxes represent the 25th and 75th percentiles of the data, respectively, and the horizontal lines represent the median. The whiskers extend to at most 1.5 times the interquartile range (IQR).
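The flow-space examples in panel B can be reproduced with a short sketch. This is an illustration only, not the authors' classification pipeline: the function names `flow_signal`, `n_cycles`, and `changes_cycle_count` are hypothetical, and the sketch assumes that the read starts at the first flow of a cycle and that the trinucleotide is considered in isolation, without flanking sequence. Under those assumptions it is not expected to reproduce the full set of 40 cycle shift motifs.

```python
def flow_signal(seq, flow_order="TGCA"):
    """Return per-flow incorporation counts for seq, padded to whole cycles.

    Assumption: sequencing starts at the first flow of flow_order.
    """
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base absent from the flow order")
    signal = []
    pos = 0   # next unread position in seq
    flow = 0  # index of the current flow
    while pos < len(seq):
        base = flow_order[flow % len(flow_order)]
        count = 0
        # A single flow incorporates the whole homopolymer run of its base.
        while pos < len(seq) and seq[pos] == base:
            count += 1
            pos += 1
        signal.append(count)
        flow += 1
    # Pad the last, partially used cycle with zero-signal flows.
    while len(signal) % len(flow_order) != 0:
        signal.append(0)
    return signal


def n_cycles(seq, flow_order="TGCA"):
    """Number of full flow cycles needed to read seq."""
    return len(flow_signal(seq, flow_order)) // len(flow_order)


def changes_cycle_count(ref, alt, flow_order="TGCA"):
    """True if the variant changes the number of flow cycles (simplified criterion)."""
    return n_cycles(ref, flow_order) != n_cycles(alt, flow_order)


print(flow_signal("TGC"))  # reference trinucleotide from panel B
print(flow_signal("TAC"))  # T[G > A]C
print(flow_signal("TCC"))  # T[G > C]C
```

In this simplified model, TGC and TCC are each read within a single flow cycle, whereas TAC spills into a second cycle, which matches the behavior described for panel B.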