Fig. 3: Genomic distribution of short reads overlapping multiple mutually exclusive defining mutations and of the recombination breakpoints they indicate.

a The relationship between defining mutation density and the density of reads overlapping defining mutations of multiple parental strains. All overlapping reads of the 118 analysed samples were considered for this figure. The location of an overlapping read is defined as the midpoint of the mutually exclusive defining mutations it overlaps. R indicates the Pearson correlation coefficient, and the corresponding p-value is derived from a two-sided t-test (n = 11 genes). The dashed black line is a least-squares (linear) regression with the grey shaded area marking the 95% confidence interval. b The relationship between the number of recombination breakpoints indicated by overlapping reads and the number of breakpoints identified by Turakhia et al. (with the RIPPLES software) in consensus sequences (ref. 37). Each point represents a 500 bp region of the genome. R indicates the Pearson correlation coefficient, and the corresponding p-value is derived from a two-sided t-test (n = 60 genomic regions). The dashed black line is a least-squares (linear) regression with the grey shaded area marking the 95% confidence interval. c The average percentage of genomic positions (per sample) for which the ratio of recombinant reads to all overlapping reads reaches the threshold T. Genomic positions with exactly zero recombinant reads are not shown. Samples were categorized into groups of Delta – Omicron (BA.1) artificial/real and non-Delta – Omicron (BA.1) artificial/real samples. The vertical black line indicates T = 0.1. d Ratio of duplicate reads among recombinant reads for genomic positions with a recombinant read ratio lower (n = 30,491 genomic positions across 87 samples) vs. higher (n = 22,378 genomic positions across 87 samples) than 0.1. Genomic positions with exactly zero recombinant reads are not shown. Violin plots and overlaid box plots show the same distributions; box edges represent the first (Q1) and third (Q3) quartiles, with the inner line showing the median value. Whiskers extend to 1.5 times the interquartile range (IQR = Q3–Q1).
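
As a minimal sketch of the statistic reported in panels a and b, the Pearson correlation coefficient and the t statistic underlying the two-sided test can be computed as below. The function names and the toy input are illustrative assumptions, not the authors' code or data. The t statistic has n − 2 degrees of freedom (9 for n = 11 genes, 58 for n = 60 genomic regions); the p-value would be read from the corresponding Student's t distribution, which this sketch does not compute.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    denom = 1.0 - r * r
    if denom <= 0.0:
        # Perfect correlation: the statistic diverges.
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)


# Hypothetical toy values (not taken from the figure).
density = [1, 2, 3, 4, 5]
reads = [2, 4, 5, 4, 5]
r = pearson_r(density, reads)
t = t_statistic(r, len(density))
```

A two-sided p-value is then the probability that a t-distributed variable with n − 2 degrees of freedom exceeds |t| in absolute value.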