Fig. 2: Comparison of file sizes of SARS-CoV-2 genome alignments using different alignment formats.

On the y axis on a logarithmic scale, we show the sizes of alignment files for each format considered, expressed in bytes. On the x axis is the number of sequences in the dataset considered on a logarithmic scale. Here we consider random subsamples of our real SARS-COV-2 alignment data. Violin plots (often variation within one plot is not visible, collapsing the violin plots into horizontal lines) summarize values for 20 replicates, and dots represent their mean.