Supplementary Figure 6: Comparison of three replicates versus a single replicate and of RNA-seq versus RNAtag-seq.

(a) Scatterplots comparing WT-like strains from each of the three major subclades (10 SC1A, 22 SC1B, 14 SC2A) using triplicates versus one randomly selected replicate from the 50-strain data. The presence of three biological replicates in the 50-strain data allowed us to simulate comparisons of averaged normalized counts when three replicates versus one replicate were used. A strong correlation (r = 0.99) was observed for each triplicate- versus single-replicate comparison. The Pearson correlation coefficient (r) is shown for each comparison; n is the number of samples (number of strains multiplied by number of replicates). (b) Seven strains were processed using two protocols: RNA-seq (three biological replicates per strain) and RNAtag-seq (singletons, that is, single replicates). In the principal component analysis of these seven strains, the RNA-seq samples (three cyan spheres in the PCA plot) and the RNAtag-seq samples (single red sphere in the PCA plot) cluster together. The expression profiles of the seven strains are circled and numbered 1 through 7 in the PCA plot. Strains analyzed: 1-MGAS7888, 2-MGAS29284, 3-MGAS29553, 4-MGAS28746, 5-MGAS7914, 6-MGAS28647, and 7-MGAS28686. (c) Scatterplots of the log-transformed normalized counts from the same seven strains processed using the two protocols (RNA-seq and RNAtag-seq). For each strain, normalized transcript counts were averaged over the three biological replicates (RNA-seq protocol) and compared with the RNAtag-seq normalized counts (singleton strain samples). The Pearson correlation coefficient (r) is shown for each comparison.
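
The triplicate- versus single-replicate comparison in panel (a) can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the figure: the function name `triplicate_vs_single`, the array layout (genes x strains x replicates), the log2 transform with a pseudocount of 1, and the synthetic counts are all assumptions made for illustration.

```python
import numpy as np


def triplicate_vs_single(counts, rng):
    """Compare averaged counts from all replicates with one random replicate.

    counts: normalized counts, shape (genes, strains, replicates).
            This layout is an assumption of the sketch.
    Returns the two log-transformed per-gene averages and their Pearson r.
    """
    n_genes, n_strains, n_reps = counts.shape

    # Average over every strain and every replicate (n = strains * replicates).
    all_reps_mean = counts.mean(axis=(1, 2))

    # Pick one replicate at random for each strain, then average over strains.
    pick = rng.integers(0, n_reps, size=n_strains)
    single = counts[:, np.arange(n_strains), pick]  # shape (genes, strains)
    single_mean = single.mean(axis=1)

    # Log-transform; the pseudocount of 1 is an assumption to avoid log(0).
    x = np.log2(all_reps_mean + 1.0)
    y = np.log2(single_mean + 1.0)

    # Pearson correlation coefficient between the two per-gene profiles.
    r = np.corrcoef(x, y)[0, 1]
    return x, y, r


# Synthetic data only: 500 hypothetical genes, 10 strains, 3 replicates.
rng = np.random.default_rng(0)
gene_level = rng.lognormal(mean=5.0, sigma=2.0, size=(500, 1, 1))
replicate_noise = rng.lognormal(mean=0.0, sigma=0.1, size=(500, 10, 3))
synthetic_counts = gene_level * replicate_noise

x, y, r = triplicate_vs_single(synthetic_counts, rng)
print(f"Pearson r (synthetic data): {r:.3f}")
```

The same averaging and correlation steps apply to panel (c) if the single randomly selected replicate is replaced by the RNAtag-seq singleton counts for a given strain.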