Supplementary Figure 6: A representative diagnostics report for an amplification batch of mouse ES cells.

(A) Sequencing depth per cell. Shown is the cumulative cell percentage (y-axis) vs. the total number of reads per cell (x-axis). (B) Mapping analysis. Shown are the fractions of reads per cell that mapped to exonic loci, mapped to spike-ins, or remained unmapped due to multiple mapping or a low MAPQ score. Cells are ordered according to gene mapping fractions. (C) Oligo contamination gauge. Shown are the fractions of RT primer, poly-A and other oligo sequences within the pool of unmapped reads. (D) UMI nucleotide composition. Nucleotide composition (y-axis) for all UMI positions (x-axis). (E-G) Error distributions. Cumulative cell percentage (y-axis) vs. the fraction of molecules that were filtered (x-axis) due to sequencing errors in the UMI (E), sequencing errors in the cell barcode (F) or template switching errors (G). (H) Negative control wells. The number of unique UMIs mapped to genes (blue) and spike-ins (red) that have a cell barcode associated with one of four negative control wells. (I) Molecular yield vs. technical efficiency. Shown is the number of detected mouse mRNA molecules (y-axis) vs. the number of detected spike-in molecules (x-axis) in four wells following sorting of single cells (black dots) and in four negative control wells that do not contain single cells (red markers). This visualization highlights potential problems of background noise and failed sorting. (J) Proportion of molecules with a single IVT product. The percentage of detected molecules with a single IVT product (single offset; y-axis) vs. the total number of detected molecules (x-axis) per cell (black dots) or in empty wells (red markers). (K-L) Plate visualization. Normalized number of extracted gene (K) or spike-in (L) molecules (blue: low, red: high) in wells (ordered according to their physical plate positions) that were pooled and amplified together (a single amplification batch). This visualization allows identification of sorting- or robot-related problems. (M-N) Molecules per cell. Shown is the cumulative cell percentage (y-axis) vs. the total number of detected gene (M) or spike-in (N) molecules per cell (x-axis). (O) IVT products per molecule. A histogram of the number of IVT products (y-axis, logarithmic scale) per UMI. (P) Reads per molecule. A histogram of the number of reads (y-axis, logarithmic scale) per UMI. (Q) Highly expressed genes. The average number of detected molecules (log2, y-axis) for the 25 genes (x-axis) with the highest expression levels. (R) Highly variable genes. The variance of the number of detected molecules divided by the average number of detected molecules (log2, y-axis) for the 25 genes (x-axis) with the highest variance/mean scores.
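
As an illustration of the variance/mean score described for panel R, the following is a minimal sketch and not the pipeline's actual code. The function name `top_variable_genes`, the dictionary layout of the input, and the use of the population variance are all assumptions made for the example; the caption does not specify which variance estimator was used.

```python
import math
from statistics import mean, pvariance


def top_variable_genes(counts, n_top=25):
    """Rank genes by log2(variance / mean) of their per-cell molecule counts.

    counts: dict mapping gene name -> list of molecule counts, one per cell
            (hypothetical layout chosen for this sketch).
    Returns a list of (gene, score) pairs, highest score first.
    """
    scores = {}
    for gene, per_cell in counts.items():
        mu = mean(per_cell)
        var = pvariance(per_cell)  # assumption: population variance
        # log2 is undefined when the mean or the variance is zero; skip those genes
        if mu == 0 or var == 0:
            continue
        scores[gene] = math.log2(var / mu)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n_top]


# Toy input: three hypothetical genes measured in four cells
example = {
    "geneA": [0, 0, 8, 8],
    "geneB": [4, 4, 4, 4],
    "geneC": [2, 4, 2, 4],
}
result = top_variable_genes(example)
```

In this toy input, `geneB` has zero variance and is skipped, while `geneA` ranks above `geneC` because its counts are far more dispersed relative to its mean.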