Extended Data Fig. 5: Comparison of read depth and sequence uniqueness across two highly expressed var genes.

(a) Integrative Genomics Viewer (IGV) was used to visualize the distribution of all mapped reads from the HIVE single-cell transcriptomes for the 3D7 High Single A (top) and 3D7 High Single B (bottom) populations. The distribution of reads across the most highly expressed var gene in each population is shown as ‘Coverage’. The gene model for each gene according to PlasmoDB release 68 is indicated as ‘CDS’. Genes were parsed into 50 bp sliding windows and queried with NCBI BLAST using the blastn algorithm without low-complexity filtering. Heatmaps and line charts indicating levels of similarity were generated using the bit-score of the top non-self transcript model for each of these sliding windows. Blue signifies low sequence identity to other genes within the genome, thereby enabling unambiguous mapping of sequencing reads. (b) Proportion of exon1 vs exon2 reads (normalized to exon length) detected in samples obtained from populations with differing numbers of ‘single’ and ‘multiple’ var expression phenotypes (HS: High Single, LM: Low Many). In all cases, exon1 reads met or exceeded exon2 reads, indicating that the transcripts were obtained from mRNAs rather than exon2-associated noncoding RNAs. The greater detection of exon1 reads likely results from the more efficient unique mapping of reads to this portion of the gene, as displayed in (a).
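
The two calculations described in this legend, splitting a gene into 50 bp windows and normalizing exon read counts to exon length, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the window step size (the legend gives the window length but not the step) and the example read counts and exon lengths are all assumptions chosen for illustration.

```python
def sliding_windows(seq, size=50, step=50):
    """Yield (start, subsequence) for each full-length window of `size` bp.

    `step` is an assumption: the legend states 50 bp windows but not
    how far consecutive windows are offset from one another.
    """
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def length_normalized_proportions(exon1_reads, exon2_reads, exon1_len, exon2_len):
    """Return the (exon1, exon2) share of reads after dividing by exon length."""
    density1 = exon1_reads / exon1_len  # reads per bp of exon1
    density2 = exon2_reads / exon2_len  # reads per bp of exon2
    total = density1 + density2
    if total == 0:
        return 0.0, 0.0  # no reads on either exon
    return density1 / total, density2 / total


if __name__ == "__main__":
    # Hypothetical 120 bp sequence; each window would be one blastn query.
    gene = "ACGT" * 30
    for start, window in sliding_windows(gene, size=50, step=10):
        print(start, window)

    # Hypothetical counts: exon1 has more reads, but it is also longer.
    print(length_normalized_proportions(600, 100, 6000, 1000))
```

With these hypothetical numbers the raw counts differ sixfold, yet the length-normalized proportions are equal, which is why normalization is needed before comparing exon1 and exon2.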