Fig. 2

Short-read and targeted long-read single-cell sequencing of immortalised B and T cell lines. a t-distributed stochastic neighbour embedding (t-SNE) analysis of single cells generated from short-read sequencing data (number of cells: Jurkat = 1463; Ramos = 2000; monocytes = 280). b Demultiplexing statistics for nanopore sequencing reads following targeted capture. Each bar corresponds to the number of nanopore reads per cell barcode, where cell barcodes were first identified with short-read sequencing and then located in nanopore reads by exact sequence matching. The asterisk indicates one cell with over 6000 reads. The number of recovered cell barcodes is shown next to each cell type. "> 1 barcode" refers to reads in which more than one cell barcode was found and "< 250 nt" refers to reads shorter than 250 nt. c Correlation between Illumina read counts and Oxford Nanopore read counts for the T-cell receptor alpha constant gene (TRAC). Each point represents an individual Jurkat cell (n = 472). Pearson correlation = 0.79. d Nanopore read length distribution of demultiplexed reads assigned to each cell type (top panel) compared to the length distribution of assembled contigs that have been assigned a productive receptor chain (bottom panel). Predicted lengths of mRNA transcripts: Jurkat TRA, 1552 nt; Jurkat TRB, 1259 nt; Ramos IGH (secreted exons), 1485 nt; Ramos IGH (membrane exons), 1683 nt; Ramos IGL, 932 nt. Predicted lengths were obtained from the IMGT database [60].
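
The read categories in panel b can be illustrated with a minimal Python sketch of exact-match demultiplexing. This is not the authors' pipeline. The function names, the 16-nt barcode length, the forward-strand-only scan of the whole read, the order of the checks and the "no barcode" label are all assumptions made for illustration; only the "< 250 nt" and "> 1 barcode" categories and the use of exact matching against short-read-derived barcodes come from the caption.

```python
from collections import Counter

MIN_READ_LENGTH = 250  # nt; shorter reads are binned as "< 250 nt" (from the caption)


def classify_read(read, whitelist, barcode_length=16):
    """Return the matched cell barcode, or a category label for the read.

    `whitelist` is a set of cell barcodes, e.g. those identified from the
    short-read data. The 16-nt barcode length is an assumption.
    """
    if len(read) < MIN_READ_LENGTH:
        return "< 250 nt"
    # Exact matching: slide a window along the read and look up each
    # substring in the whitelist (forward strand only, an assumption).
    hits = set()
    for i in range(len(read) - barcode_length + 1):
        window = read[i:i + barcode_length]
        if window in whitelist:
            hits.add(window)
    if not hits:
        return "no barcode"  # hypothetical label, not taken from the caption
    if len(hits) > 1:
        return "> 1 barcode"
    return hits.pop()


def demultiplex(reads, whitelist, barcode_length=16):
    """Count reads per cell barcode and per category."""
    return Counter(classify_read(r, whitelist, barcode_length) for r in reads)
```

Calling `demultiplex(reads, whitelist)` on a list of read sequences returns a `Counter` keyed by cell barcode or category label, which corresponds to the per-barcode bar heights and the two named categories in panel b. A real workflow would also need to handle reverse-complemented reads, which this sketch omits.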