Extended Data Fig. 3: Accurate reconstruction of target transcripts using PROFIT-seq.
From: Real-time and programmable transcriptome sequencing with PROFIT-seq

a, Expression level of target genes in adaptive sampling runs and unmanipulated controls. R = 0.87, P < 10⁻³¹, Pearson correlation test. b, Accuracy of consensus reads grouped by the copy number of full-length repetitive cDNA segments. All consensus reads were aligned to the reference genome using minimap2 with the option ‘-x splice’, and the accuracy and error rate were calculated from the reported CIGAR values of 500 randomly subsampled reads. The middle lines of the boxes indicate the median, and the lower and upper bounds represent the first and third quartiles. The whiskers extend to 1.5 times the interquartile range, and points outside this range are plotted as outliers. c, Percentage of consensus reads generated by the PROFIT-seq script and C3POa (v3.1) using different parameter sets. d, Overlap between the results of the PROFIT-seq script and C3POa. e, Per-read accuracy of the PROFIT-seq script and C3POa results. Consensus reads were aligned to the reference genome using minimap2 with the options ‘-x splice --cs’, and the per-read accuracy was calculated from the reported cs string. f, Run time for the PROFIT-seq script and C3POa. Colors indicate different tools. P = 0.02, Wilcoxon rank sum test. g, Percentage of RCA chimeric reads in PROFIT-seq and C3POa results. All subreads were extracted and aligned to the reference genome using minimap2 with the option ‘-x splice’. A read was classified as RCA chimeric if subreads from the same RCA concatemer aligned to different genomic regions separated by more than 1 kb. Compared with C3POa, PROFIT-seq showed a significantly lower chimeric rate in default mode (P = 0.02, Wilcoxon rank sum test). Source numerical data are available in source data.
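
The calculations described for panels e and g can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `cs_accuracy` and `is_rca_chimeric` are hypothetical, and several details are assumptions made for illustration: accuracy is taken as matched bases divided by all aligned columns (matches, mismatches, inserted and deleted bases), introns are ignored, subreads on different chromosomes count as chimeric, and distance is measured as the gap between sorted alignment intervals.

```python
import re

# Tokens of the short-form minimap2 cs tag:
#   :N  match run, *xy substitution, +seq insertion, -seq deletion, ~xxNyy intron
CS_TOKEN = re.compile(r"(:[0-9]+|\*[a-z][a-z]|\+[a-z]+|-[a-z]+|~[a-z]{2}[0-9]+[a-z]{2})")


def cs_accuracy(cs):
    """Per-read accuracy from a short-form cs string (assumed definition)."""
    matches = mismatches = inserted = deleted = 0
    for token in CS_TOKEN.findall(cs):
        op, body = token[0], token[1:]
        if op == ":":
            matches += int(body)
        elif op == "*":
            mismatches += 1
        elif op == "+":
            inserted += len(body)
        elif op == "-":
            deleted += len(body)
        # "~" marks an intron and is not counted as an error (assumption)
    aligned = matches + mismatches + inserted + deleted
    return matches / aligned if aligned else 0.0


def is_rca_chimeric(subread_hits, max_dist=1000):
    """Flag a concatemer whose subreads align more than max_dist bp apart.

    subread_hits: list of (chrom, start, end) tuples, one per subread.
    """
    # Assumption: subreads on different chromosomes are chimeric
    if len({chrom for chrom, _, _ in subread_hits}) > 1:
        return True
    furthest_end = None
    for _, start, end in sorted(subread_hits, key=lambda hit: hit[1]):
        if furthest_end is not None and start - furthest_end > max_dist:
            return True
        furthest_end = end if furthest_end is None else max(furthest_end, end)
    return False
```

In this sketch the 1 kb threshold from panel g is exposed as the `max_dist` parameter, so a stricter or looser cutoff can be tried without changing the logic.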