Fig. 4: Comparison of cDNA-normalized and non-normalized SMS in the diversity of captured genes and in the detection of moderately or lowly expressed genes.

a Comparison of full-length high-quality isoforms classified into different subgroups. For both normalized and non-normalized libraries, the full-length high-quality isoforms were classified into single-exon and multi-exon groups. The multi-exon isoforms were further subdivided into an “ME_canonical” subgroup, in which all splice sites are canonical, and an “ME_non-canonical” subgroup with non-canonical splice sites. The single-exon and “ME_canonical” isoforms were together considered high-fidelity isoforms. “ME_canonical” isoforms were composed of consensus split-mapped molecules (CSSMs), Non_CSSMs, and novel isoforms. b Comparison of genes covered by high-fidelity isoforms. c Comparison of isoform distribution for “ME_canonical” genes. Genes were binned into three groups by the number of isoforms per gene (1, 2–4, ≥5), and a Chi-square test was performed. d Gene models of the full-length high-quality isoforms detected for an example gene, MAPK3. A non-canonical isoform from the normalized libraries, “norm.c2823”, is also shown with transparency. e Comparison of expression for the genes detected in non-normalized vs normalized libraries. A Mann–Whitney U test was performed. f Comparison of representative gene clusters with important functions and low expression between normalized and non-normalized libraries. lncRNA, long non-coding RNA; TF, transcription factor. The number of genes in each cluster is indicated (left). Expression was compared between lncRNA/TF genes and all genes detected in normalized libraries (right). Mann–Whitney U tests were performed. Data from human peripheral blood samples were used for the analysis. Gene quantification was based on SGS results. For all statistical tests, *p < 0.05; **p < 0.01; ***p < 0.005.
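The binning and Chi-square comparison described for panel c can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names (`bin_label`, `bin_counts`, `chi_square_2x3`) and the gene identifiers are hypothetical, and the test is assumed to be a Pearson Chi-square test of independence on a 2 × 3 table (library type × isoform-number bin). For such a table the test has 2 degrees of freedom, so the p-value has the closed form exp(−χ²/2) and no statistics library is needed.

```python
import math
from collections import Counter

# Bin labels follow the caption: 1, 2-4, >=5 isoforms per gene.
BINS = ("1", "2-4", ">=5")


def bin_label(n_isoforms):
    """Assign a gene to a bin according to its isoform count."""
    if n_isoforms < 1:
        raise ValueError("a detected gene needs at least one isoform")
    if n_isoforms == 1:
        return "1"
    if n_isoforms <= 4:
        return "2-4"
    return ">=5"


def bin_counts(isoforms_per_gene):
    """Count genes per bin; input maps gene ID -> isoform count."""
    counts = Counter(bin_label(n) for n in isoforms_per_gene.values())
    return [counts[b] for b in BINS]


def chi_square_2x3(row_a, row_b):
    """Pearson Chi-square test of independence for a 2 x 3 table.

    Returns (chi2, p). With df = (2 - 1) * (3 - 1) = 2, the
    survival function of the Chi-square distribution is exp(-chi2 / 2).
    """
    table = [list(row_a), list(row_b)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and every bin must be non-empty")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, math.exp(-chi2 / 2)


# Hypothetical toy input, for illustration only (not data from the study).
toy_library = {"geneA": 1, "geneB": 3, "geneC": 7, "geneD": 2}
toy_bins = bin_counts(toy_library)
```

In practice, `bin_counts` would be applied once to the normalized and once to the non-normalized library, and the two resulting rows passed to `chi_square_2x3`.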