Fig. 2: Development and validation of gene expression calling pipelines.
From: Clinical and analytical validation of a combined RNA and DNA exome assay across a large tumor cohort

Gene expression in DNA-contaminated samples (a) and sense-strand read proportion and DNA contamination (b), before and after DNase treatment. Correlation of measured TPMs for ERCC spike-ins (n = 89) with known molar concentrations in tumor (c) and cell line (d) samples. Colors indicate the four transcript groups in the mix (red, pink, blue, gray). e Schema of orthogonal validation of gene expression. WTS, whole transcriptome sequencing. f Distribution of Pearson’s correlation coefficients for single-gene expression (n = 1389) between FF and FFPE samples of the same tissue. Genes with a wide dynamic range (STD > 2) were used. Shading shows the 95% confidence interval of the correlation coefficient. g Correlation between RNA-seq (log2TPM) and qPCR (ΔCt) measurements for 4 genes (n = 57). h Distribution of Pearson’s correlation coefficients for single-gene expression (n = 99) measured by RNA-seq (TPM) or qPCR. i Intra-day reproducibility of gene expression (TPM) measured in the same sample; r = 0.99, p = 1.0e-308. j Gene expression CV and TPM across 10 clinical samples measured on 3 different days. k Median CV of gene expression (n = 20,062) and signatures (n = 21) for TPM > 1. Box plots show the interquartile range. l Gene expression stability across COLO829 cell line ranges over 4 months; expression from the same FFPE block preparation remained stable. m Schema of TME subtype classification. n Median variance for signature score reproducibility for inter- and intra-day replicates (n = 3) of clinical samples (n = 7). Box plots show the interquartile range. o TME subtype probability for varying input amounts (10, 20, 50 ng). Error bars: s.e.m. p Heatmap of gene signature scores from Bagaev et al. across all input amounts in reproducibility experiments for predicted TME subtypes.