Extended Data Fig. 8: Low-purity RNA subtype association with low-purity metrics.

The low-purity RNA subtype was defined by the association of its samples with multiple independent measures of sample purity. (a) An index of genes expressed in non-B-cell tissues was used to identify samples in which the CD138+-enriched cell fraction was contaminated with non-B-lineage cells (n = 714). (b) Tumor purity was estimated from exome copy number or mutation data, using the absolute allele frequency of constitutional variants in deleted regions or, when no usable deletions were detected in the tumor, the somatic SNV allele frequency in diploid regions of the genome (n = 593). The ranges of estimated contamination (a) and purity (b) are shown as boxplots: the lower and upper bounds of the box represent the 25th and 75th percentiles, the center line indicates the median, and the whiskers extend to the lowest and highest values within 1.5 × the interquartile range (IQR) of the box. (c) The full distribution of observed somatic SNV allele frequencies (n = 593) is shown as a violin plot, in which the horizontal line indicates the median and the width of the plot indicates the population frequency of each value.
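The boxplot convention described for panels (a) and (b) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' plotting code: the function names `percentile` and `box_stats` are hypothetical, the input values are made up, and linear interpolation between ranks is assumed for the percentiles (plotting packages differ on this detail).

```python
from statistics import median


def percentile(sorted_vals, q):
    """Percentile q (0-100) of a pre-sorted list, by linear interpolation (an assumption)."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def box_stats(values, whisker=1.5):
    """Summary statistics for a boxplot with whiskers at 1.5 x IQR."""
    vals = sorted(values)
    q1 = percentile(vals, 25)   # lower bound of the box
    q3 = percentile(vals, 75)   # upper bound of the box
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers stop at the most extreme observed values inside the fences.
    inside = [v for v in vals if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median(vals),
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in vals if v < low_fence or v > high_fence],
    }


# Made-up example values, not data from the study.
example = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Under these assumptions, a value beyond 1.5 × IQR of the box (here 100) falls outside the whiskers and would be drawn as an individual point, if it is drawn at all.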