Extended Data Fig. 2: Overview of copy number features and signature identification. | Nature

From: A pan-cancer compendium of chromosomal instability

a, Schematic of the 5 fundamental copy number features, which were computed using 6,335 samples with detectable CIN (dCIN). Note that a feature capturing absolute copy number is not included in our method. b, Schematic showing how mixture modelling, using either Variational Bayes Gaussian mixture models or Finite Poisson mixture models, splits the genome-wide feature distributions into smaller components. The actual number of resulting components is listed below each feature distribution. These components represent the basic building blocks of each feature distribution. c, An example of how the probability of a CNA belonging to a mixture component (the posterior probability) is calculated and how these probabilities are summed. d, (Right) The resulting 43-dimensional feature vector for each sample, obtained by summing all posterior probabilities for each component. (Left) Schematic of how the sum-of-posterior matrix for all 6,335 samples was split into two matrices by a Bayesian implementation of non-negative matrix factorisation (NMF), resulting in a signature catalogue and an activity catalogue.
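The sum-of-posteriors step in panels c and d can be illustrated with a minimal Python sketch. It is not the authors' implementation: the component weights, means and standard deviations, the feature values, and all function names are hypothetical, and only Gaussian components are shown (features modelled with Poisson components would use a Poisson probability mass function instead). For one feature, each copy number event receives a posterior probability per component, and these are summed per component across the sample.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posteriors(x, components):
    """Posterior probability of x belonging to each mixture component.

    components is a list of (weight, mean, sd) tuples.
    """
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]


def sum_of_posteriors(values, components):
    """Sum the per-event posteriors for each component across one sample."""
    sums = [0.0] * len(components)
    for x in values:
        for k, p in enumerate(posteriors(x, components)):
            sums[k] += p
    return sums


# Hypothetical mixture components for one feature: (weight, mean, sd).
EXAMPLE_COMPONENTS = [(0.5, 1.0, 0.5), (0.3, 5.0, 1.5), (0.2, 20.0, 6.0)]

# Hypothetical feature values for the copy number events of one sample.
example_values = [0.8, 1.2, 4.5, 6.0, 18.0, 25.0]

# One block of the sample's feature vector (one entry per component).
feature_block = sum_of_posteriors(example_values, EXAMPLE_COMPONENTS)
print(feature_block)
```

Because the posteriors of each event sum to one, the entries of `feature_block` add up to the number of events for that feature. Under this sketch, concatenating such blocks across all five features would give one sample's full feature vector, and stacking the vectors of all samples would give the matrix that is passed to the factorisation.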
