Extended Data Fig. 3: NMF decomposition of DHS index. | Nature


From: Index and biological spectrum of human DNase I hypersensitive sites


a, Schematic of non-negative matrix factorization (NMF) applied to an n-by-m matrix (V), resulting in k components. The objective is to minimize the difference between the original matrix (V) and the product of (W) and (H), subject to all elements of (W) and (H) being non-negative. b, Depiction of NMF applied to our DNase-seq data from 733 biosamples and 3.5M+ DHSs, using k components. c, Colour-based view of the values shown in b. Colours indicate relative loadings of each NMF component, for both biosamples and DHSs. d, Two-dimensional UMAP projection of the 733 biosamples, coloured by their strongest representative NMF component. e, Choice of NMF decision boundary (0.35) based on maximal F1 score as a function of the number of components k (4 to 36). f, F1 score as a function of the number of components k, with the chosen k = 16 and corresponding F1 score indicated. g, Gradient of the F1 score with respect to k, showing reduced gain after k = 16.
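
The factorization in panel a and the strongest-component labelling in panel d can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a small random binary matrix as a stand-in for the real data, assumes rows are biosamples and columns are DHSs, and uses the standard multiplicative update rules for the Frobenius-norm objective. The function name `nmf_multiplicative` and all parameter values (including k = 4) are hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative n-by-m matrix V into W (n-by-k) and H (k-by-m).

    Uses multiplicative updates that reduce ||V - WH|| (Frobenius norm)
    while keeping every element of W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Random positive initialisation (an assumption; other schemes exist).
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Update H with W fixed, then W with H fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Toy stand-in for the data: 20 "biosamples" by 100 "DHSs", binary entries.
rng = np.random.default_rng(42)
V = (rng.random((20, 100)) < 0.3).astype(float)

W, H, errors = nmf_multiplicative(V, k=4)

# Label each biosample (row of W) by its most heavily loaded component,
# analogous to the colouring used in panel d.
strongest_component = W.argmax(axis=1)
```

Here each row of `W` holds the component loadings of one biosample and each column of `H` holds the loadings of one DHS, so the same argmax step applied along the columns of `H` would give a per-DHS label.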
