Supplementary Figure 14: Validation of treeClust and t-SNE on an independent proteomics dataset. | Nature Biotechnology

From: Co-regulation map of the human proteome enables identification of protein functions

(a) treeClust was applied to the TMT-based cancer proteomics dataset from Lapek et al. (Nature Biotechnology, 2017). It outperforms Pearson, Spearman and bicor correlation, as shown by a precision-recall analysis using Reactome annotations as the gold standard. Note that treeClust builds only one decision tree per condition, i.e. 41 trees on this dataset, which is too few for a standard analysis. Therefore, treeClust was run iteratively, and the co-regulation score was taken as the mean over 100 treeClust forests, each built from 10 randomly selected experiments.

(b) Co-regulation map for the Lapek et al. dataset, made by t-SNE from treeClust scores. As in the correlation network of the original report (Fig. 2 in Lapek et al.), CORUM protein complexes are colored. Unlike a network, which shows a limited number of arbitrarily arranged pairwise links, the map positions each protein according to its similarity or dissimilarity to all other proteins in the map. This makes it possible to place all proteins in a functional context, not just those that are directly linked to members of the core network. It also allows for a hierarchical analysis of protein associations, with increasing distances indicating weaker co-regulation. For example, the subunits of each protein complex in the enlarged map area (inset) cluster together, while the distances between the complexes are larger. Nevertheless, all of these complexes have roles in vesicular trafficking. n = 6,151 proteins shown in plot.
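The iterative averaging described in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' code: treeClust is replaced by a clearly labeled placeholder scorer (Pearson correlation), and the function names, the toy data, and the choice to sample experiments without replacement within each round are all assumptions made for illustration. Only the loop structure follows the legend: 100 rounds, 10 randomly chosen experiments per round, and a mean score per protein pair.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def placeholder_scores(profiles, columns):
    """Stand-in scorer, NOT treeClust: Pearson correlation on the chosen columns."""
    scores = {}
    for a, b in itertools.combinations(sorted(profiles), 2):
        xa = [profiles[a][c] for c in columns]
        xb = [profiles[b][c] for c in columns]
        scores[(a, b)] = pearson(xa, xb)
    return scores


def mean_subsampled_scores(profiles, n_rounds=100, n_experiments=10,
                           seed=0, scorer=placeholder_scores):
    """Average pairwise scores over repeated random subsets of experiments.

    profiles maps a protein name to its list of per-experiment values.
    Sampling without replacement within a round is an assumption.
    """
    rng = random.Random(seed)
    n_columns = len(next(iter(profiles.values())))
    totals = {}
    for _ in range(n_rounds):
        # Pick the experiments (columns) used in this round.
        columns = rng.sample(range(n_columns), n_experiments)
        for pair, score in scorer(profiles, columns).items():
            totals[pair] = totals.get(pair, 0.0) + score
    return {pair: total / n_rounds for pair, total in totals.items()}


# Toy data with 41 experiments (hypothetical values, not the Lapek et al. data).
toy_rng = random.Random(1)
base = [toy_rng.random() for _ in range(41)]
toy_profiles = {
    "P1": base,
    "P2": [2 * v + 1 for v in base],  # rises and falls with P1
    "P3": [-v for v in base],         # mirrors P1
}
mean_scores = mean_subsampled_scores(toy_profiles)
```

The scorer is passed in as a parameter so that a real co-regulation scorer could be substituted without changing the subsampling loop.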
