Extended Data Fig. 8: Machine learning workflows for the integration of multiple CF–MS replicates.

a, Schematic overview of cross-validation approaches for CF–MS data. b, Comparison of cross-validation by protein pairs or individual proteins in network inference from two to four CF–MS experiments using a naive Bayes classifier, with AUCs calculated in cross-validation or in an independent set of held-out protein complexes. c, Impact of feature selection on network inference from two to four CF–MS experiments, comparing between one and six top-performing features, an equivalent number of random features, or five features computed in PrInCE. d, Comparison of top-performing or random features in network inference from two to ten CF–MS experiments, using between one and ten top-performing features. e, Comparison of network inference with features calculated from concatenated matrices of two to four CF–MS experiments, or with features calculated from individual experiments. f, Comparison of network inference from two to four CF–MS experiments using a naive Bayes classifier before and after median imputation of missing values. g, Impact of the number of top-performing or random features provided as input on network inference from two to ten CF–MS experiments. h, Comparison of random forest and naive Bayes classifiers in network inference from two to ten CF–MS replicates, using between one and ten features. i, Network inference from human CF–MS data when integrating varying proportions of SEC and IEX experiments. The total number of CF–MS datasets is shown above the plots, and the number of SEC datasets is shown on the x-axis.
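
The distinction in panels a and b between cross-validation by protein pairs and by individual proteins can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy protein identifiers, and in particular the choice to discard pairs that straddle folds in the protein-level split are assumptions made here for illustration.

```python
import random
from itertools import combinations


def split_by_pairs(pairs, k, seed=0):
    """Assign each protein pair to one of k folds at random.

    The same protein can appear in both training and test pairs.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


def split_by_proteins(pairs, k, seed=0):
    """Assign each protein to one of k folds.

    Assumption for this sketch: a pair is a test pair only if both
    proteins are in the held-out fold, a training pair only if neither
    is, and pairs that straddle the boundary are discarded.
    """
    rng = random.Random(seed)
    proteins = sorted({p for pair in pairs for p in pair})
    rng.shuffle(proteins)
    fold_of = {p: i % k for i, p in enumerate(proteins)}
    for i in range(k):
        test = [(a, b) for a, b in pairs
                if fold_of[a] == i and fold_of[b] == i]
        train = [(a, b) for a, b in pairs
                 if fold_of[a] != i and fold_of[b] != i]
        yield train, test


def shared_proteins(train, test):
    """Return the proteins that occur in both training and test pairs."""
    in_train = {p for pair in train for p in pair}
    in_test = {p for pair in test for p in pair}
    return in_train & in_test


# Toy example: all pairs among eight hypothetical proteins.
toy_pairs = list(combinations([f"P{i}" for i in range(8)], 2))

for name, splitter in [("pairs", split_by_pairs),
                       ("proteins", split_by_proteins)]:
    for train, test in splitter(toy_pairs, k=2):
        overlap = shared_proteins(train, test)
        print(f"split by {name}: {len(train)} train pairs, "
              f"{len(test)} test pairs, {len(overlap)} shared proteins")
```

In this toy setting the pair-level split always leaves some proteins on both sides of the train/test boundary, whereas the protein-level split leaves none, at the cost of discarding the pairs that cross folds.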