Fig. 2: Auditing three paired-input applications of therapeutic interest. | Communications Biology

From: Systematic auditing is essential to debiasing machine learning in biology

a Performance of the best-performing protein-protein interaction (PPI) classifier, F1, on dataset D1 in benchmarking and under the four auditors. In the Feature Auditor, protein features are masked, yet classifier performance is retained. In the Node-degree and Recurrence Auditors, proteins are represented solely by their node degrees in the positive or negative training dataset, or by their differential node degree between the positive and negative training examples; even without protein features, performance is retained when the classifier is informed by node degree or protein recurrence alone. b Performance of the best-performing PPI classifier, F5, on dataset D2, with observations similar to those in Fig. 2a. c Performance of the optimized drug-target predictor, F8, on datasets D4 and D5 in benchmarking (module 1), generalization (module 2), and bias identification (module 3, Feature Auditor). d Reported (module 1) and generalization (module 2) performances of the MHC-peptide binding predictors F13–F20. Asterisks denote the two paired-input predictors, F17 and F20; the other predictors are single-input.
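
The degree-only representation described in panel a can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`node_degrees`, `degree_features`), the toy protein identifiers, and the choice to concatenate positive, negative, and differential degrees into one feature vector are all assumptions made for illustration.

```python
from collections import Counter


def node_degrees(pairs):
    """Count how many training pairs each protein appears in."""
    degree = Counter()
    for a, b in pairs:
        degree[a] += 1
        degree[b] += 1
    return degree


def degree_features(pair, pos_degree, neg_degree):
    """Represent a protein pair by degrees only, with no protein features.

    For each protein: degree in the positive set, degree in the negative
    set, and their difference (the differential node degree).
    """
    feats = []
    for protein in pair:
        pos = pos_degree[protein]  # Counter returns 0 for unseen proteins
        neg = neg_degree[protein]
        feats.extend([pos, neg, pos - neg])
    return feats


# Hypothetical toy training data, not taken from D1 or D2.
positive_pairs = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]
negative_pairs = [("P2", "P3"), ("P4", "P5")]

pos_degree = node_degrees(positive_pairs)
neg_degree = node_degrees(negative_pairs)

# P1 is frequent among positives, P5 appears only among negatives.
features = degree_features(("P1", "P5"), pos_degree, neg_degree)
print(features)
```

A classifier trained on vectors like these never sees sequence or structure information, so any performance it retains can only come from how often each protein recurs in the positive and negative training examples.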
