Extended Data Fig. 3: Performance on individual data sets when trained on 80% data. | Nature Biotechnology

From: Learning protein fitness models from evolutionary and assay-labeled data

A breakdown, by individual data set, of the averaged Spearman correlation results presented in the right-side mini-panel of Fig. 2a, on 80-20 splits. See Supplementary Fig. 2 for the analogous plot using NDCG. Error bars indicate bootstrapped 95% confidence intervals from 20 random data splits. Box-and-whisker plots show the first and third quartiles (the hinges) as well as the median values. The upper and lower whiskers extend from the hinge to the largest or smallest value no further than 1.5 × the interquartile range from the hinge.
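
The two summary conventions named in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the caption does not say which bootstrap variant was used, so a percentile bootstrap of the mean over per-split scores is assumed here, and the function names `bootstrap_ci` and `whisker_limits` are hypothetical.

```python
import numpy as np


def bootstrap_ci(scores, n_boot=10_000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of per-split scores.

    Assumption: the interval is taken over the mean of the scores
    (e.g. 20 Spearman correlations, one per random split).
    """
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample the scores with replacement, n_boot times.
    idx = rng.integers(0, len(scores), size=(n_boot, len(scores)))
    boot_means = scores[idx].mean(axis=1)
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(boot_means, [alpha, 1.0 - alpha])
    return float(lo), float(hi)


def whisker_limits(values):
    """Quartiles, median, and whisker ends under the 1.5 x IQR rule."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers stop at the most extreme data points that lie within
    # 1.5 x IQR of the hinges; points beyond that are drawn as outliers.
    lower = values[values >= q1 - 1.5 * iqr].min()
    upper = values[values <= q3 + 1.5 * iqr].max()
    return {
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "lower": float(lower),
        "upper": float(upper),
    }
```

For example, `whisker_limits([1, 2, 3, 4, 5, 100])` places the upper whisker at 5, because 100 lies more than 1.5 × the interquartile range above the third quartile and would be plotted as an outlier. Note that quartile definitions differ slightly between plotting libraries, so hinge values may not match a given figure exactly.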