Fig. 2: Distribution of the accuracies obtained in 400 runs of binary classification on unseen samples, each run evaluated with the cross-validation scheme from “Cross-validation setup”.

Each run is a 10-fold cross-validation; the mean accuracy and standard deviation are computed from this distribution. The mean accuracy reported here is 0.1 % higher than in the main text. This small difference arises because the cross-validation splits are chosen at random, so the distribution changes slightly each time the evaluation scheme is repeated.
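
The repeated evaluation described above can be sketched as follows. This is a minimal illustration and not the scheme from “Cross-validation setup”: the synthetic one-dimensional data, the nearest-centroid classifier, and the helper names (`make_folds`, `nearest_centroid_accuracy`, `repeated_cv_accuracies`) are all hypothetical stand-ins, and the accuracy of a run is assumed to be the mean over its 10 folds.

```python
import random
import statistics


def make_folds(n_samples, n_folds, rng):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def nearest_centroid_accuracy(X, y, train_idx, test_idx):
    """Fit a 1-D nearest-centroid classifier (a stand-in model) and score it."""
    centroids = {}
    for label in (0, 1):
        values = [X[i] for i in train_idx if y[i] == label]
        centroids[label] = sum(values) / len(values)
    correct = 0
    for i in test_idx:
        predicted = min(centroids, key=lambda label: abs(X[i] - centroids[label]))
        correct += predicted == y[i]
    return correct / len(test_idx)


def repeated_cv_accuracies(X, y, n_runs=400, n_folds=10, seed=0):
    """Return one accuracy per run; each run is a freshly shuffled k-fold CV."""
    rng = random.Random(seed)
    run_accuracies = []
    for _ in range(n_runs):
        folds = make_folds(len(X), n_folds, rng)
        fold_scores = []
        for k in range(n_folds):
            test_idx = folds[k]  # held-out (unseen) samples for this fold
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            fold_scores.append(nearest_centroid_accuracy(X, y, train_idx, test_idx))
        # Assumption: the accuracy of a run is the mean over its folds.
        run_accuracies.append(statistics.mean(fold_scores))
    return run_accuracies


# Synthetic two-class data (assumption: 100 samples, one feature per sample).
data_rng = random.Random(42)
y = [i % 2 for i in range(100)]
X = [data_rng.gauss(2.0 * label, 1.0) for label in y]

accs = repeated_cv_accuracies(X, y, n_runs=400, n_folds=10, seed=0)
print(f"mean = {statistics.mean(accs):.4f}, std = {statistics.stdev(accs):.4f}")
```

Calling `repeated_cv_accuracies` again with a different `seed` draws different random splits, so the resulting mean can shift slightly. This is the kind of small deviation noted above between this figure and the main text.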