Fig. 2: Machine-learning classifier properties. | npj Systems Biology and Applications

From: Prediction of hemophilia A severity using a small-input machine-learning framework

a Comparison of the accuracy of the different classifiers. Values are averages over 10-fold cross-validation for each classifier; bars depict the mean and error bars the standard deviation. b The ROC (receiver operating characteristic) curve, summarized by the AUC (area under the ROC curve), depicts the relation between the true-positive and false-positive rates. Points close to (0,1) indicate a better classification. The diagonal line represents a random classifier for a class-balanced dataset, i.e., any result below this line is worse than assigning labels randomly. c Distribution of the Severity Scores of two classifiers and an ensemble, and their relation to the in vitro chromogenic activity and the expression/secretion ability of the in vitro FVIII mutant constructs. In total, 344 alanine mutants were used (205 for A2 and 139 for C2). The boxplots depict the median (centerline), the first and third quartiles (lower and upper bounds), and 1.5 times the interquartile range (lower and upper whiskers). Each dot in the plot is an amino acid mutation (i.e., an in vitro alanine mutant construct). d The Severity Score predictions of two classifiers for the chromogenic activity of FVIII mutants. The lack of correlation indicates that the classifiers assign different probabilities to the same instance, that is, they take different views of the true classification of the mutants; this observation led us to combine their prediction values to better approximate the real activity of FVIII mutants. In all cases, we used the unpaired, two-sided Wilcoxon test (***p value < 0.001; **p value < 0.01; *p value < 0.05). SVM support vector machine, DT decision tree, NB naïve Bayes.
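As an illustration of panels b and d, the sketch below computes an AUC directly from classifier scores and shows how combining two weakly correlated score sets can change it. It is a minimal sketch, not the authors' pipeline: the function names `roc_auc` and `ensemble_scores`, the toy labels and scores, and the choice of a plain average as the combination rule are all assumptions made for illustration. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive instance scores higher than a randomly chosen negative one), which equals the area under the ROC curve.

```python
from itertools import product


def roc_auc(labels, scores):
    """Return the AUC as P(score of a positive > score of a negative).

    Ties count as half a win. This rank-based value equals the area
    under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def ensemble_scores(scores_a, scores_b):
    """Average two classifiers' per-instance scores.

    Assumption: a plain mean is used here only as one simple way to
    combine predictions; the source does not specify the rule.
    """
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


# Hypothetical toy data (not from the paper): 1 = severe, 0 = non-severe.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]  # hypothetical classifier A
scores_b = [0.6, 0.9, 0.3, 0.2, 0.4, 0.1]  # hypothetical classifier B

auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
auc_ens = roc_auc(labels, ensemble_scores(scores_a, scores_b))

print(f"AUC A: {auc_a:.3f}, AUC B: {auc_b:.3f}, AUC ensemble: {auc_ens:.3f}")
```

In this toy case the two classifiers err on different instances, so their averaged scores separate the classes better than either one alone. A classifier that gives every instance the same score yields an AUC of 0.5, which corresponds to the diagonal line in panel b.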
