Fig. 2: Performances of the supervised classifier for prediction of COVID-19 severity. | Communications Biology

From: An explainable model of host genetic interactions linked to COVID-19 severity

a Distribution of the performance metrics of the different algorithms during testing on the five folds. The horizontal line inside each box represents the median, and the height of each box (with its whiskers) depicts the standard error (variability) of the given performance metric as scored across the five cross-validation (CV) folds by the supervised ML algorithms employed. The points above and below each box-and-whisker plot are potential outliers lying above the 75th percentile or below the 25th percentile; b feature importance distribution for features with non-zero importance across the five folds. The characteristics of each box plot are as in Fig. 2a; c log-odds ratio of the 16 variants with full support in the XGBoost-trained models; d performances of the predictors with the 16 variants plus covariates (age and gender; orange), covariates only (green), and all screened variants plus covariates (blue) in the held-out test set (n = 168 samples); e performances of the predictors with the 16 variants plus covariates (age and gender; orange), covariates only (green), and all screened variants plus covariates (blue) in a follow-up test cohort (n = 618 new samples).
