Table 2 Accuracy, confidence, and utility scores stratified by recommendation correctness.

From: How machine-learning recommendations influence clinician treatment selections: the example of antidepressant selection

| Recommendation correctness | Accuracy, M (95% CI) | Confidence, M (95% CI) | Perceived utility, M (95% CI) |
|---|---|---|---|
| Baseline | 0.357 (0.333–0.381) | 3.67 (3.63–3.72) | N/A |
| Correct | 0.384 (0.365–0.403) | 3.65 (3.62–3.69) | 3.52 (3.47–3.56) |
| Incorrect | 0.299 (0.275–0.322) | 3.62 (3.57–3.69) | 3.40 (3.32–3.47) |
| p-value | p < 0.0001* | p = 0.133 | p = 0.0006* |

  1. p-values were computed using repeated-measures ANOVA with a significance level of 0.05.