Table 5 Model performance against independent glaucoma specialist reviewers

From: Clinically informed semi-supervised learning improves disease annotation and equity from electronic health records: a glaucoma case study

 

|    | Samples | Accuracy  | F1 score  | AUROC     | AUCPR     | Kappa (%)  | Agreement (%) |
|----|---------|-----------|-----------|-----------|-----------|------------|---------------|
| R1 | 76      | **0.842** | 0.760     | **0.948** | 0.809     | **78.801** | **84.211**    |
| R2 | 76      | 0.750     | **0.782** | 0.936     | **0.851** | 64.053     | 78.899        |
| R3 | 66      | 0.591     | 0.624     | 0.814     | 0.650     | 43.357     | 73.291        |

Metrics include accuracy, macro F1-score, AUROC, AUCPR, Cohen's kappa (%), and agreement (%). R1–R3: fellowship-trained glaucoma specialist reviewers. Bold values indicate the highest performance for each evaluation metric across the independent reviewers (R1–R3).
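As a sketch of how two of the table's chance-sensitive metrics relate, the snippet below computes raw agreement (%) and Cohen's kappa (%) for a pair of binary label sequences. The label arrays are illustrative placeholders, not the study's data, and the binary model-vs-reviewer framing is an assumption for the example.

```python
from collections import Counter

def agreement_pct(a, b):
    # Raw percent agreement: share of samples where the two labelers match.
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa_pct(a, b):
    # Kappa corrects observed agreement p_o for chance agreement p_e,
    # where p_e sums the products of each labeler's label frequencies.
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_e = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return 100.0 * (p_o - p_e) / (1 - p_e)

# Hypothetical model predictions vs. one reviewer's labels (not real data).
model = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
rater = [1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
print(round(agreement_pct(model, rater), 1))    # 80.0
print(round(cohens_kappa_pct(model, rater), 1))  # 58.3
```

Note that kappa is always at or below raw agreement, which matches the table: each reviewer's kappa (%) is several points lower than the corresponding agreement (%).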