Table 5 Model performance against independent glaucoma specialist reviewers

From: Clinically informed semi-supervised learning improves disease annotation and equity from electronic health records: a glaucoma case study

 

|    | Samples | Accuracy  | F1 score  | AUROC     | AUCPR     | Kappa (%)  | Agreement (%) |
|----|---------|-----------|-----------|-----------|-----------|------------|---------------|
| R1 | 76      | **0.842** | 0.760     | **0.948** | 0.809     | **78.801** | **84.211**    |
| R2 | 76      | 0.750     | **0.782** | 0.936     | **0.851** | 64.053     | 78.899        |
| R3 | 66      | 0.591     | 0.624     | 0.814     | 0.650     | 43.357     | 73.291        |

Metrics include accuracy, macro F1-score, AUROC, AUCPR, Cohen's kappa (%), and agreement (%). R1–R3: fellowship-trained glaucoma specialist reviewers. Bold values indicate the highest performance for each evaluation metric across the independent reviewers (R1–R3).
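As a sketch of how two of the table's chance-sensitive metrics relate, the snippet below computes raw agreement (%) and Cohen's kappa (%) for a pair of binary label sequences. The label arrays are illustrative placeholders, not the study's data, and the binary model-vs-reviewer framing is an assumption for the example.

```python
from collections import Counter

def agreement_pct(a, b):
    # Raw percent agreement: share of samples where the two labelers match.
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa_pct(a, b):
    # Kappa corrects observed agreement p_o for chance agreement p_e,
    # where p_e sums the products of each labeler's label frequencies.
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_e = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return 100.0 * (p_o - p_e) / (1 - p_e)

# Hypothetical model predictions vs. one reviewer's labels (not real data).
model = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
rater = [1, 0, 0, 0, 1, 0, 1, 1, 1, 1]
print(round(agreement_pct(model, rater), 1))    # 80.0
print(round(cohens_kappa_pct(model, rater), 1))  # 58.3
```

Note that kappa is always at or below raw agreement, which matches the table: each reviewer's kappa (%) is several points lower than the corresponding agreement (%).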