Table 3 Human evaluation of MedGaze predictions compared with human-generated scanpaths on CXR images across the defined metrics.
From: Modeling radiologists’ cognitive processes using a digital gaze twin to enhance radiology training
| Criteria | Rating Scale | Prediction | Ground truth |
|---|---|---|---|
| Identifying Machine-Generated vs. Human Gaze Patterns | 0: (Machine-Generated) | 7 | 1 |
|  | 1: (Human-Like) | 13 | 19 |
| Comprehensive Scores: Coverage of Important Regions | 1: (0–20%) Very little coverage | 0 | 0 |
|  | 2: (21–40%) Some regions covered | 0 | 0 |
|  | 3: (41–60%) Fair amount of coverage | 2 | 0 |
|  | 4: (61–80%) Most regions covered | 8 | 8 |
|  | 5: (81–100%) All regions covered | 10 | 12 |
| Redundancy Score: Coverage of Redundant Regions | 1: Minimal redundancy | 9 | 5 |
|  | 2: Some minor redundancy | 7 | 11 |
|  | 3: Moderate redundancy | 3 | 4 |
|  | 4: Significant redundancy | 1 | 0 |
|  | 5: High redundancy and inefficiency | 0 | 0 |
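The counts above can be condensed into simple summary figures. The sketch below is an illustrative Python tabulation only (the variable names and the choice of summaries are assumptions, not part of the published analysis); it encodes the 20 ratings per condition from Table 3 and derives the share of scanpaths judged human-like together with the weighted mean coverage and redundancy ratings.

```python
# Minimal sketch: tabulate the Table 3 counts (20 ratings per condition) and
# derive summary proportions and weighted mean ratings. Illustrative only.

human_like = {"Prediction": 13, "Ground truth": 19}  # "1: (Human-Like)" votes
total_ratings = 20                                   # 7 + 13 and 1 + 19

coverage = {  # Comprehensive score counts, keyed by rating 1-5
    "Prediction":   {1: 0, 2: 0, 3: 2, 4: 8, 5: 10},
    "Ground truth": {1: 0, 2: 0, 3: 0, 4: 8, 5: 12},
}
redundancy = {  # Redundancy score counts, keyed by rating 1-5
    "Prediction":   {1: 9, 2: 7, 3: 3, 4: 1, 5: 0},
    "Ground truth": {1: 5, 2: 11, 3: 4, 4: 0, 5: 0},
}

def mean_rating(counts: dict[int, int]) -> float:
    """Weighted mean of a 1-5 rating distribution."""
    n = sum(counts.values())
    return sum(score * c for score, c in counts.items()) / n

for condition in ("Prediction", "Ground truth"):
    pct_human = 100 * human_like[condition] / total_ratings
    print(f"{condition}: judged human-like in {pct_human:.0f}% of cases, "
          f"mean coverage {mean_rating(coverage[condition]):.2f}, "
          f"mean redundancy {mean_rating(redundancy[condition]):.2f}")
```

Run as-is, this reproduces the headline comparison implied by the table: predicted scanpaths were judged human-like in 65% of cases versus 95% for real scanpaths, with mean coverage ratings of 4.40 versus 4.60 and mean redundancy ratings of 1.80 versus 1.95.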