Table 2 Performance of three deep learning algorithms in the internal test dataset.
From: Preventing corneal blindness caused by keratitis using artificial intelligence
| One-vs.-rest classification (NEH internal test dataset) | Sensitivity (95% CI) | Specificity (95% CI) | Accuracy (95% CI) |
|---|---|---|---|
| Keratitis vs. others + normal | | | |
| DenseNet121 | 97.7% (96.4–99.1) | 98.2% (97.1–99.4) | 98.0% (97.1–98.9) |
| Inception-v3 | 95.0% (93.1–97.0) | 98.4% (97.3–99.5) | 96.8% (95.6–97.9) |
| ResNet50 | 96.7% (95.1–98.3) | 95.0% (93.1–96.9) | 95.8% (94.6–97.1) |
| Others vs. keratitis + normal | | | |
| DenseNet121 | 94.6% (90.7–98.5) | 98.4% (97.5–99.2) | 97.9% (97.0–98.8) |
| Inception-v3 | 93.1% (88.7–97.4) | 97.2% (96.1–98.3) | 96.7% (95.5–97.8) |
| ResNet50 | 81.5% (74.9–88.2) | 97.5% (96.5–98.6) | 95.4% (94.1–96.7) |
| Normal vs. keratitis + others | | | |
| DenseNet121 | 98.4% (97.1–99.7) | 99.8% (99.5–100) | 99.3% (98.8–99.8) |
| Inception-v3 | 98.7% (97.5–99.8) | 99.0% (98.2–99.8) | 98.9% (98.2–99.5) |
| ResNet50 | 97.1% (95.3–98.8) | 99.2% (98.5–99.9) | 98.4% (97.6–99.2) |
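
For readers unfamiliar with one-vs.-rest evaluation, the following minimal sketch shows how sensitivity, specificity, and accuracy of the kind reported in the table can be derived from a three-class confusion matrix. The function name `one_vs_rest_metrics` and the counts in `EXAMPLE_CONFUSION` are hypothetical and chosen only for illustration; they are not taken from the study. The confidence intervals are omitted because the table does not state how they were calculated.

```python
# Class order assumed for this illustration only.
CLASSES = ["keratitis", "others", "normal"]

# Hypothetical counts, NOT data from the study.
# EXAMPLE_CONFUSION[i][j] = images whose true class is i and predicted class is j.
EXAMPLE_CONFUSION = [
    [90, 6, 4],   # true keratitis
    [5, 40, 5],   # true others
    [2, 3, 95],   # true normal
]


def one_vs_rest_metrics(confusion, positive):
    """Treat class index `positive` as the positive class and merge the rest."""
    n = len(confusion)
    # True positives: positive-class images predicted as positive.
    tp = confusion[positive][positive]
    # False negatives: positive-class images predicted as any other class.
    fn = sum(confusion[positive][j] for j in range(n) if j != positive)
    # False positives: other-class images predicted as positive.
    fp = sum(confusion[i][positive] for i in range(n) if i != positive)
    # True negatives: other-class images predicted as any non-positive class.
    tn = sum(
        confusion[i][j]
        for i in range(n)
        for j in range(n)
        if i != positive and j != positive
    )
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


if __name__ == "__main__":
    for index, name in enumerate(CLASSES):
        metrics = one_vs_rest_metrics(EXAMPLE_CONFUSION, index)
        print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Note that in the merged "rest" group, a confusion between the two non-positive classes (for example, an "others" image predicted as "normal" when keratitis is the positive class) still counts as a true negative, which is one reason one-vs.-rest accuracy can exceed overall three-class accuracy.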