Table 1. Summary of the reviewed literature, focusing on the dataset split strategy and the reported test classification performance.

From: Inflation of test accuracy due to data leakage in deep learning-based classification of OCT images

| Ref. | OCT dataset | Data split strategy | Model performance on testing set |
|---|---|---|---|
| 9 | Thyroid, parathyroid, fat and muscle samples | per-image | 97.12% accuracy |
| 35 | Pituitary adenoma | per-image | 0.96 AUC |
| 18 | Ophthalmology [15]* (version 2) | original split | 95.55% accuracy |
| 19 | Ophthalmology [15]* (version 2) | original split | 99.1% accuracy |
| 21 | Ophthalmology [15]* (version 3) | original split | 98.7% accuracy |
| 22 | Ophthalmology [15]* (version 3) | original split | 96.6% accuracy |
| 27 | Ophthalmology [15]* (train version 2, test version 3) | original split | 99.6% accuracy |
| 23 | (1) Ophthalmology [15]* (train version 2, test version 3); (2) Ophthalmology [16]* | (1) original split; (2) per-volume/subject | (1) 99.80% accuracy; (2) 100% accuracy |
| 36 | Coronary artery | per-volume/subject | 96.05% accuracy |
| 37 | Kidney | per-volume/subject | 82.6% accuracy |
| 38 | High and low grade brain tumors | per-volume/subject | 97% accuracy |
| 39 | Colon** | per-volume/subject | 88.95% accuracy on 2D images |
| 40 | Breast tissue | per-volume/subject | 91.7% specificity |
| 20 | Ophthalmology [15]* (version 2) | per-volume/subject | 98.46% accuracy |
| 10 | (1) Ophthalmology [15]* (version 3); (2) Ophthalmology [16]*; (3) Breast tissue [14]* | (1) original split; (2) per-volume/subject; (3) per-image | (1) ~96% accuracy; (2) >98.8% accuracy; (3) 98.8% accuracy |
| 24 | (1) Ophthalmology [16]*; (2) Ophthalmology [41]*; (3) Ophthalmology [42]*; (4) Ophthalmology [15]* (unclear version) | (1) per-volume/subject; (2) per-volume/subject; (3) per-volume/subject; (4) original split | (1) 96.66% accuracy; (2) 98.97% accuracy; (3) 99.74% accuracy; (4) 99.78% accuracy |
| 43 | Dentistry | No description given | 98% sensitivity, 100% specificity |
| 44 | Ophthalmology | No description given | 99.19% accuracy |

Open-access datasets and those available upon request are marked by * and **, respectively; unmarked datasets are not open-access. Datasets obtained from animal model samples are marked by . Differences in performance between studies using the same dataset result from the different methods implemented.
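
The per-image and per-volume/subject strategies listed in the table differ in whether images from the same volume or subject may end up in both the training and the testing set. The following is a minimal sketch of that difference, not the procedure used by any of the reviewed studies. The toy data (10 subjects with 20 images each), the 20% test fraction, and the function names `per_image_split`, `per_subject_split` and `shared_subjects` are all assumptions made for illustration.

```python
import random


def per_image_split(images, test_fraction=0.2, seed=0):
    """Shuffle individual images, ignoring which subject they come from."""
    rng = random.Random(seed)
    shuffled = list(images)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def per_subject_split(images, test_fraction=0.2, seed=0):
    """Assign whole subjects to either train or test, never to both."""
    rng = random.Random(seed)
    subjects = sorted({subject for subject, _ in images})
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [img for img in images if img[0] not in test_subjects]
    test = [img for img in images if img[0] in test_subjects]
    return train, test


def shared_subjects(train, test):
    """Return the subjects that contribute images to both sets."""
    return {s for s, _ in train} & {s for s, _ in test}


# Hypothetical toy data: (subject_id, image_index) pairs.
images = [(f"subject_{s:02d}", i) for s in range(10) for i in range(20)]

train_img, test_img = per_image_split(images)
train_sub, test_sub = per_subject_split(images)

print("per-image split, subjects in both sets:",
      len(shared_subjects(train_img, test_img)))
print("per-subject split, subjects in both sets:",
      len(shared_subjects(train_sub, test_sub)))
```

In this sketch the per-subject split has no subject in both sets by construction, whereas the per-image split will almost always place images of the same subject on both sides. Grouping by volume instead of subject would work the same way with a volume identifier in place of the subject identifier.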