Table 4. Comparison of state-of-the-art retinal disease recognition methods based on the ViT architecture. All percentage values presented are averages.

From: Residual self-attention vision transformer for detecting acquired vitelliform lesions and age-related macular drusen

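| Model | Number of classes | Dataset | Dataset size | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | Specificity (%) | Study |
|---|---|---|---|---|---|---|---|---|---|
| ViT | 4 | OCT46 | 35,432 | 95.14 | 95.24 | 97.72 | 96.42 | n/d | 11 |
| T2T-ViT | 4 | OCT46 | 35,432 | 96.01 | 96.15 | 98.34 | 97.23 | n/d | 11 |
| Mobile ViT | 4 | OCT46 | 35,432 | 99.17 | 99.59 | 99.59 | 99.58 | n/d | 11 |
| MedViT | 3 | NEH-v225 | 16,822 | 99.70 | n/d | n/d | n/d | 93.05 | 25 |
| MedViT | 4 | UCSD34 | 108,312 | 84.10 | n/d | n/d | n/d | 98.65 | 25 |
| ViT | 4 | OCT2017 | 84,484 | 97.50 | 97.90 | n/d | 97.30 | 96.02 | 47 |
| SViT | 4 | OCT2017 | 84,484 | 99.90 | 100 | n/d | 99.95 | 97.29 | 47 |
| MedViT | 4 | RetinaMNIST | 1,600 | 97.47 | n/d | n/d | n/d | n/d | 48 |
| HCTNet | 4 | OCT2017 | 84,484 | 91.56 | 99.60 | 98.02 | 98.60 | 96.55 | 49 |
| Conv-ViT | 4 | OCT2017 | 84,484 | 92.37 | 94.00 | 94.00 | 94.00 | 94.00 | 31 |
| ViT | 4 | OCT2017 | 84,484 | 99.06 | | | | | 50 |
| SwinT | 4 | OCT2017 | 84,484 | 98.01 | 99.07 | 99.07 | 99.07 | n/d | 51 |
| LLCT | 4 | OCT2017 | 84,484 | 98.70 | 97.83 | 97.65 | 99.23 | n/d | 51 |
| SwinPolyT | 4 | OCT2017 | 84,484 | 99.82 | 99.80 | 99.80 | 99.80 | n/d | 51 |
| FD-CNN | 3 | Duke | 3,231 | 97.19 | 96.37 | 97.34 | 96.75 | 98.73 | 19 |
| FD-CNN | 4 | UCSD | 84,484 | 99.40 | 99.40 | 99.41 | 99.40 | 99.41 | 19 |
| D-KNN | 3 | Duke | 3,231 | 96.88 | 95.90 | 97.10 | 96.38 | 98.62 | 19 |
| D-KNN | 4 | UCSD | 84,484 | 99.50 | 99.51 | 99.50 | 99.50 | 99.83 | 19 |
| D-SVM | 3 | Duke | 3,231 | 97.50 | 96.61 | 97.64 | 97.03 | 98.91 | 19 |
| D-SVM | 4 | UCSD | 84,484 | 99.60 | 99.60 | 99.60 | 99.60 | 99.87 | 19 |
| FT-CNN | 3 | Duke | 3,231 | 99.69 | 99.76 | 99.70 | 99.73 | 99.90 | 20 |
| FT-CNN | 4 | UCSD | 84,484 | 99.70 | 99.70 | 99.90 | 99.70 | 99.90 | 20 |
| R-FTCNN | 3 | Duke | 3,231 | 99.06 | 98.81 | 98.91 | 98.86 | 99.55 | 15 |
| R-FTCNN | 4 | UCSD | 85,484 | 99.60 | 99.60 | 99.60 | 99.60 | 99.87 | 15 |
| RS-A ViT | 3 | RS-A ViT | 1,280 | 96.92 | 96.06 | 96.02 | 96.02 | 98.32 | this study |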

  1. Significant values are in bold.
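
The caption states that all percentages are averages, but the table does not say which averaging scheme each study used. As an illustration only, the sketch below assumes macro-averaging (the unweighted mean of per-class, one-vs-rest values) computed from a multi-class confusion matrix. The function name `macro_metrics` and the example matrix are hypothetical and are not taken from any of the compared studies.

```python
def macro_metrics(cm):
    """Return averaged metrics (in %) from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Macro-averaging is an assumption made for this
    sketch; the compared studies may average differently.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precision, recall, f1, specificity = [], [], [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class k samples that were missed
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
        specificity.append(tn / (tn + fp) if tn + fp else 0.0)

    def mean_pct(values):
        return 100.0 * sum(values) / len(values)

    return {
        "accuracy": 100.0 * sum(cm[k][k] for k in range(n)) / total,
        "precision": mean_pct(precision),
        "recall": mean_pct(recall),
        "f1": mean_pct(f1),
        "specificity": mean_pct(specificity),
    }


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix, for illustration only.
    example = [[8, 1, 1],
               [2, 7, 1],
               [0, 1, 9]]
    for name, value in macro_metrics(example).items():
        print(f"{name}: {value:.2f}")
```

Under this scheme a study can report a specificity well above its recall, because true negatives dominate the one-vs-rest counts when there are several classes.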