Table 3 Comparison of the two models in terms of precision, recall, and F1 score values; best values in bold.

From: A hybrid ResNet50-vision transformer model with an attention mechanism for aerial image classification

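| Model | Class | Precision | Recall | F1-score |
|---|---|---|---|---|
| 1 | Building | 0.93 | 0.96 | 0.95 |
| 2 | Building | 0.95 | 0.96 | 0.95 |
| 1 | Car | 0.96 | 0.98 | 0.97 |
| 2 | Car | 0.98 | 0.96 | 0.97 |
| 1 | Debris | 0.96 | 0.94 | 0.95 |
| 2 | Debris | 0.97 | 0.92 | 0.94 |
| 1 | Ftpath | 0.95 | 0.93 | 0.94 |
| 2 | Ftpath | 0.94 | 0.95 | 0.94 |
| 1 | Mroad | 0.96 | 0.97 | 0.97 |
| 2 | Mroad | 0.93 | 0.98 | 0.95 |
| 1 | Ofield | 0.96 | 0.94 | 0.95 |
| 2 | Ofield | 0.97 | 0.92 | 0.94 |
| 1 | Roof | 0.96 | 0.98 | 0.97 |
| 2 | Roof | 0.98 | 0.98 | 0.98 |
| 1 | Shadow | 0.95 | 0.94 | 0.95 |
| 2 | Shadow | 0.95 | 0.92 | 0.94 |
| 1 | Tank | 0.98 | 0.98 | 0.98 |
| 2 | Tank | 0.95 | 0.98 | 0.97 |
| 1 | Tree | 0.96 | 0.96 | 0.96 |
| 2 | Tree | 0.94 | 0.99 | 0.96 |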

  1. *Model Nos. 1 and 2 denote the first and second hybrid models, respectively.
  2. Significant values are in bold
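
For readers who want to relate the three metric columns, the following is a minimal sketch of the standard F1 definition (the harmonic mean of precision and recall). The source does not state how its F1 values were computed, so the helper name `f1_score` and the zero-division handling are assumptions made for illustration. Because the table reports precision and recall rounded to two decimals, recomputing F1 from those rounded inputs can differ from the reported value in the last digit (for example, Model 1 on Building gives roughly 0.94 against a reported 0.95).

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Hypothetical helper for illustration; returns 0.0 when both inputs
    are zero to avoid dividing by zero (an assumed convention).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Model 1, class Car, using the rounded values reported in the table.
car_f1 = f1_score(0.96, 0.98)
print(round(car_f1, 2))  # prints 0.97, matching the table entry
```

Applying the same function to the other rows gives a quick consistency check on the table, keeping in mind the rounding caveat above.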