Table 6 Class-wise performance of best models in each method.

From: Development of approach to an automated acquisition of static street view images using transformer architecture for analysis of Building characteristics

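| Task | Method | Metric | Class: Usable | Class: Potential | Class: Non-usable | Average |
|---|---|---|---|---|---|---|
| Whole building façade | Swin Transformer | F1 score | 91.08 | 90.47 | 89.12 | 90.22 |
| Whole building façade | Swin Transformer | Accuracy | 92.85 | 91.92 | 90.57 | 91.78 |
| Whole building façade | ViT | F1 score | 89.82 | 89.03 | 87.93 | 88.93 |
| Whole building façade | ViT | Accuracy | 91.67 | 90.76 | 89.32 | 90.58 |
| Whole building façade | PVT | F1 score | 89.27 | 88.52 | 87.25 | 88.35 |
| Whole building façade | PVT | Accuracy | 91.02 | 90.22 | 88.73 | 89.99 |
| Whole building façade | MobileViT | F1 score | 88.48 | 87.73 | 86.07 | 87.43 |
| Whole building façade | MobileViT | Accuracy | 90.22 | 89.27 | 87.88 | 89.12 |
| Whole building façade | Axial Transformer | F1 score | 87.98 | 87.22 | 85.68 | 86.96 |
| Whole building façade | Axial Transformer | Accuracy | 89.78 | 88.88 | 87.42 | 88.69 |
| First story | Swin Transformer | F1 score | 90.67 | 90.02 | 88.53 | 89.74 |
| First story | Swin Transformer | Accuracy | 93.87 | 92.48 | 90.52 | 92.29 |
| First story | ViT | F1 score | 89.27 | 88.53 | 87.08 | 88.29 |
| First story | ViT | Accuracy | 92.53 | 91.12 | 89.12 | 90.92 |
| First story | PVT | F1 score | 88.72 | 87.88 | 86.47 | 87.69 |
| First story | PVT | Accuracy | 92.03 | 90.72 | 88.92 | 90.56 |
| First story | MobileViT | F1 score | 87.83 | 86.97 | 85.43 | 86.74 |
| First story | MobileViT | Accuracy | 91.23 | 89.82 | 87.87 | 89.64 |
| First story | Axial Transformer | F1 score | 87.18 | 86.32 | 84.87 | 86.12 |
| First story | Axial Transformer | Accuracy | 90.83 | 89.57 | 87.37 | 89.26 |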
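
The table does not say how the Average column is computed. The reported values are consistent with an unweighted mean of the three class values (Usable, Potential, Non-usable), rounded to two decimals. The sketch below checks that reading on the Swin Transformer rows. The function name `macro_average` and the `rows` structure are illustrative assumptions, not taken from the source.

```python
# Per-class values (Usable, Potential, Non-usable) and the reported Average,
# copied from the Swin Transformer rows of Table 6.
rows = {
    ("Whole building façade", "F1 score"): ((91.08, 90.47, 89.12), 90.22),
    ("Whole building façade", "Accuracy"): ((92.85, 91.92, 90.57), 91.78),
    ("First story", "F1 score"): ((90.67, 90.02, 88.53), 89.74),
    ("First story", "Accuracy"): ((93.87, 92.48, 90.52), 92.29),
}


def macro_average(class_values):
    """Unweighted mean of the per-class values (assumed averaging scheme)."""
    return sum(class_values) / len(class_values)


def matches_reported(class_values, reported, tol=0.006):
    """True if the unweighted mean agrees with the reported value after rounding."""
    return abs(macro_average(class_values) - reported) < tol


for (task, metric), (class_values, reported) in rows.items():
    mean = macro_average(class_values)
    status = "ok" if matches_reported(class_values, reported) else "mismatch"
    print(f"{task} / {metric}: computed {mean:.2f}, reported {reported:.2f} ({status})")
```

The same check can be run on the remaining rows by adding them to `rows`. A weighted average by class frequency would need per-class sample counts, which the table does not give.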