Table 1 CSW-S network structure details.

From: An improved transformer-based concrete crack classification method

| Structure | Input | Convolution kernel | Channels | Stride |
| --- | --- | --- | --- | --- |
| Convolutional token embedding | 24 × 24 × 3 | 7 × 7 | 64 | 54 |
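The relationship between a convolutional token embedding and the number of tokens it produces can be sketched with the standard convolution output-size formula. This is a minimal sketch, not the authors' implementation: the input side (224), stride (4) and padding (2) are assumptions chosen because they reproduce the 56 × 56 = 3136 tokens listed for Stage1, and they are not the values printed in the row above. Only the 7 × 7 kernel is taken from the table, and `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution along one axis (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed values (not from the table): 224-pixel input, stride 4, padding 2.
# The 7 x 7 kernel is the one listed in the table.
side = conv_output_size(size=224, kernel=7, stride=4, padding=2)
tokens = side * side  # the feature map is flattened into one token per position

print(side, tokens)
```

Under these assumptions the feature map is 56 positions on a side, which matches the Stage1 resolution and the 3136-token Stage1 input.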

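| Structure | Input | Depth | Number of heads | Dim | Reso | Split-size |
| --- | --- | --- | --- | --- | --- | --- |
| Stage1 | 3136 × 64 | 1 | 2 | 64 | 56 | 1 |
| Stage2 | 3136 × 64 | 2 | 4 | 128 | 28 | 2 |
| Stage3 | 3136 × 64 | 2 | 8 | 256 | 14 | 7 |
| Stage4 | 3136 × 64 | 1 | 16 | 512 | 7 | 7 |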

  1. Dim is the dimension of the fully connected layer that each token is projected into, and the number of heads is the number of heads in multi-head attention; each head captures a different set of correlations. Reso (resolution) is the side length of the feature map before it is flattened into a sequence of vectors, and depth is the number of times the block is repeated in that stage.
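The definitions in the note can be checked with a little arithmetic. The sketch below copies the depth, number of heads, dim and reso values from the table and derives two quantities per stage: the number of tokens obtained by flattening a reso × reso feature map, and the width handled by each attention head (dim divided by the number of heads). The names `STAGES` and `stage_summary` are hypothetical, and the derived values come from the reso and dim columns only, not from the Input column.

```python
# Hyperparameters copied from the table (depth, number of heads, dim, reso).
STAGES = {
    "Stage1": {"depth": 1, "heads": 2, "dim": 64, "reso": 56},
    "Stage2": {"depth": 2, "heads": 4, "dim": 128, "reso": 28},
    "Stage3": {"depth": 2, "heads": 8, "dim": 256, "reso": 14},
    "Stage4": {"depth": 1, "heads": 16, "dim": 512, "reso": 7},
}


def stage_summary(cfg: dict) -> dict:
    """Derive the token count and per-head width for one stage."""
    # Flattening a reso x reso feature map gives one token per position.
    tokens = cfg["reso"] ** 2
    # Multi-head attention splits the dim channels evenly across the heads.
    head_dim, remainder = divmod(cfg["dim"], cfg["heads"])
    if remainder:
        raise ValueError("dim must be divisible by the number of heads")
    return {"tokens": tokens, "head_dim": head_dim}


for name, cfg in STAGES.items():
    print(name, stage_summary(cfg))
```

For Stage1 this gives 56 × 56 = 3136 tokens of width 64, which agrees with the Stage1 input. Every stage works out to 32 channels per head, because dim and the number of heads double together from one stage to the next.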