Table 1 Detailed architecture of each module of segtdformer.

From: Dynamic atrous attention and dual branch context fusion for cross scale Building segmentation in high resolution remote sensing imagery

| Framework | Module | Name | Number |
| --- | --- | --- | --- |
| Backbone | MiT-B0 | patch_size | 4 |
| | | embed_dims | [32, 64, 160, 256] |
| | | num_heads | [1, 2, 5, 8] |
| | | mlp_ratios | [4, 4, 4, 4] |
| | | sr_ratios | [8, 4, 2, 1] |
| Neck | DAA | dim_in | [32, 64, 160, 256] |
| | | dim_out | [32, 64, 160, 256] |
| | TA | in_channels | [32, 64, 160, 256] |
| | | out_channels | 256 |
| | | num_outs | 4 |
| Decoder_Head | Segformer_head | in_channels | [32, 64, 160, 256] |
| | | in_index | [0, 1, 2, 3] |
| | | feature_strides | [4, 8, 16, 32] |

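  1. ‘mlp_ratios’ is the ratio of the MLP hidden-layer width to the embedding dimension of each stage; ‘sr_ratios’ is the spatial reduction ratio applied to the feature map before the K and V projections, which controls the length of the K and V sequences in each stage.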
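
To make the backbone settings concrete, the following minimal sketch derives the per-stage tensor sizes implied by the table. It assumes a square 512 × 512 input (an illustrative choice, not stated in the table) and the usual Mix Transformer convention that a spatial reduction ratio r shrinks each side of the K/V feature map by r. The helper `stage_shapes` and the `StageShape` record are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass

# Values copied from Table 1 (MiT-B0 backbone and decoder head rows).
EMBED_DIMS = [32, 64, 160, 256]
NUM_HEADS = [1, 2, 5, 8]
MLP_RATIOS = [4, 4, 4, 4]
SR_RATIOS = [8, 4, 2, 1]
FEATURE_STRIDES = [4, 8, 16, 32]


@dataclass
class StageShape:
    feature_size: int    # side length of the stage's feature map
    query_tokens: int    # number of query tokens (feature_size ** 2)
    kv_tokens: int       # number of K/V tokens after spatial reduction
    mlp_hidden_dim: int  # hidden width of the MLP in this stage
    head_dim: int        # channels per attention head


def stage_shapes(input_size=512):
    """Derive per-stage sizes for a square input (assumed, not from the table)."""
    shapes = []
    for dim, heads, mlp_ratio, sr_ratio, stride in zip(
        EMBED_DIMS, NUM_HEADS, MLP_RATIOS, SR_RATIOS, FEATURE_STRIDES
    ):
        side = input_size // stride   # feature map side at this stage
        kv_side = side // sr_ratio    # side after spatial reduction of K and V
        shapes.append(
            StageShape(
                feature_size=side,
                query_tokens=side * side,
                kv_tokens=kv_side * kv_side,
                mlp_hidden_dim=dim * mlp_ratio,
                head_dim=dim // heads,
            )
        )
    return shapes


if __name__ == "__main__":
    for index, shape in enumerate(stage_shapes(512)):
        print(f"stage {index}: {shape}")
```

Under these assumptions, the decreasing sr_ratios offset the shrinking feature maps: every stage attends over the same number of K/V tokens (256 for a 512 × 512 input), while the MLP hidden widths grow to 128, 256, 640 and 1024 and each attention head keeps 32 channels.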