Table 8. Swin Transformer hierarchical architecture.

From: Attention-Enhanced CNNs and transformers for accurate monkeypox and skin disease detection

| Stage | Transformer Blocks | Feature Dimension | Patch Resolution/Down sampling |
|---|---|---|---|
| Patch | Patch Embedding (Conv) | 96 | Initial patch size (4 × 4 patches) |
| Stage 1 | Swin Transformer Blocks (W-MSA & SW-MSA) | 96 | No down sampling |
| Stage 2 | Swin Transformer Blocks | 192 | Down sampled by 2 |
| Stage 3 | Swin Transformer Blocks | 384 | Down sampled by 2 |
| Stage 4 | Swin Transformer Blocks | 768 | Down sampled by 2 |
| Head | Global Average Pooling/MLP | - | - |
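
The stage-by-stage progression in the table can be traced with a short calculation. The sketch below is illustrative only: it assumes a 224 × 224 input image, which the table does not state, and the function name `swin_stage_shapes` and its parameters are hypothetical. It applies the 4 × 4 patch embedding with 96 channels, then halves the token grid and doubles the feature dimension at the start of Stages 2 to 4, as the table lists.

```python
def swin_stage_shapes(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Return (name, tokens per side, feature dimension) for each stage.

    image_size=224 is an assumption; the table gives only the patch size
    (4 x 4), the feature dimensions, and the per-stage downsampling factor.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")

    side = image_size // patch_size  # token grid after patch embedding
    dim = embed_dim
    shapes = [("Patch embedding", side, dim)]

    for stage in range(1, num_stages + 1):
        if stage > 1:
            # Stages 2-4: downsample the grid by 2 and double the channels.
            if side % 2 != 0:
                raise ValueError("token grid side must be even to downsample by 2")
            side //= 2
            dim *= 2
        shapes.append((f"Stage {stage}", side, dim))
    return shapes


if __name__ == "__main__":
    for name, side, dim in swin_stage_shapes():
        print(f"{name}: {side} x {side} tokens, {dim} channels")
```

Under the assumed 224 × 224 input, the token grid shrinks from 56 × 56 in Stage 1 to 7 × 7 in Stage 4 while the feature dimension grows from 96 to 768, matching the Feature Dimension column. The head then applies global average pooling over the final grid before the MLP classifier.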