Scientific Reports

Table 2 Detailed architecture of the vision transformer used. Trainable parameters: 85,801,732.

From: Residual self-attention vision transformer for detecting acquired vitelliform lesions and age-related macular drusen

Layer name	Number of layers	Input shape	Output shape	Number of params
PatchEmbed	1	[1, 3, 224, 224]	[1, 196, 768]	590 592
Dropout	1	[1, 197, 768]	[1, 197, 768]	–
Identity	2	[1, 197, 768]	[1, 197, 768]	–
Encoder Block	12	[1, 197, 768]	[1, 197, 768]	85 209 604
LayerNorm	1	[1, 197, 768]	[1, 197, 768]	1 536
Identity	1	[1, 768]	[1, 768]	–
Dropout	1	[1, 768]	[1, 768]	–
Linear	1	[1, 768]	[1, 4]	3 076
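
The parameter counts above can be cross-checked with a short calculation. The sketch below assumes a standard ViT-Base/16 layout (16 × 16 patches, hidden size 768, MLP size 3072, biased QKV and projection layers); the table's shapes are consistent with this, but the patch size and MLP size are assumptions rather than values stated in the table, and all variable names are illustrative.

```python
# Parameter count for an ASSUMED standard ViT-Base/16 layout with a 4-class head.
# Only the shapes (224x224x3 input, 197 tokens, width 768, 12 blocks, 4 outputs)
# come from the table; PATCH and MLP_DIM are assumptions.
IMG, PATCH, CH, DIM, MLP_DIM, DEPTH, CLASSES = 224, 16, 3, 768, 3072, 12, 4

num_patches = (IMG // PATCH) ** 2                 # 14 * 14 = 196 patch tokens
patch_embed = CH * PATCH * PATCH * DIM + DIM      # convolution weights + bias
cls_token = DIM                                   # one learnable class token
pos_embed = (num_patches + 1) * DIM               # 197 position embeddings
layer_norm = 2 * DIM                              # scale + shift

# One encoder block: two LayerNorms, self-attention, and a two-layer MLP.
attention = (DIM * 3 * DIM + 3 * DIM) + (DIM * DIM + DIM)   # QKV + output projection
mlp = (DIM * MLP_DIM + MLP_DIM) + (MLP_DIM * DIM + DIM)
block = 2 * layer_norm + attention + mlp
encoder = DEPTH * block

head = DIM * CLASSES + CLASSES                    # final Linear layer
total = patch_embed + cls_token + pos_embed + encoder + layer_norm + head

print(f"PatchEmbed: {patch_embed:,}")   # 590,592
print(f"LayerNorm:  {layer_norm:,}")    # 1,536
print(f"Linear:     {head:,}")          # 3,076
print(f"Total:      {total:,}")         # 85,801,732
```

Under these assumptions the PatchEmbed, LayerNorm, and Linear rows and the caption's total are reproduced exactly. The 12 blocks alone come to 85,054,464 in this sketch, which is 155,140 less than the Encoder Block row; that difference equals the class token, position embeddings, and head combined (768 + 151,296 + 3,076), so the source may group parameters differently in that row.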
