Table 3 Hyperparameter analysis of proposed model.

From: Global attention and local features using deep perceptron ensemble with vision Transformers for landscape design detection

| Parameter | Value |
| --- | --- |
| Batch size | 32 |
| Number of layers | 32 |
| Hidden layer units | 1024 (L) |
| Optimizer | AdamW |
| Loss function | Cross-entropy |
| Epsilon | 1 × 10⁻⁶ |
| Dropout rate | 0.1 |
| Epochs | 100 (fine-tuning) |
| Learning rate | 0.0001 |
| Beta 1 & Beta 2 | 0.9 & 0.999 |
| Weight decay | 0.05–0.2 |
| Gradient clipping | 1.0 |
| Dropout in FC layers | 0.1 |
| L2 regularization | 0.0001 |
| Early stopping | Patience = 5 epochs |
| Brightness range | 0.2–0.4 |
| Number of parameters | 300 M (L) |
| Hidden layer activation | GELU |
| Final layer activation | Softmax |
| Gradient descent type | Mini-batch adaptive gradient descent |
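To make the optimizer settings concrete, the following is a minimal pure-Python sketch of a single AdamW update using the values from Table 3 (learning rate 1 × 10⁻⁴, β₁ = 0.9, β₂ = 0.999, ε = 1 × 10⁻⁶, gradient clipping at 1.0, and weight decay taken at the lower end of its 0.05–0.2 range). The function name, the toy parameter values, and the per-step interface are illustrative assumptions, not the authors' implementation; in practice these hyperparameters would be passed to a framework optimizer such as `torch.optim.AdamW`.

```python
import math

# Hyperparameters from Table 3 (weight decay shown at its lower bound, 0.05,
# of the paper's 0.05-0.2 range)
LR, BETA1, BETA2, EPS = 1e-4, 0.9, 0.999, 1e-6
WEIGHT_DECAY, CLIP_NORM = 0.05, 1.0

def adamw_step(params, grads, m, v, t):
    """One mini-batch AdamW update with global-norm gradient clipping.

    params, grads, m, v are parallel lists of floats; t is the 1-based
    step count used for bias correction.
    """
    # Clip the gradient vector to a global L2 norm of CLIP_NORM (Table 3: 1.0)
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > CLIP_NORM:
        grads = [g * CLIP_NORM / norm for g in grads]

    new_params, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = BETA1 * mi + (1 - BETA1) * g       # first-moment estimate
        vi = BETA2 * vi + (1 - BETA2) * g * g   # second-moment estimate
        m_hat = mi / (1 - BETA1 ** t)           # bias correction
        v_hat = vi / (1 - BETA2 ** t)
        # Decoupled weight decay -- the "W" in AdamW: decay is applied
        # directly to the parameter, not folded into the gradient
        p = p - LR * (m_hat / (math.sqrt(v_hat) + EPS) + WEIGHT_DECAY * p)
        new_params.append(p)
        new_m.append(mi)
        new_v.append(vi)
    return new_params, new_m, new_v

# Toy usage: two parameters, one optimization step
params = [0.5, -0.3]
m = [0.0, 0.0]
v = [0.0, 0.0]
params, m, v = adamw_step(params, [2.0, -1.0], m, v, t=1)
```

Because the raw gradient norm here (√5 ≈ 2.24) exceeds the clipping threshold of 1.0, the gradients are rescaled before the moment updates, so each parameter moves by at most roughly the learning rate per step.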