Table 5 Hyperparameter details.
From: Advanced air quality prediction using multimodal data and dynamic modeling techniques
Hyperparameter | Description | Values |
|---|---|---|
Learning rate | Controls weight update size during training | 0.001 |
Batch size | Number of samples processed before weight update | 32 |
Epochs | Number of complete passes through the dataset | 50, 100, 150 |
Optimizer | Algorithm for updating model weights | Adam, SGD, RMSprop |
Dropout Rate | Fraction of neurons randomly dropped during training | 0.2 |
Number of Layers | Number of layers in the model | 3 |
Neurons per Layer | Number of units in each layer | 128 |
Activation Function | Non-linear function applied to layer outputs | ReLU, Tanh, Sigmoid |
Learning Rate Decay | Reduction of learning rate during training | 0.9 |
Momentum | Used to accelerate convergence in SGD | 0.8 |
Attention Heads | Number of attention heads in attention mechanisms | 2 |
Hidden State Size (BiLSTM) | Number of units in the BiLSTM hidden layer | 128 |
Kernel Size (CNN) | Size of convolution filters | (3 × 3) |
Stride (CNN) | Step size for convolution operations | 1, 2, 3 |
Loss Function | Function to compute prediction error | MSE, MAE |
Scheduler | Adjusts the learning rate during training | Exponential, Step decay |
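
The table mixes fixed settings (one value) with rows that list several candidate values. As a rough illustration, the sketch below stores the fixed values in one dictionary and enumerates every combination of the multi-valued rows. It is not the authors' code: the names `FIXED`, `SEARCH_SPACE`, `iter_configs`, and `exponential_lr` are hypothetical, and the source does not say whether a full grid search was actually run. Two further assumptions are that the Scheduler row lists two separate options (exponential and step decay), and that the decay value of 0.9 is applied multiplicatively once per epoch.

```python
from itertools import product

# Fixed values, copied from Table 5.
FIXED = {
    "learning_rate": 0.001,
    "batch_size": 32,
    "dropout_rate": 0.2,
    "num_layers": 3,
    "neurons_per_layer": 128,
    "learning_rate_decay": 0.9,
    "momentum": 0.8,  # only relevant when the optimizer is SGD
    "attention_heads": 2,
    "bilstm_hidden_size": 128,
    "cnn_kernel_size": (3, 3),
}

# Rows of Table 5 that list several candidate values.
SEARCH_SPACE = {
    "epochs": [50, 100, 150],
    "optimizer": ["Adam", "SGD", "RMSprop"],
    "activation": ["ReLU", "Tanh", "Sigmoid"],
    "cnn_stride": [1, 2, 3],
    "loss": ["MSE", "MAE"],
    "scheduler": ["exponential", "step"],  # assumption: two separate options
}


def iter_configs():
    """Yield one full configuration dict per combination of searched values."""
    keys = list(SEARCH_SPACE)
    for combo in product(*(SEARCH_SPACE[k] for k in keys)):
        yield {**FIXED, **dict(zip(keys, combo))}


def exponential_lr(epoch, base_lr=FIXED["learning_rate"],
                   gamma=FIXED["learning_rate_decay"]):
    """Assumed schedule: multiply the learning rate by gamma once per epoch."""
    return base_lr * gamma ** epoch


configs = list(iter_configs())
print(len(configs))        # number of combinations in the assumed grid
print(exponential_lr(10))  # learning rate after 10 epochs under the assumption
```

Keeping the fixed and searched values apart makes it easy to see how quickly the grid grows when another candidate value is added to any row.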