Table 3. Hyperparameter configurations used for training.
| Parameter | Value |
| --- | --- |
| Training epochs | 50 |
| Batch size | 128 |
| Optimizer | Adam |
| Learning rate | 0.001 (adaptive) |
| Loss function | Cross-entropy |
| Dropout rate | 0.2 |
| Hidden layers | 3 dilated RNN layers |
| Hidden units per layer | 128 (tuned by MSGO) |
| Attention mechanism | Softmax over the temporal axis |
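As a minimal sketch, the settings in Table 3 can be collected into a single configuration, and the stated attention mechanism (a softmax over the temporal axis) can be written out directly. The dictionary keys and the function name below are illustrative, not taken from the original implementation; only the values come from the table.

```python
import numpy as np

# Hyperparameters from Table 3 (key names are illustrative)
CONFIG = {
    "epochs": 50,
    "batch_size": 128,
    "optimizer": "Adam",
    "learning_rate": 1e-3,   # Adam adapts per-parameter step sizes
    "loss": "cross_entropy",
    "dropout": 0.2,
    "rnn_layers": 3,         # stacked dilated RNN layers
    "hidden_units": 128,     # per layer, tuned by MSGO
}

def temporal_softmax_attention(h, scores):
    """Softmax over the temporal axis: normalize one relevance score
    per timestep, then pool the hidden states into a context vector.

    h:      (T, d) hidden states from the recurrent stack
    scores: (T,)   unnormalized relevance scores, one per timestep
    returns (d,)   attention-weighted context vector
    """
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w = w / w.sum()                    # weights sum to 1 over time
    return w @ h                       # weighted sum over timesteps
```

With uniform scores the weights are equal, so the context vector reduces to the mean of the hidden states over time, which is a quick sanity check for the normalization.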