Table 3 Hyperparameter optimization results of the transformer–BiLSTM model.

From: Intelligent prediction of air quality index based on the transformer-BiLSTM model

| Parameter | Value | RMSE | MAE | MAPE (%) | \(\mathbf{R^2}\) |
|---|---|---|---|---|---|
| Learning rate | 0.1 | 40.5567 | 35.4188 | 8.4397 | −4.7100 |
| | 0.01 | 34.9032 | 29.6495 | 7.1623 | −3.2300 |
| | 0.001 | 5.3475 | 3.6833 | 7.7227 | 0.9007 |
| | 0.0001 | 84.6039 | 81.3622 | 18.5172 | −23.8570 |
| Batch size | 8 | 3.4152 | 2.5205 | 5.0662 | 0.9595 |
| | 16 | 3.1122 | 1.8026 | 3.3865 | 0.9664 |
| | 32 | 5.3475 | 3.6833 | 7.7227 | 0.9007 |
| | 64 | 8.5698 | 6.2148 | 10.6895 | 0.7450 |
| Dropout rate | 0.1 | 3.7674 | 2.4779 | 4.6489 | 0.9507 |
| | 0.2 | 3.6751 | 2.3684 | 4.4654 | 0.9531 |
| | 0.3 | 4.4978 | 3.0577 | 5.7145 | 0.9298 |
| | 0.5 | 6.0808 | 3.8354 | 6.8147 | 0.8716 |
| Input window size | 1 | 7.8250 | 5.1986 | 9.3191 | 0.7874 |
| | 2 | 3.1456 | 1.8657 | 3.3789 | 0.9656 |
| | 3 | 3.6751 | 2.3684 | 4.4654 | 0.9531 |
| | 5 | 5.3369 | 3.8175 | 7.2976 | 0.9011 |
| Training epochs | 50 | 6.0394 | 4.0313 | 8.9337 | 0.8733 |
| | 100 | 3.0012 | 1.7928 | 3.3646 | 0.9687 |
| | 150 | 5.5775 | 4.0507 | 8.4381 | 0.8920 |
| | 200 | 6.5489 | 5.1587 | 9.6348 | 0.8511 |
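
The negative \(R^2\) values in some learning-rate rows are consistent with the usual definition \(R^2 = 1 - SS_{res}/SS_{tot}\), which drops below zero whenever predictions fit worse than simply predicting the mean of the observations. The following is a minimal sketch of how the four reported metrics might be computed, assuming the standard textbook definitions (the table itself does not spell them out). The function name `regression_metrics` and the sample AQI values are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (%) and R^2 under the standard definitions.

    Assumption: these match the definitions behind the table columns.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot  # negative when ss_res exceeds ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical AQI observations and two sets of predictions (illustrative only).
y_true = [50.0, 60.0, 70.0, 80.0]
good = regression_metrics(y_true, [52.0, 58.0, 71.0, 79.0])
bad = regression_metrics(y_true, [90.0, 100.0, 110.0, 120.0])

print(good)
print(bad)
```

In this toy example the second set of predictions is offset from every observation by a constant 40, so its residual sum of squares exceeds the total sum of squares and \(R^2\) comes out negative, while the first set stays close to 1.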