Table 6 Comparative forecasting performance of different models on the MyPJ dataset under the optimal mini-batch size configuration.

From: Mini-batch size sensitivity in deep residual networks for short-term load forecasting: an empirical study

| Model | MAPE | MAE | MSE | RMSE | NMSE | R | R² |
|---|---|---|---|---|---|---|---|
| CNN | 0.053767 | 0.026603 | 0.002291 | 0.047865 | 0.080470 | 0.959310 | 0.919530 |
| TCN | 0.058103 | 0.030002 | 0.002284 | 0.047789 | 0.080217 | 0.959847 | 0.919783 |
| LSTM | 0.054929 | 0.029391 | 0.002264 | 0.047583 | 0.079524 | 0.961677 | 0.920476 |
| Transformer | 0.059608 | 0.028453 | 0.002584 | 0.050832 | 0.090756 | 0.956054 | 0.909244 |
| Transformer-LSTM | 0.060019 | 0.032521 | 0.002564 | 0.050632 | 0.090045 | 0.959019 | 0.909955 |
| CNN-LSTM-MMA | 0.054391 | 0.028353 | 0.002105 | 0.045883 | 0.073944 | 0.963565 | 0.926056 |
| DRN | 0.051082 | 0.025190 | 0.002212 | 0.047033 | 0.077698 | 0.960669 | 0.922302 |
| PCA-DRN | 0.048943 | 0.024916 | 0.001830 | 0.042777 | 0.064271 | 0.968054 | 0.935729 |
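
The reported columns appear to be linked by two common identities: RMSE is the square root of MSE, and the last column (the coefficient of determination) equals 1 minus NMSE. The table does not state these definitions, so the relations are an assumption inferred from the numbers. The minimal Python sketch below checks both against the tabulated values; the names `ROWS` and `check_row` and the tolerance of 1e-6 are hypothetical choices made for illustration.

```python
# Values copied from Table 6, per model: (MSE, RMSE, NMSE, R2).
ROWS = {
    "CNN":              (0.002291, 0.047865, 0.080470, 0.919530),
    "TCN":              (0.002284, 0.047789, 0.080217, 0.919783),
    "LSTM":             (0.002264, 0.047583, 0.079524, 0.920476),
    "Transformer":      (0.002584, 0.050832, 0.090756, 0.909244),
    "Transformer-LSTM": (0.002564, 0.050632, 0.090045, 0.909955),
    "CNN-LSTM-MMA":     (0.002105, 0.045883, 0.073944, 0.926056),
    "DRN":              (0.002212, 0.047033, 0.077698, 0.922302),
    "PCA-DRN":          (0.001830, 0.042777, 0.064271, 0.935729),
}


def check_row(mse, rmse, nmse, r2, tol=1e-6):
    """Return (rmse_ok, r2_ok) for one row.

    Assumed relations (not stated in the table itself):
      RMSE**2 == MSE and R2 == 1 - NMSE, up to rounding at 6 decimals.
    """
    rmse_ok = abs(rmse ** 2 - mse) < tol
    r2_ok = abs((1.0 - nmse) - r2) < tol
    return rmse_ok, r2_ok


results = {name: check_row(*values) for name, values in ROWS.items()}

# Model with the lowest MSE among the tabulated rows.
best_by_mse = min(ROWS, key=lambda name: ROWS[name][0])

for name, (rmse_ok, r2_ok) in results.items():
    print(f"{name:18s} RMSE^2 ~ MSE: {rmse_ok}   R2 ~ 1 - NMSE: {r2_ok}")
print("lowest MSE:", best_by_mse)
```

A check like this is useful mainly for catching transcription errors when the table is copied into another document or dataset.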