Table 2 Comparison between benchmark algorithms and PIDAO-family algorithms on various deep learning cases

From: Accelerated optimization in deep learning with a proportional-integral-derivative controller

**MNIST**

| Method | Train Loss | Test Loss | Train Acc | Test Acc |
|---|---|---|---|---|
| PIDAO (SI) | **0.00186** | 0.04841 | **100.00%** | 98.54% |
| PIDAO (ST) | **0.00186** | **0.04831** | **100.00%** | **98.59%** |
| PIDAO (AdSI) | 0.00439 | 0.05794 | 99.92% | 98.42% |
| Momentum | 0.00201 | 0.04931 | **100.00%** | 98.47% |
| Adam | 0.00781 | 0.06894 | 99.78% | 98.20% |
| PIDopt | 0.00508 | 0.05739 | 99.90% | 98.29% |
| AdaHB | 0.00813 | 0.07174 | 99.77% | 98.09% |

**Fashion MNIST**

| Method | Train Loss | Test Loss | Train Acc | Test Acc |
|---|---|---|---|---|
| PIDAO (SI) | 0.15095 | 0.20203 | 94.37% | 92.89% |
| PIDAO (ST) | 0.15286 | 0.20362 | 94.25% | 92.91% |
| PIDAO (AdSI) | **0.14573** | **0.20131** | **94.58%** | **92.96%** |
| Momentum | 0.15704 | 0.20676 | 94.19% | 92.61% |
| Adam | 0.15615 | 0.20618 | 94.20% | 92.61% |
| PIDopt | 0.16225 | 0.20776 | 93.99% | 92.51% |
| AdaHB | 0.16418 | 0.20324 | 93.93% | 92.70% |

**CIFAR-10**

| Method | Train Loss | Test Loss | Train Acc | Test Acc |
|---|---|---|---|---|
| PIDAO (SI) | 0.00205 | **0.33433** | **99.97%** | **93.1825%** |
| PIDAO (ST) | **0.00174** | 0.33653 | **99.97%** | 93.06% |
| PIDAO (AdSI) | 0.04734 | 0.37667 | 98.50% | 91.51% |
| Momentum | 0.01220 | 0.39213 | 99.58% | 92.25% |
| Adam | 0.08802 | 0.34870 | 96.94% | 91.00% |
| PIDopt | 0.01299 | 0.40980 | 99.57% | 91.79% |
| AdaHB | 0.10775 | 0.37442 | 96.23% | 90.10% |

**PINNs for 1D Burgers’ equation**

| Method | Train Loss | Test Error of u(x, t) |
|---|---|---|
| PIDAO (AdSI) | 5.8440 × 10⁻⁴ | **2.87%** |
| RMSprop | 9.2526 × 10⁻³ | 12.20% |
| Adam | 5.5540 × 10⁻⁴ | 6.58% |
| AdamW | **3.4320 × 10⁻⁴** | 6.23% |
| AdaHB | 3.8605 × 10⁻³ | 9.43% |

**PINNs for 2D cavity flow**

| Method | Train Loss | Test Error of u(x, y) | Test Error of v(x, y) |
|---|---|---|---|
| PIDAO (AdSI) | **6.3204 × 10⁻⁵** | **1.30%** | **1.76%** |
| RMSprop | 8.9772 × 10⁻³ | 34.17% | 68.80% |
| Adam | 1.8461 × 10⁻⁴ | 3.07% | 4.22% |
| AdamW | 1.4487 × 10⁻⁴ | 2.51% | 3.89% |
| AdaHB | 3.1194 × 10⁻³ | 21.07% | 43.73% |

**FNO for 1D Burgers’ equation**

| Method | Train Loss | Test L2 Error of u(x, t) |
|---|---|---|
| PIDAO (AdSI) | **3.4827 × 10⁻⁴** | **5.3612 × 10⁻⁴** |
| RMSprop | 1.4479 × 10⁻³ | 2.0741 × 10⁻³ |
| Adam | 4.6665 × 10⁻⁴ | 7.1373 × 10⁻⁴ |
| AdamW | 7.0761 × 10⁻⁴ | 1.2057 × 10⁻³ |
| AdaHB | 9.5934 × 10⁻⁴ | 1.7213 × 10⁻³ |

**FNO for 2D Darcy equation**

| Method | Train Loss | Test L2 Error of u(x₁, x₂) |
|---|---|---|
| PIDAO (AdSI) | 1.1918 × 10⁻² | **5.4192 × 10⁻³** |
| RMSprop | 1.7722 × 10⁻² | 1.2003 × 10⁻² |
| Adam | 1.0586 × 10⁻² | 7.7934 × 10⁻³ |
| AdamW | **5.8096 × 10⁻³** | 8.2432 × 10⁻³ |
| AdaHB | 1.9964 × 10⁻² | 1.6065 × 10⁻² |

  1. The best results are marked in bold. The hyperparameter settings of these optimizers can be found in SI Section H.
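To make the kind of comparison tabulated above concrete, the sketch below shows how a proportional-integral-derivative (PID) style update can be wrapped as a PyTorch optimizer and swapped into a standard training loop in place of Adam or SGD with momentum. This is only a minimal, generic illustration of the PID idea (proportional = current gradient, integral = accumulated gradients, derivative = gradient difference); it is not the paper's PIDAO (SI), (ST), or (AdSI) discretizations, and the class name `GenericPIDOptimizer`, the gains `kp`/`ki`/`kd`, and their default values are assumptions made here for illustration. The actual hyperparameter settings used for the results in the table are given in SI Section H.

```python
# A minimal, hypothetical sketch of a PID-controller-style optimizer in PyTorch.
# NOTE: this is NOT the paper's PIDAO (SI/ST/AdSI) discretizations; it only
# illustrates the general idea of combining proportional (current gradient),
# integral (accumulated gradient), and derivative (gradient difference) terms.
# The gain names kp, ki, kd and all default values are illustrative assumptions.
import torch
from torch.optim import Optimizer


class GenericPIDOptimizer(Optimizer):
    """Generic PID-style update:
    theta <- theta - lr * (kp * g + ki * sum(g) + kd * (g - g_prev))."""

    def __init__(self, params, lr=1e-2, kp=1.0, ki=0.1, kd=0.5):
        defaults = dict(lr=lr, kp=kp, ki=ki, kd=kd)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            lr, kp, ki, kd = group["lr"], group["kp"], group["ki"], group["kd"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if len(state) == 0:
                    state["integral"] = torch.zeros_like(p)   # running sum of gradients (I term)
                    state["prev_grad"] = torch.zeros_like(p)  # previous gradient (for D term)
                state["integral"].add_(g)                     # accumulate the integral term
                d_term = g - state["prev_grad"]               # finite-difference derivative term
                state["prev_grad"].copy_(g)
                # Combine P, I, and D contributions into a single gradient step.
                p.add_(kp * g + ki * state["integral"] + kd * d_term, alpha=-lr)
        return loss


# Usage (illustrative): replace the optimizer in an otherwise unchanged training
# script, e.g.
#   opt = GenericPIDOptimizer(model.parameters(), lr=1e-2)
#   loss.backward(); opt.step(); opt.zero_grad()
```

In a head-to-head benchmark like the one above, such an optimizer object would simply replace `torch.optim.Adam` or `torch.optim.SGD(momentum=...)` in an otherwise identical training script, so that differences in train/test loss and accuracy can be attributed to the update rule and its gains rather than to the model or data pipeline.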