Table 2 Adam optimization algorithm.

From: Logistics demand prediction using fuzzy support vector regression machine based on Adam optimization

Input: sample set \(\left({x}_{i},{y}_{i}\right)\in {R}^{n}\times R\) \(\left(i=1,2,\ldots ,m\right)\); penalty parameter \(C=100\); maximum number of iterations \(T=50000\).

Initialization: let \(t=0\), \(b=0\), \({m}_{t}={v}_{t}=0\), and initialize \(w\) as the \(m\times m\) all-ones matrix.

Iteration:

For \(t=\mathrm{1,2},\,\cdot \cdot \cdot\,,T\) do

① Calculate the gradient of the loss function with respect to the weight vector \(w\) and the bias \(b\):

\({g}_{w,t}={\nabla }_{w}f\left({w}_{t-1},{b}_{t-1}\right),\,{g}_{b,t}={\nabla }_{b}f\left({w}_{t-1},{b}_{t-1}\right)\)

② Calculate the first-order moment of the gradient:

\({m}_{w,t}={\beta }_{1}\cdot {m}_{w,t-1}+\left(1-{\beta }_{1}\right)\cdot {g}_{w,t},{m}_{b,t}={\beta }_{1}\cdot {m}_{b,t-1}+\left(1-{\beta }_{1}\right)\cdot {g}_{b,t}\)

③ Calculate the second-order moment of the gradient:

\({v}_{w,t}={\beta }_{2}\cdot {v}_{w,t-1}+\left(1-{\beta }_{2}\right)\cdot {g}_{w,t}^{2},{v}_{b,t}={\beta }_{2}\cdot {v}_{b,t-1}+\left(1-{\beta }_{2}\right)\cdot {g}_{b,t}^{2}\)

④ Correct the first-order moment \({m}_{w,t},{m}_{b,t}\): \({\hat{m}}_{w,t}=\frac{{m}_{w,t}}{1-{\beta }_{1}^{t}},\,{\hat{m}}_{b,t}=\frac{{m}_{b,t}}{1-{\beta }_{1}^{t}}\)

⑤ Correct the second-order moment \({v}_{w,t},{v}_{b,t}\): \({\hat{v}}_{w,t}=\frac{{v}_{w,t}}{1-{\beta }_{2}^{t}},\,{\hat{v}}_{b,t}=\frac{{v}_{b,t}}{1-{\beta }_{2}^{t}}\)

⑥ Update the weight vector \(w\) and the bias \(b\) with the following formula:

\({w}_{t}={w}_{t-1}-\rho \cdot \frac{{\hat{m}}_{w,t}}{\sqrt{{\hat{v}}_{w,t}}+\xi },\,{b}_{t}={b}_{t-1}-\rho \cdot \frac{{\hat{m}}_{b,t}}{\sqrt{{\hat{v}}_{b,t}}+\xi }\)

⑦ Calculate \({f}_{t}=f\left({w}_{t},{b}_{t}\right)\) and \({f}_{t+1}=f\left({w}_{t+1},{b}_{t+1}\right)\) according to formula (2.12).

If \(\Vert {f}_{t+1}-{f}_{t}\Vert < \varepsilon\)

end

else \(t=t+1\) and go to step ①

end for

Return \({w}^{* }={w}_{t+1}\), \({b}^{* }={b}_{t+1}\)

Output: the optimal weight vector \({w}^{* }\), the optimal bias \({b}^{* }\), and the decision function \(f\left(x\right)={w}^{* }x+{b}^{* }\).
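
To make the steps of Table 2 concrete, the following is a minimal Python sketch of the Adam update loop for a linear model \(f(x)=wx+b\). Because formula (2.12), the fuzzy SVR objective, is not reproduced in this table, a simple regularized squared-error loss with penalty parameter \(C\) is assumed as a stand-in; the function name `adam_fit`, the all-ones weight initialization as a vector, and the hyperparameter defaults are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def adam_fit(X, y, C=100.0, rho=0.001, beta1=0.9, beta2=0.999,
             xi=1e-8, eps=1e-6, T=50000):
    """Adam loop following steps ①–⑦ of Table 2 (sketch with an assumed loss)."""
    n_samples, n_features = X.shape
    w = np.ones(n_features)                 # weight vector, all-ones initialization
    b = 0.0                                 # bias
    m_w = np.zeros(n_features); v_w = np.zeros(n_features)  # moments for w
    m_b = 0.0; v_b = 0.0                                     # moments for b

    def loss(w, b):
        # Assumed surrogate for the paper's formula (2.12):
        # ridge-style regularization plus mean squared error weighted by C.
        r = X @ w + b - y
        return 0.5 * (w @ w) + C * np.mean(r ** 2)

    f_prev = loss(w, b)
    for t in range(1, T + 1):
        # ① gradients of the loss w.r.t. w and b
        r = X @ w + b - y
        g_w = w + 2.0 * C * X.T @ r / n_samples
        g_b = 2.0 * C * np.mean(r)
        # ② first-order moments of the gradient
        m_w = beta1 * m_w + (1 - beta1) * g_w
        m_b = beta1 * m_b + (1 - beta1) * g_b
        # ③ second-order moments of the gradient
        v_w = beta2 * v_w + (1 - beta2) * g_w ** 2
        v_b = beta2 * v_b + (1 - beta2) * g_b ** 2
        # ④⑤ bias-corrected moments
        m_w_hat = m_w / (1 - beta1 ** t); m_b_hat = m_b / (1 - beta1 ** t)
        v_w_hat = v_w / (1 - beta2 ** t); v_b_hat = v_b / (1 - beta2 ** t)
        # ⑥ parameter updates with step size rho and small constant xi
        w = w - rho * m_w_hat / (np.sqrt(v_w_hat) + xi)
        b = b - rho * m_b_hat / (np.sqrt(v_b_hat) + xi)
        # ⑦ convergence check on successive objective values
        f_curr = loss(w, b)
        if abs(f_curr - f_prev) < eps:
            break
        f_prev = f_curr
    return w, b
```

Replacing the surrogate `loss` (and its gradients) with the fuzzy SVR objective of formula (2.12) recovers the procedure described in Table 2; the stopping rule is the same tolerance \(\varepsilon\) on successive objective values, with \(T\) as a hard cap on iterations.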