Table 3 Comparison of ablation experiments.

From: Multi-scale spatial-temporal transformer for traffic flow prediction

| Dataset | Models | Gate embedding | Two-stage spatial attention | Frequency dual-channel attention | MAE | RMSE | MAPE (%) |
|---|---|---|---|---|---|---|---|
| PeMs03 | w/o gate embedding | \(\times\) | \(\checkmark\) | \(\checkmark\) | 14.48 | 25.73 | 14.24 |
| PeMs03 | w/o two-stage spatial attention | \(\checkmark\) | \(\times\) | \(\checkmark\) | 14.61 | 26.08 | 14.13 |
| PeMs03 | w/o frequency dual-channel attention | \(\checkmark\) | \(\checkmark\) | \(\times\) | 14.56 | 25.82 | 14.16 |
| PeMs03 | MSSTFormer | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 14.35 | 25.01 | 14.05 |
| PeMs04 | w/o gate embedding | \(\times\) | \(\checkmark\) | \(\checkmark\) | 18.05 | 30.11 | 11.71 |
| PeMs04 | w/o two-stage spatial attention | \(\checkmark\) | \(\times\) | \(\checkmark\) | 18.18 | 30.46 | 11.59 |
| PeMs04 | w/o frequency dual-channel attention | \(\checkmark\) | \(\checkmark\) | \(\times\) | 18.11 | 30.21 | 11.64 |
| PeMs04 | MSSTFormer | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 17.93 | 29.40 | 11.52 |
| PeMs07 | w/o gate embedding | \(\times\) | \(\checkmark\) | \(\checkmark\) | 18.92 | 33.18 | 8.11 |
| PeMs07 | w/o two-stage spatial attention | \(\checkmark\) | \(\times\) | \(\checkmark\) | 19.05 | 33.55 | 7.98 |
| PeMs07 | w/o frequency dual-channel attention | \(\checkmark\) | \(\checkmark\) | \(\times\) | 18.96 | 33.27 | 8.02 |
| PeMs07 | MSSTFormer | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 18.79 | 32.48 | 7.89 |
| PeMs08 | w/o gate embedding | \(\times\) | \(\checkmark\) | \(\checkmark\) | 13.31 | 24.04 | 8.92 |
| PeMs08 | w/o two-stage spatial attention | \(\checkmark\) | \(\times\) | \(\checkmark\) | 13.42 | 24.33 | 8.84 |
| PeMs08 | w/o frequency dual-channel attention | \(\checkmark\) | \(\checkmark\) | \(\times\) | 13.38 | 24.15 | 8.89 |
| PeMs08 | MSSTFormer | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | 13.20 | 23.31 | 8.76 |
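
One way to read the ablation results is to express each variant's error as a percentage increase over the full MSSTFormer model on the same dataset. The sketch below does this with the values reported in Table 3. The names `TABLE3`, `relative_increase`, and `most_harmful_ablation` are illustrative and do not come from the paper, and the percentage-increase summary is an assumed way of comparing the rows, not an analysis the authors report.

```python
# Values copied from Table 3: (MAE, RMSE, MAPE %) for each model variant.
TABLE3 = {
    "PeMs03": {
        "w/o gate embedding": (14.48, 25.73, 14.24),
        "w/o two-stage spatial attention": (14.61, 26.08, 14.13),
        "w/o frequency dual-channel attention": (14.56, 25.82, 14.16),
        "MSSTFormer": (14.35, 25.01, 14.05),
    },
    "PeMs04": {
        "w/o gate embedding": (18.05, 30.11, 11.71),
        "w/o two-stage spatial attention": (18.18, 30.46, 11.59),
        "w/o frequency dual-channel attention": (18.11, 30.21, 11.64),
        "MSSTFormer": (17.93, 29.40, 11.52),
    },
    "PeMs07": {
        "w/o gate embedding": (18.92, 33.18, 8.11),
        "w/o two-stage spatial attention": (19.05, 33.55, 7.98),
        "w/o frequency dual-channel attention": (18.96, 33.27, 8.02),
        "MSSTFormer": (18.79, 32.48, 7.89),
    },
    "PeMs08": {
        "w/o gate embedding": (13.31, 24.04, 8.92),
        "w/o two-stage spatial attention": (13.42, 24.33, 8.84),
        "w/o frequency dual-channel attention": (13.38, 24.15, 8.89),
        "MSSTFormer": (13.20, 23.31, 8.76),
    },
}

METRICS = ("MAE", "RMSE", "MAPE")
FULL_MODEL = "MSSTFormer"


def relative_increase(dataset, variant):
    """Percent increase of each metric for `variant` over the full model."""
    full = TABLE3[dataset][FULL_MODEL]
    ablated = TABLE3[dataset][variant]
    return {m: 100.0 * (a - f) / f for m, a, f in zip(METRICS, ablated, full)}


def most_harmful_ablation(dataset, metric="MAE"):
    """Name of the ablated variant with the largest increase in `metric`."""
    variants = [v for v in TABLE3[dataset] if v != FULL_MODEL]
    return max(variants, key=lambda v: relative_increase(dataset, v)[metric])


if __name__ == "__main__":
    for dataset in TABLE3:
        for variant in TABLE3[dataset]:
            if variant == FULL_MODEL:
                continue
            deltas = relative_increase(dataset, variant)
            summary = ", ".join(f"{m} +{d:.2f}%" for m, d in deltas.items())
            print(f"{dataset:7s} {variant:38s} {summary}")
```

Because all three metrics are errors, a positive value means that removing the component made the predictions worse on that dataset.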