Fig. 7
From: Time series transformer for tourism demand forecasting

In the encoder–decoder attention computation for \(Dec_{S5}\), only \(Enc_{O1}\) through \(Enc_{O4}\) are attended to; the decoder's memory mask prevents \(Enc_{O5}\) through \(Enc_{O7}\) from being attended to.
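The masking behavior described here can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes seven encoder outputs (\(Enc_{O1}\) to \(Enc_{O7}\)) and applies a memory mask so that the attention weights for the last three positions become exactly zero.

```python
import numpy as np

# Illustrative memory mask for the decoder step Dec_S5 (hypothetical values):
# Enc_O1..Enc_O4 are visible, Enc_O5..Enc_O7 are masked out before softmax.
enc_len = 7   # encoder outputs Enc_O1 .. Enc_O7
visible = 4   # Dec_S5 may attend only to Enc_O1 .. Enc_O4

rng = np.random.default_rng(0)
scores = rng.normal(size=enc_len)          # raw attention scores for Dec_S5
mask = np.arange(enc_len) >= visible       # True where attention is blocked
scores = np.where(mask, -np.inf, scores)   # masked positions get -inf

weights = np.exp(scores - scores.max())    # exp(-inf) = 0 for masked entries
weights /= weights.sum()                   # softmax over unmasked positions
# weights[4:] are exactly zero: Enc_O5..Enc_O7 receive no attention
```

Setting masked scores to negative infinity before the softmax is the standard way to implement attention masking, since those positions then contribute nothing to the normalized weights.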