Table 5 Detailed dataset specifications for experimental reproducibility.

From: Attention-based workload prediction and dynamic resource allocation for heterogeneous computing environments

| Specification | Google cluster 2019 | Alibaba cluster 2020 | Academic research cluster |
| --- | --- | --- | --- |
| Total duration | 30 days | 28 days | 42 days |
| Total time steps | 8,640 | 40,320 | 362,880 |
| Number of jobs | 147,523 | 68,941 | 3,847 |
| Sampling interval | 5 min | 1 min | 10 s |
| Input window (T) | 128 steps | 128 steps | 128 steps |
| Prediction horizon (H) | 32 steps | 32 steps | 32 steps |
| Training set | Days 1–21 (70%) | Days 1–19 (70%) | Days 1–29 (70%) |
| Validation set | Days 22–25 (15%) | Days 20–23 (15%) | Days 30–35 (15%) |
| Test set | Days 26–30 (15%) | Days 24–28 (15%) | Days 36–42 (15%) |
| Training samples | 4,032 | 27,216 | 243,936 |
| Validation samples | 864 | 5,832 | 52,272 |
| Test samples | 864 | 5,832 | 52,272 |
| Feature dimensions | 12 | 18 | 24 |
| Workload types (N) | 8 | 12 | 6 |
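
The total time steps in each column follow directly from the total duration and the sampling interval. The short Python sketch below reproduces that arithmetic and shows how one sample would be indexed with T = 128 and H = 32. The names `DATASETS`, `total_time_steps`, and `window_bounds` are illustrative assumptions, not identifiers from the paper. The table does not state the stride between consecutive windows, so the sketch does not attempt to reproduce the training, validation, or test sample counts.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# (total duration in days, sampling interval in seconds), copied from Table 5.
DATASETS = {
    "Google cluster 2019": (30, 5 * 60),
    "Alibaba cluster 2020": (28, 60),
    "Academic research cluster": (42, 10),
}


def total_time_steps(days: int, interval_s: int) -> int:
    """Number of samples in a trace of `days` days sampled every `interval_s` seconds."""
    return days * SECONDS_PER_DAY // interval_s


def window_bounds(start: int, T: int = 128, H: int = 32):
    """Half-open index ranges for one sample (hypothetical helper).

    The input covers [start, start + T) and the prediction target covers
    [start + T, start + T + H).
    """
    return (start, start + T), (start + T, start + T + H)


STEPS = {name: total_time_steps(d, s) for name, (d, s) in DATASETS.items()}

if __name__ == "__main__":
    for name, n in STEPS.items():
        print(f"{name}: {n:,} time steps")
```

Because T and H are given in steps, the wall-clock span of one sample depends on the sampling interval: 160 steps cover a much longer period on the 5 min trace than on the 10 s trace.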