Table 4 Statistical characteristics of workload datasets used in experiments.
| Dataset | Duration (days) | Number of jobs | Avg. GPU utilization (%) | Workload types | Sampling interval |
|---|---|---|---|---|---|
| Google cluster 2019 | 30 | 147,523 | N/A (CPU only) | Batch, service, ML training | 5 min |
| Alibaba cluster 2020 | 28 | 68,941 | 62.3 ± 28.7 | GPU training, inference | 1 min |
| Academic research cluster | 42 | 3,847 | 48.5 ± 35.2 | LLM, NAS, CV research | 10 s |
| Synthetic workload | 14 | 25,000 | 70.1 ± 22.4 | Mixed AI workloads | 30 s |
| Combined dataset | 114 | 245,311 | 58.7 ± 30.1 | All categories | Variable |
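
The duration and job count in the combined row are the sums of the four per-dataset values. A minimal sketch of that check follows; the dictionary name `datasets` is illustrative and not from the original. The combined GPU utilization is left out because the table does not state how it was aggregated.

```python
# Per-dataset values copied from Table 4: (duration in days, number of jobs).
# The name `datasets` is a hypothetical helper introduced for this check.
datasets = {
    "Google cluster 2019": (30, 147_523),
    "Alibaba cluster 2020": (28, 68_941),
    "Academic research cluster": (42, 3_847),
    "Synthetic workload": (14, 25_000),
}

# Sum each column across the four source datasets.
total_days = sum(days for days, _ in datasets.values())
total_jobs = sum(jobs for _, jobs in datasets.values())

print(total_days)  # 114
print(total_jobs)  # 245311
```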