Table 1 Overall assessment of various scientific machine learning benchmarking approaches

Benchmark	Focus: Scientific	Focus: Application	Focus: System	Process: Metrics	Process: Framework	Process: Reporting	Challenges: Data	Challenges: Distribution	Challenges: Coverage	Challenges: Extensibility
Deep500	None	None	Partial	Full	Full	Partial	None	None	None	Partial
RLBench	None	Partial	Partial	Full	None	Partial	Partial	Partial	Partial	Partial
CORAL-2 (DLS/BDAS)	Partial	Full	Full	Full	Partial	Partial	None	None	Full	None
AIBench + HPC AI500	Full	Full	Full	Full	None	Full	Partial	Partial	Partial	Partial
DAWNBench	None	Full	Full	Full	None	Partial	None	None	None	None
MLCommons Science	Full	Full	Partial	Full	None	Partial	Partial	Partial	Full	Partial
SciMLBench	Full	Full	Full	Full	Full	Partial	Full	Full	Full	Full
Community competitions	Partial	None	None	Partial	None	Partial	Partial	None	Partial	None

In qualitatively assessing how far each approach addresses each concern, we indicate whether it offers no support (None), partial or questionable support (Partial) or full support (Full).
