Table 3: Comparison of linear regression fits for different average Log Norm and Weighted Alpha metrics across 5 CV datasets and 17 architectures, covering 108 (out of over 400) different pretrained DNNs.

From: Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data

 

| Statistic | \(\log \Vert \cdot \Vert_{F}^{2}\) | \(\log \Vert \cdot \Vert_{\infty}^{2}\) | \(\hat{\alpha}\) | \(\log \Vert \cdot \Vert_{\alpha}^{\alpha}\) |
|---|---|---|---|---|
| RMSE (mean) | 4.84 | 5.57 | 4.58 | 4.55 |
| RMSE (std) | 9.14 | 9.16 | 9.16 | 9.17 |
| \(R^2\) (mean) | 3.9 | 3.85 | 3.89 | 3.89 |
| \(R^2\) (std) | 9.34 | 9.36 | 9.34 | 9.34 |
| Kendall-tau (mean) | 3.84 | 3.77 | 3.86 | 3.85 |
| Kendall-tau (std) | 9.37 | 9.4 | 9.36 | 9.36 |

  1. We include regressions only for architectures with five or more data points and for which the metrics are positively correlated with test error. These results can be readily reproduced using the Google Colab notebooks (see Supplementary Note 2 for details).
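As a rough illustration of how one cell of this table is produced, the following is a minimal sketch, not the authors' Colab code: it regresses reported test error on a layer-averaged metric for a single architecture series and computes the RMSE, \(R^2\), and Kendall-tau of that fit. The function name `regression_quality` and all numeric values are illustrative assumptions; the table entries above summarize such per-architecture statistics as means and standard deviations across the qualifying architecture series.

```python
# Minimal sketch (not the authors' code): regress reported top-1 test error on a
# layer-averaged metric (e.g. weighted alpha or an average log-norm) for one
# architecture series, then report the RMSE, R^2, and Kendall-tau of the fit.
# All numeric values below are illustrative placeholders, not data from the paper.
import numpy as np
from scipy import stats


def regression_quality(metric, test_error):
    """Fit test_error ~ slope * metric + intercept; return (rmse, r2, kendall_tau)."""
    metric = np.asarray(metric, dtype=float)
    test_error = np.asarray(test_error, dtype=float)

    # Ordinary least-squares fit with a single predictor.
    slope, intercept, r_value, _, _ = stats.linregress(metric, test_error)
    predicted = slope * metric + intercept

    rmse = float(np.sqrt(np.mean((test_error - predicted) ** 2)))
    r2 = float(r_value ** 2)
    tau, _ = stats.kendalltau(metric, test_error)
    return rmse, r2, float(tau)


if __name__ == "__main__":
    # Hypothetical values for one architecture series (five or more models,
    # as required by the footnote above).
    weighted_alpha = [2.9, 3.1, 3.4, 3.7, 4.0, 4.3]    # placeholder metric values
    top1_error = [20.1, 21.0, 22.4, 23.9, 25.2, 26.8]  # placeholder test errors (%)

    rmse, r2, tau = regression_quality(weighted_alpha, top1_error)
    print(f"RMSE={rmse:.2f}  R2={r2:.2f}  Kendall-tau={tau:.2f}")
```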