Table 2 Average Log Norm and Weighted Alpha metrics for pretrained OpenAI GPT and GPT2 models.
Series | # | \(\langle \log \Vert \mathbf{W} \Vert_{F} \rangle\) | \(\langle \log \Vert \mathbf{W} \Vert_{\infty} \rangle\) | \(\hat{\alpha}\) | \(\langle \log \Vert \mathbf{X} \Vert_{\alpha}^{\alpha} \rangle\) |
---|---|---|---|---|---|
GPT | 49 | 1.64 | 1.72 | 7.01 | 7.28 |
GPT2-small | 49 | 2.04 | 2.54 | 9.62 | 9.87 |
GPT2-medium | 98 | 2.08 | 2.58 | 9.74 | 10.01 |
GPT2-large | 146 | 1.85 | 1.99 | 7.67 | 7.94 |
GPT2-xl | 194 | 1.86 | 1.92 | 7.17 | 7.51 |
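
As a rough illustration of how the two norm-based columns could be computed, the sketch below averages the log Frobenius norm and the log spectral norm over a list of weight matrices. It rests on several assumptions that the table itself does not state: that \(\Vert \mathbf{W} \Vert_{\infty}\) denotes the spectral norm (largest singular value), that the logarithm is base 10, and that the average runs over layers with equal weight. The function name `average_log_norms` is hypothetical, and the sketch is not expected to reproduce the values in the table.

```python
import numpy as np


def average_log_norms(weights):
    """Average log10 Frobenius and log10 spectral norms over weight matrices.

    Assumptions (not confirmed by the table): base-10 logarithm, spectral
    norm for the "infinity" column, and an unweighted mean over layers.
    """
    # Frobenius norm: square root of the sum of squared entries.
    log_fro = [np.log10(np.linalg.norm(W, ord="fro")) for W in weights]
    # Spectral norm: largest singular value of the 2-D matrix.
    log_spec = [np.log10(np.linalg.norm(W, ord=2)) for W in weights]
    return float(np.mean(log_fro)), float(np.mean(log_spec))


# Toy example with two rank-one "layers" (not real GPT weights).
layers = [np.diag([10.0, 0.0]), np.diag([100.0, 0.0])]
avg_log_fro, avg_log_spec = average_log_norms(layers)
print(avg_log_fro, avg_log_spec)
```

For the rank-one toy matrices the two norms coincide; for real layer weights the Frobenius norm is at least as large as the spectral norm, which makes the ordering of the two columns a quick sanity check on any concrete definition.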