Fig. 2: Numerical results on ResNet as a function of training step. Each step is one stochastic gradient descent update, with the derivatives of the loss computed from 2048 randomly selected training samples.
From: Towards provably efficient quantum algorithms for large-scale machine-learning models

a Hessian spectra of ResNet during training. b Estimated error proxy during training. c Evolution of the training accuracy for ResNet.