Extended Data Table 2 Hyperparameter selection in Continual ImageNet

From: Loss of plasticity in deep continual learning

Values used in the grid searches to find the best hyperparameters for each algorithm tested on Continual ImageNet. The best-performing set of values for each algorithm is shown in bold. For L2 regularization and Shrink and Perturb, the values in the third column are the weight decay; for continual backpropagation, they are the replacement rate.
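
As an illustration of the kind of grid search the caption describes, the following is a minimal sketch, not the authors' code. The grid values, the key names `step_size` and `third_column`, and the `toy_evaluate` scoring function are all hypothetical placeholders; in practice the score would come from training each algorithm on Continual ImageNet, and the actual searched values are those listed in the table.

```python
import math
from itertools import product


def grid_search(grid, evaluate):
    """Evaluate every combination in `grid` and return the best one.

    `grid` maps a hyperparameter name to a list of candidate values;
    `evaluate` maps one configuration (a dict) to a score, higher is better.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        results.append((evaluate(config), config))
    best_score, best_config = max(results, key=lambda r: r[0])
    return best_config, results


# Hypothetical grid: these are placeholder values, NOT the values in the table.
# "third_column" stands for weight decay (L2 regularization, Shrink and Perturb)
# or replacement rate (continual backpropagation), depending on the algorithm.
grid = {
    "step_size": [0.1, 0.01, 0.001],
    "third_column": [1e-3, 1e-4],
}


def toy_evaluate(config):
    """Stand-in for a training run: an arbitrary score that peaks at
    step_size = 0.01 and third_column = 1e-4."""
    return -(abs(math.log10(config["step_size"]) + 2)
             + abs(math.log10(config["third_column"]) + 4))


best_config, results = grid_search(grid, toy_evaluate)
print(best_config)
```

The exhaustive product over all value lists is what makes this a grid search; the configuration returned as `best_config` plays the role of the bolded entry for one algorithm.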