Fig. 2: Visualization of the model training process.

The black curve shows the training loss, the red curve shows the validation loss, and the blue curve shows the knowledge distillation loss over the course of training.