Fig. 9: Schematic diagram of AT.

The teacher model is shown at the top of the figure and the student model at the bottom. The method extracts attention maps from the teacher network and uses them as targets to guide the training of the student network.
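The idea of matching student attention to teacher attention can be sketched in a few lines. The example below is a minimal illustration, not the method's reference implementation: it assumes one common activation-based formulation in which the attention map is the sum of squared activations over channels, L2-normalized, and the loss is the L2 distance between the two maps. The function names `attention_map` and `attention_transfer_loss` and the nested-list tensor layout (channels x height x width) are hypothetical choices made for this sketch.

```python
import math


def attention_map(features):
    """Collapse a C x H x W activation tensor (nested lists) into a flat,
    L2-normalized spatial attention vector of length H * W.

    Assumption: attention is the sum of squared activations over channels.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    flat = [
        sum(features[c][i][j] ** 2 for c in range(channels))
        for i in range(height)
        for j in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat))
    # Guard against an all-zero map to avoid division by zero.
    return [v / norm for v in flat] if norm > 0 else flat


def attention_transfer_loss(student_features, teacher_features):
    """L2 distance between the normalized student and teacher attention maps."""
    s = attention_map(student_features)
    t = attention_map(teacher_features)
    if len(s) != len(t):
        raise ValueError("student and teacher maps must share spatial size")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


# Teacher with 2 channels, student with 1 channel, both 2 x 2 spatially.
teacher = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]]
student_aligned = [[[3.0, 0.0], [0.0, 0.0]]]      # attends to the same location
student_misaligned = [[[0.0, 0.0], [0.0, 2.0]]]   # attends to a different location

loss_aligned = attention_transfer_loss(student_aligned, teacher)
loss_misaligned = attention_transfer_loss(student_misaligned, teacher)
```

Because the channel dimension is summed out, the teacher and student may have different numbers of channels; only the spatial sizes need to agree. In training, a term of this kind would typically be added to the student's task loss with a weighting factor, applied at one or more pairs of corresponding layers.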