Table 7 Effect of CBKD combined with other knowledge distillation methods.
| Method | CIFAR-10: VGG-16 (88.5%) | CIFAR-10: ResNet-18 (85.0%) | Tiny-ImageNet-200: VGG-16 (59.4%) | Tiny-ImageNet-200: ResNet-18 (57.3%) |
|---|---|---|---|---|
| Student | 87.3% | 83.8% | 57.6% | 54.6% |
| CBKD | 88.4% | 85.6% | 59.4% | 55.7% |
| KD | 87.2% | 84.2% | 57.6% | 54.5% |
| KD + CBKD | 88.6% | 86.0% | 59.6% | 55.8% |
| FitNets | 88.6% | 86.6% | 58.6% | 54.9% |
| FitNets + CBKD | 88.8% | 86.3% | 59.3% | 55.6% |
| RWD | 87.3% | 84.7% | 58.6% | 54.9% |
| RWD + CBKD | 88.5% | 85.4% | 59.3% | 56.0% |
| DKD | 87.9% | 85.1% | 58.2% | 55.1% |
| DKD + CBKD | 88.9% | 86.6% | 59.4% | 55.6% |
| L-S-KD + KD | 88.1% | 85.5% | 59.4% | 55.5% |
| L-S-KD + KD + CBKD | 88.7% | 86.0% | 59.8% | 55.9% |