Table 7 Effect of CBKD combined with other knowledge distillation methods.

From: Counterclockwise block-by-block knowledge distillation for neural network compression

 

| Method | CIFAR-10, VGG-16 (88.5%) | CIFAR-10, ResNet-18 (85.0%) | Tiny-ImageNet-200, VGG-16 (59.4%) | Tiny-ImageNet-200, ResNet-18 (57.3%) |
|---|---|---|---|---|
| Student | 87.3% | 83.8% | 57.6% | 54.6% |
| CBKD | 88.4% | 85.6% | 59.4% | 55.7% |
| KD | 87.2% | 84.2% | 57.6% | 54.5% |
| KD + CBKD | 88.6% | 86.0% | 59.6% | 55.8% |
| FitNets | 88.6% | 86.6% | 58.6% | 54.9% |
| FitNets + CBKD | 88.8% | 86.3% | 59.3% | 55.6% |
| RWD | 87.3% | 84.7% | 58.6% | 54.9% |
| RWD + CBKD | 88.5% | 85.4% | 59.3% | 56.0% |
| DKD | 87.9% | 85.1% | 58.2% | 55.1% |
| DKD + CBKD | 88.9% | 86.6% | 59.4% | 55.6% |
| L-S-KD + KD | 88.1% | 85.5% | 59.4% | 55.5% |
| L-S-KD + KD + CBKD | 88.7% | 86.0% | 59.8% | 55.9% |
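The "+ CBKD" rows pair CBKD with a second distillation objective (e.g., KD, FitNets, RWD, DKD). As a rough illustration of how such objectives are typically combined, the sketch below implements the standard Hinton-style KD loss in PyTorch and adds a second distillation term with a simple additive weighting. This is a minimal sketch, not the paper's formulation: `T`, `alpha`, `beta`, and the `cbkd_term` placeholder are illustrative assumptions, and the actual weighting or scheduling used for CBKD may differ.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard Hinton KD loss: soft-target KL term plus hard-label cross-entropy."""
    # KL divergence between temperature-softened student and teacher distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def combined_loss(student_logits, teacher_logits, labels, cbkd_term, beta=1.0):
    """Hypothetical additive combination of KD with a second distillation term.

    `cbkd_term` stands in for the loss produced by the CBKD procedure
    (a scalar tensor); `beta` is an assumed balancing weight.
    """
    return kd_loss(student_logits, teacher_logits, labels) + beta * cbkd_term
```

Under this additive reading, the table's consistent gains for the "+ CBKD" rows suggest the two objectives supply complementary training signals rather than redundant ones.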