Table 1 Detailed architecture of the proposed student model.

From: Knowledge distillation-based lightweight MobileNet model for diabetic retinopathy classification

| Layer Type | Output Shape | Parameters |
|---|---|---|
| Input Layer | (512, 512, 3) | 0 |
| Conv2D (\(3 \times 3\), 32 filters, stride 2) | (256, 256, 32) | 896 |
| Batch Normalization | (256, 256, 32) | 128 |
| ReLU Activation | (256, 256, 32) | 0 |
| Block 1 (64 filters, stride 1) | | |
| Depthwise Conv2D (\(3 \times 3\)) | (256, 256, 32) | 320 |
| Batch Normalization | (256, 256, 32) | 128 |
| ReLU Activation | (256, 256, 32) | 0 |
| Pointwise Conv2D (\(1 \times 1\), 64 filters) | (256, 256, 64) | 2,112 |
| Batch Normalization | (256, 256, 64) | 256 |
| ReLU Activation | (256, 256, 64) | 0 |
| Block 2 (64 filters, stride 2) | | |
| Depthwise Conv2D (\(3 \times 3\)) | (128, 128, 64) | 640 |
| Batch Normalization | (128, 128, 64) | 256 |
| ReLU Activation | (128, 128, 64) | 0 |
| Pointwise Conv2D (\(1 \times 1\), 64 filters) | (128, 128, 64) | 4,160 |
| Batch Normalization | (128, 128, 64) | 256 |
| ReLU Activation | (128, 128, 64) | 0 |
| Block 3 (128 filters, stride 1) | | |
| Depthwise Conv2D (\(3 \times 3\)) | (128, 128, 64) | 640 |
| Batch Normalization | (128, 128, 64) | 256 |
| ReLU Activation | (128, 128, 64) | 0 |
| Pointwise Conv2D (\(1 \times 1\), 128 filters) | (128, 128, 128) | 8,320 |
| Batch Normalization | (128, 128, 128) | 512 |
| ReLU Activation | (128, 128, 128) | 0 |
| Block 4 (128 filters, stride 2) | | |
| Depthwise Conv2D (\(3 \times 3\)) | (64, 64, 128) | 1,280 |
| Batch Normalization | (64, 64, 128) | 512 |
| ReLU Activation | (64, 64, 128) | 0 |
| Pointwise Conv2D (\(1 \times 1\), 128 filters) | (64, 64, 128) | 16,512 |
| Batch Normalization | (64, 64, 128) | 512 |
| ReLU Activation | (64, 64, 128) | 0 |
| Block 5 (128 filters, stride 1) | | |
| Depthwise Conv2D (\(3 \times 3\)) | (64, 64, 128) | 1,280 |
| Batch Normalization | (64, 64, 128) | 512 |
| ReLU Activation | (64, 64, 128) | 0 |
| Pointwise Conv2D (\(1 \times 1\), 128 filters) | (64, 64, 128) | 16,512 |
| Batch Normalization | (64, 64, 128) | 512 |
| ReLU Activation | (64, 64, 128) | 0 |
| Fully Connected Layers | | |
| Global Average Pooling 2D | (128,) | 0 |
| Dense (128 units, ReLU) | (128,) | 16,512 |
| Dense (2 units) | (2,) | 258 |
| Dense (3 units) | (3,) | 387 |

Parameter Summary

| Output Layer | Trainable | Non-trainable | Total |
|---|---|---|---|
| Dense (2 units) | 71,362 | 1,920 | 73,282 |
| Dense (3 units) | 71,491 | 1,920 | 73,411 |
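
The parameter counts in Table 1 can be reproduced with simple arithmetic, which is a useful sanity check when re-implementing the student model. The sketch below is not the authors' code. It assumes that every convolution carries a bias term (the listed counts are consistent with this, for example \(3 \cdot 3 \cdot 3 \cdot 32 + 32 = 896\)), that each batch normalization layer has two trainable and two non-trainable values per channel (the usual Keras convention), that "same" padding is used, and that the two final Dense rows are alternative output layers, as the separate totals suggest. The function names and the `BLOCKS` constant are hypothetical helpers introduced for illustration.

```python
import math

# (pointwise filters, depthwise stride) for Blocks 1-5, read off Table 1.
BLOCKS = [(64, 1), (64, 2), (128, 1), (128, 2), (128, 1)]


def conv2d_params(kernel, in_ch, out_ch):
    """Standard convolution: kernel weights plus one bias per filter (assumed)."""
    return kernel * kernel * in_ch * out_ch + out_ch


def depthwise_params(kernel, channels):
    """Depthwise convolution: one kernel x kernel filter and one bias per channel."""
    return kernel * kernel * channels + channels


def dense_params(in_units, out_units):
    """Fully connected layer: weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


def student_param_counts(num_classes):
    """Return (trainable, non_trainable, total) for a given output layer size."""
    trainable = 0
    bn_channels = 0  # total channels passing through batch normalization

    # Stem: 3x3 convolution on the RGB input, followed by batch normalization.
    trainable += conv2d_params(3, 3, 32)
    bn_channels += 32

    in_ch = 32
    for out_ch, _stride in BLOCKS:
        trainable += depthwise_params(3, in_ch)
        bn_channels += in_ch   # batch norm after the depthwise convolution
        trainable += conv2d_params(1, in_ch, out_ch)
        bn_channels += out_ch  # batch norm after the pointwise convolution
        in_ch = out_ch

    trainable += dense_params(in_ch, 128)        # Dense (128 units, ReLU)
    trainable += dense_params(128, num_classes)  # output layer

    # Assumed split: gamma and beta are trainable, the moving mean and
    # moving variance are not.
    trainable += 2 * bn_channels
    non_trainable = 2 * bn_channels
    return trainable, non_trainable, trainable + non_trainable


def feature_map_shapes(size=512):
    """Trace (height, width, channels) after the stem and after each block,
    assuming 'same' padding so that a stride-s layer maps n to ceil(n / s)."""
    size = math.ceil(size / 2)  # the stem convolution has stride 2
    shapes = [(size, size, 32)]
    for out_ch, stride in BLOCKS:
        size = math.ceil(size / stride)
        shapes.append((size, size, out_ch))
    return shapes


print(student_param_counts(2))   # (71362, 1920, 73282)
print(student_param_counts(3))   # (71491, 1920, 73411)
print(feature_map_shapes()[-1])  # (64, 64, 128)
```

Under these assumptions, both rows of the parameter summary and the final feature map shape match the table. The stride of each block is applied in its depthwise convolution, which is why the depthwise rows of Blocks 2 and 4 already show the halved spatial size.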