Fig. 3: System model for transmission based on the hyperprior model.

The encoder ga downsamples the input image (W × H) by a factor of 16 to produce a latent representation Y of size \(\frac{W}{16}\times \frac{H}{16}\times N\). The system retains k% of feature maps while masking the remainder in Y and Z at the transmitter, and the reconstruction by padding zeroes in the masked 100 − k% feature maps at the receiver. The selection of k is based on channel conditions and service requirements.