Table 4 Architecture details of the generator and discriminator in the PACGAN framework for generating 256 × 256 images. The generator takes as input the random vector z and the embedded label of the image to be generated, \(cls_{embedded}\), and produces an image of size \(n_{ch}\) × 256 × 256, where \(n_{ch}\) is the variable number of channels. The discriminator receives images of size \(n_{ch}\) × 256 × 256 and produces two outputs: the realness of the input image, denoted as \(D_{w}(\cdot)\), and the corresponding class estimate, denoted as \(\widehat{y}\). Both networks consist of convolutional (Conv) and fully connected (FC) layers, with leaky rectified linear unit (LReLU), linear, or softmax activation functions (Act.).
From: High-resolution conditional MR image synthesis through the PACGAN framework
Generator | Act. | Output shape | Params | Discriminator | Act. | Output shape | Params |
---|---|---|---|---|---|---|---|
z, \(cls_{embedded}\) | - | 522 × 1 × 1 | - | Input image | - | \(n_{ch}\) × 256 × 256 | - |
Conv 4 × 4 | LReLU | 512 × 4 × 4 | 4.28 M | Conv 1 × 1 | LReLU | 16 × 256 × 256 | 1k |
Conv 3 × 3 | LReLU | 512 × 4 × 4 | 2.36 M | Conv 3 × 3 | LReLU | 16 × 256 × 256 | 4.6k |
Upsample | - | 512 × 8 × 8 | - | Conv 3 × 3 | LReLU | 32 × 256 × 256 | 9.2k |
Conv 3 × 3 | LReLU | 512 × 8 × 8 | 2.36 M | Downsample | - | 32 × 128 × 128 | - |
Conv 3 × 3 | LReLU | 512 × 8 × 8 | 2.36 M | Conv 3 × 3 | LReLU | 32 × 128 × 128 | 18.4k |
Upsample | - | 512 × 16 × 16 | - | Conv 3 × 3 | LReLU | 64 × 128 × 128 | 37k |
Conv 3 × 3 | LReLU | 256 × 16 × 16 | 1.18 M | Downsample | - | 64 × 64 × 64 | - |
Conv 3 × 3 | LReLU | 256 × 16 × 16 | 590k | Conv 3 × 3 | LReLU | 64 × 64 × 64 | 74k |
Upsample | - | 256 × 32 × 32 | - | Conv 3 × 3 | LReLU | 128 × 64 × 64 | 147k |
Conv 3 × 3 | LReLU | 128 × 32 × 32 | 295k | Downsample | - | 128 × 32 × 32 | - |
Conv 3 × 3 | LReLU | 128 × 32 × 32 | 147k | Conv 3 × 3 | LReLU | 128 × 32 × 32 | 295k |
Upsample | - | 128 × 64 × 64 | - | Conv 3 × 3 | LReLU | 256 × 32 × 32 | 590k |
Conv 3 × 3 | LReLU | 64 × 64 × 64 | 74k | Downsample | - | 256 × 16 × 16 | - |
Conv 3 × 3 | LReLU | 64 × 64 × 64 | 37k | Conv 3 × 3 | LReLU | 256 × 16 × 16 | 1.18 M |
Upsample | - | 64 × 128 × 128 | - | Conv 3 × 3 | LReLU | 512 × 16 × 16 | 2.36 M |
Conv 3 × 3 | LReLU | 32 × 128 × 128 | 18.4k | Downsample | - | 512 × 8 × 8 | - |
Conv 3 × 3 | LReLU | 32 × 128 × 128 | 9.2k | Conv 3 × 3 | LReLU | 512 × 8 × 8 | 2.36 M |
Upsample | - | 32 × 256 × 256 | - | Conv 3 × 3 | LReLU | 512 × 8 × 8 | 2.36 M |
Conv 3 × 3 | LReLU | 16 × 256 × 256 | 4.6k | Downsample | - | 512 × 4 × 4 | - |
Conv 3 × 3 | LReLU | 16 × 256 × 256 | 2.3k | Minibatch stddev | - | 513 × 4 × 4 | - |
Conv 1 × 1 | linear | \(n_{ch}\) × 256 × 256 | 1.5k | Conv 3 × 3 | LReLU | 512 × 4 × 4 | 2.36 M |
Tot params | | | 13.7 M | Conv 4 × 4 | LReLU | 512 × 1 × 1 | 4.2 M |
| | | | | FC \((D_{w}(\cdot))\) | linear | 1 × 1 × 1 | 513 |
| | | | | FC | linear | 150 × 1 × 1 | 76.9k |
| | | | | FC \((\widehat{y})\) | Softmax | \(n_{classes}\) × 1 × 1 | 151 |
| | | | | Tot params | | | 16.1 M |
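
The generator's per-layer parameter counts can be cross-checked with a few lines of Python. This is a minimal sketch, assuming every convolution is a standard dense convolution with one bias per output channel (the table does not state this). The names `conv_params` and `GENERATOR_CONVS` are illustrative and do not come from the PACGAN code. The final 1 × 1 output convolution is left out because its count depends on \(n_{ch}\).

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a k x k convolution: weights plus one bias per output channel.

    Assumption: a plain dense convolution with bias (not stated in the table).
    """
    return c_in * c_out * k * k + c_out


# (in_channels, out_channels, kernel_size), read off the generator column in order.
# Upsample rows carry no parameters and are therefore skipped.
GENERATOR_CONVS = [
    (522, 512, 4), (512, 512, 3),   # 4 x 4 resolution
    (512, 512, 3), (512, 512, 3),   # 8 x 8
    (512, 256, 3), (256, 256, 3),   # 16 x 16
    (256, 128, 3), (128, 128, 3),   # 32 x 32
    (128, 64, 3), (64, 64, 3),      # 64 x 64
    (64, 32, 3), (32, 32, 3),       # 128 x 128
    (32, 16, 3), (16, 16, 3),       # 256 x 256
]

per_layer = [conv_params(c_in, c_out, k) for c_in, c_out, k in GENERATOR_CONVS]
total = sum(per_layer)

for (c_in, c_out, k), n in zip(GENERATOR_CONVS, per_layer):
    print(f"Conv {k}x{k} {c_in:>3} -> {c_out:<3}: {n:>9,d}")
print(f"Sum over listed convolutions: {total:,d} (~{total / 1e6:.1f} M)")
```

Under these assumptions the first layer comes to 4,276,736 parameters (4.28 M in the table), and the fourteen listed convolutions sum to roughly 13.7 M, in line with the generator total reported above.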