Extended Data Fig. 1: Network architecture.
From: Deep learning-enhanced light-field imaging with continuous validation

This architecture was used for the beads and neural activity volumes. For the medaka heart, a slightly different layer depth was used with the same overall structure (see Supplementary Table 1). Res2/3d: residual blocks with 2d or 3d convolutions of kernel size (3×)3 × 3. Residual blocks contain an additional projection layer (1 × 1 or 1 × 1 × 1 convolution) if the number of input channels differs from the number of output channels. Up2/3d: transposed convolution layers with kernel size (3×)2 × 2 and stride (1×)2 × 2. Proj2d/3d: projection layers (1 × 1 or 1 × 1 × 1 convolutions). The numbers always correspond to the number of channels. With 19 × 19 pixel lenslets (nnum = 19), the rearranged light-field input image has 19² = 361 channels. The affine transformation layer at the end is part of the network only when training on dynamic, single-plane targets. Otherwise, in inference mode it can be applied in post-processing to yield a SPIM-aligned prediction, or, for static samples, the inverse affine transformation is applied to the SPIM target instead, to avoid unnecessary computation.
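The rearrangement step mentioned above can be sketched as follows: the pixels behind each 19 × 19 lenslet are moved into separate input channels, so a raw light-field image of shape (H·nnum, W·nnum) becomes a channel-first stack of shape (nnum², H, W). This is a minimal illustrative sketch, not the paper's implementation; the function name `rearrange_lightfield` and the use of plain numpy are assumptions.

```python
import numpy as np

def rearrange_lightfield(lf_image, nnum=19):
    """Illustrative sketch (not the paper's code): reshape a raw
    light-field image of shape (H*nnum, W*nnum) into a stack of
    shape (nnum**2, H, W), one channel per sub-lenslet pixel."""
    H = lf_image.shape[0] // nnum
    W = lf_image.shape[1] // nnum
    # Split each spatial axis into (lenslet index, sub-pixel index)...
    x = lf_image.reshape(H, nnum, W, nnum)
    # ...then move the two sub-pixel axes to the front as channels.
    return x.transpose(1, 3, 0, 2).reshape(nnum * nnum, H, W)

lf = np.random.rand(19 * 4, 19 * 6)   # toy light field with 4 x 6 lenslets
channels = rearrange_lightfield(lf)
print(channels.shape)                 # (361, 4, 6)
```

With nnum = 19 this yields the 19² = 361 input channels stated in the caption; channel u·19 + v holds the sub-pixel at offset (u, v) within every lenslet.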