Table 1 Training hyper-parameters

From: Modeling attention and binding in the brain through bidirectional recurrent gating

 

| Task / dataset | n-Epochs | Batch-size | Learning rate η | η Scheduler milestones | Scheduler γ | L2-rate λ |
|---|---|---|---|---|---|---|
| MNIST | 96 | 128 | 5 × 10⁻⁴ | [32, 64] | 0.2 | 1 × 10⁻⁶ |
| COCO | 48 | 128 | 5 × 10⁻⁴ | 0.25 | OneCycleLR | 1 × 10⁻⁵ |
| CelebA | 32 | 128 | 2 × 10⁻⁴ | [16, ] | 0.1 | 5 × 10⁻⁴ |
| Contrast Detect. | 32 | 64 | 1 × 10⁻⁴ | 0.25 | OneCycleLR | 5 × 10⁻⁴ |
| Contrast Discrim. | 32 | 64 | 1 × 10⁻⁴ | 0.25 | OneCycleLR | 5 × 10⁻⁴ |
| Ori. Change Detect. | 64 | 64 | 5 × 10⁻⁴ | 0.25 | OneCycleLR | 1 × 10⁻⁴ |
| Fig-Grnd-Sep | 64 | 64 | 2 × 10⁻⁴ | 0.125 | OneCycleLR | 1 × 10⁻⁴ |
| Curve Tracing | 64 | 128 | 5 × 10⁻⁴ | | | 1 × 10⁻⁶ |
| CIFAR-100 | 64 | 64 | 5 × 10⁻⁴ | 0.125 | OneCycleLR | 1 × 10⁻⁴ |
| Multi-Modal Search | 64 | 64 | 1 × 10⁻⁴ | | | 5 × 10⁻⁵ |

Contrast Detect.: Contrast detection. Contrast Discrim.: Contrast discrimination. Ori. Change Detect.: Orientation-change detection.
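
To make the milestone entries concrete, the following is a minimal sketch of how a step-decay schedule could be computed from the MNIST and CelebA rows. It assumes that the milestones and γ columns follow the usual step-decay convention (the learning rate is multiplied by γ once at each milestone epoch, with epochs counted from 0); the table itself does not state this. The names `CONFIGS`, `step_decay_lr`, and `schedule` are hypothetical, and the CelebA entry "[16, ]" is read here as a single milestone at epoch 16, which is also an assumption. Rows that list OneCycleLR are not covered.

```python
from bisect import bisect_right

# Values copied from the MNIST and CelebA rows of Table 1.
# "[16, ]" is interpreted as a single milestone (assumption).
CONFIGS = {
    "MNIST": {"epochs": 96, "lr": 5e-4, "milestones": [32, 64], "gamma": 0.2},
    "CelebA": {"epochs": 32, "lr": 2e-4, "milestones": [16], "gamma": 0.1},
}


def step_decay_lr(base_lr, milestones, gamma, epoch):
    """Learning rate at a 0-indexed epoch under step decay.

    Assumed semantics: the base rate is multiplied by gamma once for
    every milestone that is less than or equal to the current epoch.
    """
    n_decays = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** n_decays


def schedule(name):
    """Return the per-epoch learning rates for one row of CONFIGS."""
    cfg = CONFIGS[name]
    return [
        step_decay_lr(cfg["lr"], cfg["milestones"], cfg["gamma"], epoch)
        for epoch in range(cfg["epochs"])
    ]


if __name__ == "__main__":
    mnist = schedule("MNIST")
    # Print the rate just before and at each MNIST milestone.
    for epoch in (0, 31, 32, 63, 64, 95):
        print(f"epoch {epoch:2d}: lr = {mnist[epoch]:.2e}")
```

Under these assumptions the MNIST rate stays at 5 × 10⁻⁴ for epochs 0 to 31, drops to about 1 × 10⁻⁴ at epoch 32, and to about 2 × 10⁻⁵ from epoch 64 onward.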