Table 1 Parameters of the computational model

Hyperparameter	Value/Description
Batch size	512 (number of transitions sampled from the replay buffer)
Discount factor (γ)	0.1 (importance of future rewards)
Epsilon (ϵ)	0.9 initially, decaying to 0.0001 over 1000 episodes
Learning rate	1 × 10⁻⁴ (for the AdamW optimizer)
Soft update rate (τ)	0.005 (rate at which the target network is updated)
Number of observations (N_o)	72 (NN input)
Number of actions (N_a)	16 (NN output)
Time step (Δt)	10 min
Cell velocity (v_c)	0.2 μm/min (mean cell velocity, from ref. 17)
Radius of the cell (R_c)	10 μm
Radius of the cell nucleus (R_N)	4 μm
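
As an illustration of how the values in Table 1 might be collected and used, the following is a minimal sketch rather than the authors' implementation. All names (`Hyperparameters`, `epsilon_at`, `soft_update`, `step_length_um`) are hypothetical. The table does not state the functional form of the epsilon decay, so a linear schedule is assumed here; an exponential schedule is another common choice. The soft update is written in its usual form, θ_target ← τ·θ_policy + (1 − τ)·θ_target, applied to plain lists of floats instead of network weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from Table 1; field names are illustrative only."""
    batch_size: int = 512            # transitions sampled from the replay buffer
    gamma: float = 0.1               # discount factor
    eps_start: float = 0.9           # initial epsilon
    eps_end: float = 1e-4            # final epsilon
    eps_decay_episodes: int = 1000   # episodes over which epsilon decays
    learning_rate: float = 1e-4      # for the AdamW optimizer
    tau: float = 0.005               # soft update rate of the target network
    n_observations: int = 72         # NN input size
    n_actions: int = 16              # NN output size
    dt_min: float = 10.0             # time step, in minutes
    v_cell_um_per_min: float = 0.2   # mean cell velocity
    r_cell_um: float = 10.0          # cell radius
    r_nucleus_um: float = 4.0        # nucleus radius


def epsilon_at(episode: int, hp: Hyperparameters) -> float:
    """Epsilon for a given episode.

    ASSUMPTION: linear interpolation from eps_start to eps_end, then held
    constant. The source only says epsilon decays over 1000 episodes.
    """
    frac = min(max(episode, 0) / hp.eps_decay_episodes, 1.0)
    return (1.0 - frac) * hp.eps_start + frac * hp.eps_end


def soft_update(policy: list[float], target: list[float], tau: float) -> list[float]:
    """Return new target weights: tau * policy + (1 - tau) * target."""
    return [tau * p + (1.0 - tau) * t for p, t in zip(policy, target)]


def step_length_um(hp: Hyperparameters) -> float:
    """Distance covered in one time step at the mean velocity (v_c * dt)."""
    return hp.v_cell_um_per_min * hp.dt_min


if __name__ == "__main__":
    hp = Hyperparameters()
    for ep in (0, 500, 1000, 2000):
        print(f"episode {ep}: epsilon = {epsilon_at(ep, hp):.5f}")
    print("soft update:", soft_update([1.0, 2.0], [0.0, 0.0], hp.tau))
    print("step length (um):", step_length_um(hp))
```

With the tabulated values, one time step at the mean velocity corresponds to a displacement of v_c · Δt = 0.2 μm/min × 10 min = 2 μm, which is one fifth of the cell radius R_c.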