
Fig. 8: Cerebro-cerebellar model facilitates learning in a visual-language task.

From: Cerebro-cerebellar networks facilitate learning through feedback decoupling


a Schematic of the model used in a visual-language task. The image is first processed by a (pretrained) convolutional neural network modelling the visual cortex. The resulting feature vector is then provided to the cerebral RNN (cRNN), which is trained to predict the next word given the previous words of a provided “gold standard” image caption. The cerebellar module \(\mathcal{C}\) is applied only to the cRNN. Top left: task structure with example input image and words (green), ccRNN output words (orange) and target caption (red). b Learning curves in bits per word (BPW) on the validation set for a cerebral feedback horizon of four timesteps; lower values indicate a better model of the language (inset shows the complete learning curve). c Two example images from the validation set with corresponding model captions and gold-standard captions (black). The images shown here were generated on deepAI.org for illustration purposes only. d Normalised model performance across different degrees of feedback horizon in the cerebral network (ns denotes not significant: p = 0.891 (40%), p = 0.116 (45%)). e Normalised caption score (see the “Methods” section) as a function of caption length (ns: p = 0.075 (short), p = 0.189 (medium)). *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 (two-sided paired t-test between cRNN and ccRNN). Error bars represent mean ± SEM across 10 different initial conditions. Source data are provided as a Source Data file.
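For a concrete picture of the pipeline panel a describes, below is a minimal PyTorch sketch: a pretrained CNN supplies an image feature vector that initialises a recurrent next-word predictor trained with teacher forcing, and the backward pass is truncated every few timesteps as a stand-in for the cerebral feedback horizon varied in panel d. All layer sizes, the LSTM choice, the resnet18 encoder, and names such as CaptionRNN are illustrative assumptions; the cerebellar module \(\mathcal{C}\) and the paper's feedback-decoupling mechanism are not reproduced here.

```python
import math

import torch
import torch.nn as nn
from torchvision import models


class CaptionRNN(nn.Module):
    """Sketch of the cerebral RNN (cRNN): next-word prediction from image features."""

    def __init__(self, vocab_size, feat_dim=512, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, hidden_dim)  # image features -> initial hidden state
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTMCell(embed_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, vocab_size)  # next-word logits

    def forward(self, feats, captions, horizon=4):
        # Teacher forcing: at step t the model sees word t of the gold-standard
        # caption and predicts word t+1. Detaching (h, c) every `horizon` steps
        # truncates backpropagation, mimicking a limited feedback horizon; in
        # the paper the cerebellar module predicts the missing feedback instead.
        h = torch.tanh(self.feat_proj(feats))
        c = torch.zeros_like(h)
        logits = []
        for t in range(captions.size(1) - 1):
            if t > 0 and t % horizon == 0:
                h, c = h.detach(), c.detach()
            h, c = self.rnn(self.embed(captions[:, t]), (h, c))
            logits.append(self.readout(h))
        return torch.stack(logits, dim=1)  # (batch, T-1, vocab)


# Frozen pretrained CNN standing in for the visual cortex (resnet18 is an
# assumption). Replacing fc with Identity exposes the 512-d pooled feature vector.
cnn = models.resnet18(weights="IMAGENET1K_V1").eval()
cnn.fc = nn.Identity()

model = CaptionRNN(vocab_size=10_000)
images = torch.randn(2, 3, 224, 224)          # dummy image batch
captions = torch.randint(0, 10_000, (2, 12))  # dummy tokenised captions

with torch.no_grad():
    feats = cnn(images)                       # (2, 512)
logits = model(feats, captions)               # (2, 11, 10000)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), captions[:, 1:].reshape(-1)
)
bpw = loss.item() / math.log(2)  # bits per word, the metric plotted in panel b
```

Converting the cross-entropy from nats to bits (dividing by ln 2) yields the BPW measure of panel b; the `horizon` argument corresponds to the feedback-horizon axis of panel d.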
