Extended Data Fig. 5: Comparing DNN encoding performance across different models. | Nature Neuroscience

From: Dissecting neural computations in the human auditory pathway using deep neural networks for speech

Distribution, across individual electrodes, of the normalized brain prediction score of the best-performing neural encoding model built on each single layer of the DNN model (maximum over delay window lengths). a) Wav2Vec 2.0 unsupervised (SSL) model; b) Wav2Vec 2.0 supervised fine-tuning (SSL + FT) model; c) HuBERT unsupervised (SSL) model; d) HuBERT purely supervised model. Each column corresponds to one area of the auditory pathway, from left to right: AN, IC, HG, STG. Magenta bars indicate CNN output layers; cyan bars indicate Transformer layers. A red star (*) marks the best model for each area; a black dot (.) marks other models that are not statistically different from the best model (p > 0.05, two-sided paired t-test). Box plots show the first and third quartiles across electrodes; the orange line indicates the median, the black line the mean, and the whiskers the 5th and 95th percentiles.
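
To make the box-plot convention in the caption concrete, here is a minimal Python sketch that computes the same summary values (first and third quartiles, median, mean, and 5th/95th percentile whiskers) for one layer's scores across electrodes. The function name `box_summary`, the example scores, and the "inclusive" percentile interpolation are assumptions made for illustration; the caption does not say how the authors computed percentiles.

```python
import statistics


def box_summary(scores):
    """Summary values matching the caption's box-plot convention.

    `scores` holds one normalized prediction score per electrode.
    The "inclusive" interpolation method is an assumption.
    """
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    # n=20 yields cut points at 5%, 10%, ..., 95%.
    twentieths = statistics.quantiles(scores, n=20, method="inclusive")
    return {
        "q1": q1,                        # lower edge of the box
        "median": median,                # orange line
        "q3": q3,                        # upper edge of the box
        "mean": statistics.fmean(scores),  # black line
        "whisker_low": twentieths[0],    # 5th percentile
        "whisker_high": twentieths[-1],  # 95th percentile
    }


# Hypothetical scores for 101 electrodes, evenly spaced from 0.00 to 1.00.
example_scores = [i / 100 for i in range(101)]
summary = box_summary(example_scores)
```

Note that whiskers at the 5th and 95th percentiles differ from the common 1.5 × IQR default used by many plotting libraries, so a plotting call would need that setting changed to reproduce this convention.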
