Supplementary Figure 7: Computational Modeling approach.
From: Explicit information for category-orthogonal object properties increases along the ventral stream

(a) In previous work (Yamins, D., and Hong, H. et al. (2014) Proc Natl Acad Sci USA 111, 8619–24), we developed a model of areas in higher ventral cortex. Our approach was based on optimizing model parameters to maximize performance of the model’s output layer on a challenging object recognition task involving real-world images in a large number of semantically distinct categories. (Note: The cat-vs-dog task shown in the figure is just for illustration purposes; the actual task on which the model was optimized contained no animal images.) This model was then used to predict neural responses in macaque V4 and IT cortex, recorded on the images shown in Supplementary Fig. 1. (b) The top layer of the performance-optimized model generalizes from the photographic training set (red bar), and significantly outperforms control models (gray bars) and the V4 neural population on the 8-way object categorization task (Animals vs Boats vs Cars vs Chairs vs Faces vs Fruits vs Planes vs Tables) in the images shown in Supplementary Fig. 1a. Model performance is comparable to IT neural population (blue bar) and human performance measured via psychophysical experiments (black bar). (c) The performance-optimized model is then used predict neural response in IT cortex (top panel) and V4 cortex (bottom bars). Ability to predict IT neural patterns is better with each subsequent model layer, peaking at the top layer (red bars), whereas ability to predict V4 neurons peaks in the middle layers. For both V4 and IT, the performance-optimized model’s most predictive layer is significantly better than other control models, including ideal observers that perform perfectly on categorization tasks (purple bars) as well as control models that are also in the general class of neural networks (gray bars). All panels in this figure are related to previously published results (Yamins, D., and Hong, H., et al. (2014) Proc Natl Acad Sci USA 111, 8619–24). (d) Computational model training set performance timecourses. x axis represents amount of training data during model training, for the computational model used in Fig. 6 and Supplementary Figs. 8,9,10. y axis represents error rate, measured in percent incorrect. Gray line is performance on a held-out subset of the images in the overall training image set; black line is performance on actual subset of images on which the model was trained. (See Methods section for details of training.)