The impact of spatial frequency on hierarchical category representation in macaque temporal cortex

Farhang, Esmaeil; Toosi, Ramin; Karami, Behnam; Koushki, Roxana; Kheirkhah, Narges; Shakerian, Farideh; Noroozi, Jalaledin; Rezayat, Ehsan; Vahabie, Abdol-Hossein; Dehaqani, Mohammad-Reza A.

doi:10.1038/s42003-025-08230-5

Article
Open access
Published: 25 May 2025

Communications Biology volume 8, Article number: 801 (2025)

Abstract

Objects are recognized in three hierarchical levels: superordinate, mid-level, and subordinate. Psychophysics shows that mid-level categories and low spatial frequency (LSF) information are rapidly recognized. However, the interaction between spatial frequency (SF) and abstraction is not well understood. To address this, we examine neural responses in the inferior temporal cortex and superior temporal sulcus of two male macaque monkeys. Our findings reveal that mid-level categories are well represented at both LSF and high SF (HSF), suggesting robust mid-level boundary maps in these areas, unaffected by SF changes. Conversely, superordinate category representation depends on HSF, indicating its crucial role in encoding global category information. The absence of subordinate representation in both LSF and HSF compared to intact stimuli further implies that full SF content is essential for fine-category processing. A supporting human psychophysics task confirms that superordinate categorization relies on HSF, while subordinate object recognition requires both LSF and HSF.

Introduction

Object recognition, a fundamental aspect of visual perception, operates at varying levels of abstraction, including the super-ordinate (e.g., animate vs. inanimate), mid-level (e.g., face vs. body), and sub-ordinate (e.g., face identity) categories. Numerous studies have examined hierarchical object recognition^1,2,3,4. Spatial frequency (SF), ranging from low (coarse shapes) to high (fine details), also significantly influences visual perception and plays a pivotal role in the neural representation of objects^5,6,7,8. However, the role of SF in hierarchical object recognition is not well understood. Understanding this role is essential for uncovering the neural mechanisms that enable rapid and efficient visual processing in both natural and artificial vision systems. Clarifying how SF bands contribute to distinct levels of abstraction could inform both neuroscientific theory and practical applications in AI and visual perception models.

Biologically, our visual system is adept at utilizing different SF bands for various categorization tasks, implying an evolutionary adaptation to process visual information efficiently. This adaptability is evident in tasks like house vs. flower categorization, which is more effective at high SFs (HSF) (SF > 24 cycles image⁻¹), or flower vs. face categorization, which shows better performance at low SFs (LSF) (SF < 8 cycles image⁻¹)⁹. These distinctions imply that our visual system may have evolved to use specific SF bands as shortcuts for different categorization tasks, enabling rapid and contextually appropriate responses. Such preferences underscore the importance of SF in optimizing neural resources for various recognition tasks, depending on the category demands. Moreover, studies show that SF can be explicitly decoded from neuronal responses in the occipitotemporal cortex¹⁰ and the inferior temporal (IT) cortex¹¹. Cheung and Bar showed that LSF bands of an image (SF < 8 cycles image⁻¹) drive top-down predictions that facilitate object recognition¹². The middle frequencies (16–24 cycles image⁻¹) are shown to be important for natural scene categorization (subjects need to determine the presence or absence of a car in a natural scene)¹³. Caplette et al. used a delayed match-to-sample task with randomly filtered stimuli and showed that objects are better represented in the middle SF bands (13.8–24 cycles image⁻¹)¹⁴. Middle and high frequencies are also suggested as the best SF band for face recognition. Studies show that the appropriate frequency band for face recognition is between 5 and 25 cycles face⁻¹ ^15,16,17,18,19. These findings collectively underscore that the visual system’s reliance on distinct SF bands for different tasks likely reflects an adaptive optimization of processing efficiency. By selectively recruiting certain SF ranges (such as LSF for broad categorizations or middle-to-high SFs for detailed recognition like faces), the brain minimizes computational demands and enhances speed in processing complex visual scenes. This selective SF utilization highlights a sophisticated neural mechanism for prioritizing certain visual features based on task requirements, underscoring the importance of SF in shaping hierarchical perception.

Moreover, psychophysical evidence has shown poor performance and higher reaction time in super-ordinate level categorization (living vs. non-living) in LSF²⁰, suggesting the necessity of finer shape information content for super-ordinate level categorization. However, Ashtiani et al. suggest the better performance of super-ordinate level categorization (animal/non-animal) in LSFs compared to the mid and sub-ordinate levels, which appear to require finer shape information in higher SFs and take longer to be processed²¹. Studies indicate that the visual system uses various SF bands to perform the categorization tasks, which could be affected by the task and involved objects, suggesting a biological basis for the differential impact of SF bands on various category levels. Therefore, visual perception is greatly affected by both the SF contents and the level of abstraction. Despite the extensive examination of object recognition at various abstraction levels, the specific role of SF in modulating hierarchical categorization has not been fully explored. By addressing this interaction, this study aims to shed light on how the visual system dynamically allocates processing resources across SF bands to achieve optimal recognition across different levels, revealing underlying mechanisms in neural processing.

The IT cortex lies at the end of the ventral visual pathway in nonhuman primates and contains various category-selective neurons^1,22,23,24. A faster representation of more abstract versions^25,26 and mid-level categories compared to super- and sub-ordinate levels⁴ has been suggested in this region; however, the effect of SF on the representation of the categories in different levels of abstraction requires further investigation. This study is motivated by the hypothesis that different SF bands play distinct roles in the biological process of object categorization. Specifically, we explore how low spatial frequencies (LSF, 1–10 cycles image⁻¹) and high spatial frequencies (HSF, 18–75 cycles image⁻¹) differentially contribute to categorizing objects at various levels of abstraction (super-ordinate, mid-level, and sub-ordinate). To examine the effect of SF content on categorical representation at multiple levels, the responses of IT and STS neurons to various visual stimuli in three levels of abstraction and three levels of SF content (i.e., intact, high, and low) are recorded. We show that mid-level categories (e.g., face-body) are represented at all levels of SF content, while in sub-ordinate levels (e.g., identity), only the intact stimuli are represented in IT and STS neurons. We observed that the representation of animacy level (i.e., super-ordinate level) requires fine shape information (HSF contents), whereas there is no super-ordinate level information in LSF stimuli. Therefore, both HSF and LSF information are necessary for the sub-ordinate representation, while only the HSF information is essential for the super-ordinate levels of abstraction. Thus, our findings suggest the necessity of finer shape information (i.e., HSF content) for global object categorization (super-ordinate). Furthermore, fine and coarse shape information (i.e., HSF and LSF, respectively) are critical for finer categories in visual processing. Finally, we conducted a human psychophysics task to evaluate our findings in human decision-making. We found that, as in the neurophysiological data, super-ordinate categorization in HSF is performed as well as in intact stimuli, while LSF decreases the performance. At the sub-ordinate level, removing either the LSF or the HSF components degrades the performance. Furthermore, a faster reaction time for mid-level stimuli is observed in all SF ranges.

Results

To investigate the impact of SF content on hierarchical object categorization using neural activities of the IT and STS areas, we designed a rapid serial visual presentation (RSVP)^27,28,29 (stimulus duration, 50 ms; interstimulus interval, 450 ms) task (Fig. 1b). Eighty-one images of real-world objects in three SF levels (Fig. 1c) were displayed to passive-viewing monkeys while individual IT neurons were recorded. The cells were located across the IT cortex and STS (Fig. 1a). For simplicity, we refer to this combined area as “IT” throughout the paper; however, “IT” here includes all recorded neurons from both the IT cortex and the STS. Results rely on the neural population response analysis for all stimuli. The stimuli were images of eighty-one grayscale natural and artificial objects. The organization of visual object categories at three levels of abstraction enables us to examine the category representation at different levels of abstraction hierarchy for each SF content (Fig. 1c). We divided the stimuli into animate and inanimate categories. We then divided the animate stimuli into faces and bodies, and human faces into female and male faces⁴. This categorical sort is compatible with prior studies about natural categorical representations in the monkey and human IT^24,30,31,32. Throughout the results, only neurons that were responsive across all stimuli were included in the analysis to ensure reliability. A neuron was considered responsive if its average firing rate between 70 and 170 ms after the onset of all stimuli was significantly higher than its firing rate during a baseline period, spanning from 50 ms before to 50 ms after stimulus onset. In total, there exist 330 responsive neurons (45 were recorded from the STS), and all population analyses are conducted on all 330 responsive neurons.
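
The responsiveness criterion above can be sketched as follows. This is a minimal illustration, not the authors' code: the response window (70–170 ms) and baseline window (−50 to 50 ms) come from the text, while the statistical test (a one-sided paired sign-flip permutation test), the function names, and the `alpha`, `n_perm`, and `seed` parameters are assumptions, since the text does not name the test used.

```python
import random
from statistics import mean


def window_rate(spike_times_ms, start_ms, end_ms):
    """Firing rate (spikes/s) of one trial within [start_ms, end_ms)."""
    count = sum(start_ms <= t < end_ms for t in spike_times_ms)
    return 1000.0 * count / (end_ms - start_ms)


def is_responsive(trials, alpha=0.05, n_perm=2000, seed=0):
    """Decide whether a neuron is responsive.

    trials: list of per-trial spike-time lists (ms, relative to stimulus onset).
    Returns (responsive, p_value).
    """
    # Windows from the text: response 70-170 ms, baseline -50 to 50 ms.
    diffs = [window_rate(t, 70, 170) - window_rate(t, -50, 50) for t in trials]
    observed = mean(diffs)

    # Assumed test: one-sided paired sign-flip permutation test.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        flipped = mean(d if rng.random() < 0.5 else -d for d in diffs)
        if flipped >= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return p_value < alpha, p_value


# Hypothetical neuron: 5 spikes in the response window, 1 in the baseline window.
driven = [[-20, 80, 90, 100, 120, 150]] * 10
responsive, p = is_responsive(driven)
```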

Global category information needs fine shape information

To show the effect of the SF filtering on IT representations, we investigate the changes in neuron responses in LSF and HSF compared to the intact representations using the neuron response modulation index (RMI). RMI shows the difference between two variables (e.g., firing rates for LSF and intact) normalized to their sum. Figure 2a shows the index for various categories under LSF and HSF conditions. HSF affects the neuron responses significantly more than LSF (animate: LSF = 0.006 ± 0.01 (p = 0.16), HSF = 0.063 ± 0.01 (p = 2e-9), and HSF > LSF: p = 5e-50; inanimate: LSF = 0.01 ± 0.013 (p = 0.22), HSF = 0.080 ± 0.01 (p = 7e-11), and HSF > LSF: p = 7e-44; face: LSF = 0.025 ± 0.01 (p = 0.01), HSF = 0.066 ± 0.01 (p = 9e-9), and HSF > LSF: p = 1e-44; body: LSF = −0.003 ± 0.01 (p = 0.72), HSF = 0.068 ± 0.01 (p = 2e-11), and HSF > LSF: p = 1e-46). While the impact of HSF (i.e., removal of LSF components) does not vary significantly among the categories (p = 0.80, one-way analysis of variance (ANOVA)), LSF (i.e., removal of HSF components) significantly affects neuronal responses only for the face category.
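
As a rough illustration of the RMI described above (a difference normalized to the sum), consider the sketch below. The order of subtraction (intact minus filtered) and the handling of a zero denominator are assumptions; the text only states that the index is a normalized difference between two firing rates.

```python
def response_modulation_index(rate_intact, rate_filtered):
    """Normalized difference between two firing rates, bounded in [-1, 1].

    Assumption: the index is (intact - filtered) / (intact + filtered);
    the source text does not state which rate is subtracted from which.
    """
    total = rate_intact + rate_filtered
    if total == 0:
        return 0.0  # assumed convention for a silent neuron
    return (rate_intact - rate_filtered) / total


# Hypothetical neuron firing at 12 spikes/s (intact) and 8 spikes/s (filtered).
example_rmi = response_modulation_index(12.0, 8.0)
```

Under this convention, a value near zero means the filtered stimulus drives the neuron about as strongly as the intact one.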

**Fig. 2: The effect of SF on hierarchical category representation.**

Next, the impact of SF filtering on population decoding performance is investigated using a support vector machine (SVM) classifier. The impact of SF on 2D hierarchical category representation is illustrated in Supplementary Fig. 1. The time course of the accuracy of the classifier on each abstraction level using the leave-p-out method is illustrated in Fig. 2b-d. The peak accuracy rate for all levels of abstraction in intact (mid = 0.92 ± 0.025, super = 0.90 ± 0.018, sub = 0.81 ± 0.038) shows the generalizability of representations in full SF. In LSF (Fig. 2c), the performance of mid-level (face vs. body) is less affected than the performances of super- and sub-ordinate categorizations, which are highly decreased (mid = 0.82 ± 0.032, super = 0.64 ± 0.028, sub = 0.56 ± 0.041), suggesting the importance of HSF for both super- and sub-ordinate levels. On the other hand, in HSF (Fig. 2d), the accuracy rate of both mid-level and super-ordinate categories is close to the intact condition, while the performance at the sub-ordinate level is decreased as in LSF (mid = 0.87 ± 0.025, super = 0.85 ± 0.026, sub = 0.58 ± 0.050). These observations suggest the robustness of mid-level decoding capabilities in all SF ranges, while HSF is necessary for the super-ordinate level, and both LSF and HSF have vital information for sub-ordinate categorization. While the SVM method shows the generalizability of the representations, it reduces each sample to correctly or wrongly classified and does not consider their relative distance. To overcome this limit, we use the Separability Index (SI)⁴ to better understand the impact of SF on population decoding capabilities in three levels of abstraction.
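
The leave-p-out evaluation scheme mentioned above can be sketched in a few lines. To keep the example dependency-free, a nearest-centroid classifier is swapped in for the SVM used in the study; the value `p=2`, the function names, and the toy population vectors are all assumptions made for illustration only.

```python
from itertools import combinations


def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose training centroid is closest to `sample`.

    Stand-in classifier: the study itself uses an SVM.
    """
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], sample))


def leave_p_out_accuracy(x, y, p=2):
    """Hold out every subset of p samples, train on the rest, test on the subset."""
    indices = range(len(x))
    correct = total = 0
    for held_out in combinations(indices, p):
        held = set(held_out)
        train_x = [x[i] for i in indices if i not in held]
        train_y = [y[i] for i in indices if i not in held]
        for i in held_out:
            correct += nearest_centroid_predict(train_x, train_y, x[i]) == y[i]
            total += 1
    return correct / total


# Hypothetical population vectors (rows: stimuli, columns: neurons).
responses = [[5, 1], [6, 0], [5, 0], [0, 5], [1, 6], [0, 6]]
labels = ["face"] * 3 + ["body"] * 3
accuracy = leave_p_out_accuracy(responses, labels, p=2)
```

Repeating this per time bin would give an accuracy time course of the kind shown in Fig. 2b-d, though the binning used in the study is not specified in this section.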

Using a Separability Index (SI)⁴ that determines the proportion of between-category and within-category distances of stimuli based on the IT responses, we showed the strength and reliability of category discrimination in IT neural populations. Figure 3 (a-left) demonstrates the time course of SI for three abstraction levels for the intact SF condition. This figure illustrates that SI reaches significant values in all categories after the stimulus onset. Figure 3 (a-right) shows that the SI value for mid-level is significantly higher than super-ordinate and sub-ordinate levels (SI value for mid-level = 1.67 ± 0.09, super-ordinate = 0.27 ± 0.01, sub-ordinate = 0.32 ± 0.03, mid-level vs. super- or sub-ordinate p < 0.001, super-ordinate vs. sub-ordinate p = 0.11). To analyze the time course of category representations in different SF bands, time courses and SI values in LSF and HSF are presented in Fig. 3b, c. Moreover, in the LSF, only the SI value for the mid-level is significant (SI value in LSF for mid-level = 0.87 ± 0.05, super-ordinate = 0.11 ± 0.06, sub-ordinate = 0.15 ± 0.11; compared to baseline mid-level p < 0.001, super-ordinate p = 0.09, and sub-ordinate p = 0.47; onset time for mid-level = 99.81 ± 1.63).
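
A between/within distance ratio of the kind described above might be computed as in the sketch below. The exact SI definition is given in the cited work and the Methods, not in this section, so the formula here (mean Euclidean between-category distance divided by mean within-category distance) is an assumption, as are the function name and the toy data.

```python
from itertools import combinations
from math import dist


def separability_index(patterns, labels):
    """Ratio of mean between-category to mean within-category distance.

    Assumed form: Euclidean distances between population response vectors.
    Values above 1 mean stimuli from different categories lie farther apart
    than stimuli from the same category.
    """
    between, within = [], []
    for i, j in combinations(range(len(patterns)), 2):
        d = dist(patterns[i], patterns[j])
        (within if labels[i] == labels[j] else between).append(d)
    return (sum(between) / len(between)) / (sum(within) / len(within))


# Hypothetical population vectors for four stimuli in two categories.
vectors = [[0, 0], [0, 1], [10, 0], [10, 1]]
si_separated = separability_index(vectors, ["face", "face", "body", "body"])
si_mixed = separability_index(vectors, ["face", "body", "face", "body"])
```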

**Fig. 3: Time course and separability index (SI) values for each level of abstraction in intact (I), low spatial frequency (L), and high SF (H) stimuli for the population of recorded IT neurons.**

As shown in Fig. 3b, in the LSF condition, SI reaches significant values at mid-level abstraction after stimulus onset, while it does not reach significant values at the super-ordinate and sub-ordinate levels of abstraction. Concerning LSF content (Fig. 3b right), the SI value is still significant at the mid-level; however, we failed to find any statistically significant discrimination at the super-ordinate and sub-ordinate levels. In the HSF (Fig. 3c), SI reaches significant values at the mid-level and super-ordinate levels of abstraction (SI value in HSF for mid-level = 1.44 ± 0.08, super-ordinate = 0.29 ± 0.02; compared to baseline mid-level and super-ordinate p < 0.001). In contrast, SI does not reach significant values at the sub-ordinate level of abstraction (SI value in HSF for sub-ordinate = 0.15 ± 0.05; compared to baseline p = 0.24). Further, as illustrated in Fig. 3, the SI value for the mid-level is significantly higher than that for the super-ordinate and the sub-ordinate levels in all SFs (in all SFs, mid-level vs. super- or sub-ordinate p < 0.001). Furthermore, the sub-ordinate level does not contain significant information in LSF or HSF. Therefore, we concluded that mid- and super-ordinate level representations were maintained after the LSF removal, while only mid-level representations persisted when HSF was removed. In the intact stimuli, all abstraction levels were represented. Nevertheless, the representative information for mid-level is significantly higher than for super-ordinate and sub-ordinate levels.

The results of this section indicate that the mid-level categories are well represented in all SF bands in the IT cortex. However, the presence of HSF contents is critical for representing information at the super-ordinate level since the removal of the HSF contents (LSF condition) significantly impacts the super-ordinate level information. The representation of the sub-ordinate level is highly affected by both LSF and HSF filtering, suggesting that all SF bands are necessary for the sub-ordinate level. The observations suggest that the global object information requires finer shape information, and the fine object information requires a variety of SF contents ranging from LSF to HSF. The results presented in the paper are consistent when examining individual animals, with Supplementary Fig. 2 showing the main finding separately for the two monkeys in this study.

Since IT is a large area, the recording location plays a crucial role in our study. For this reason, we pinpointed neurons spread across the anterior IT (AIT), central IT (CIT), and posterior IT (PIT) regions. Interestingly, our findings remained consistent across different recording sites as outlined in Supplementary Fig. 3.

Interaction between SF and category in the mid-level of abstraction

To individually investigate the representation of mid-level categories, i.e., body and face, the time course of the SI of these categories and the animate category was calculated against the inanimate (Fig. 4). These calculations are done for HSF, intact, and LSF stimuli. According to Fig. 4a, the discrimination of body is higher than that of face and animate categories in intact (peak SI value for animate = 0.27 ± 0.01, face = 0.71 ± 0.03, and body = 1.12 ± 0.05; body vs. face p < 0.001, face vs. animate p < 0.001). This pattern is repeated for HSF stimuli (peak SI value for animate = 0.29 ± 0.02, face = 0.47 ± 0.02, and body = 1.26 ± 0.05; body vs. face p < 0.001 and face vs. animate p < 0.001) (Fig. 4c). In LSF (Fig. 4b), the SI value for body drops significantly while the face information is comparable to HSF (peak SI value for animate is not significant against the baseline with p = 0.09, face = 0.45 ± 0.04, and body = 0.32 ± 0.02; peak SI value for the body in intact vs. LSF p < 0.001 and HSF vs. LSF p < 0.001; face in intact vs. HSF p < 0.001 and HSF vs. LSF p = 0.28).

**Fig. 4: Temporal dynamics of the separability index (SI) for face, body, and animate vs. inanimate, and the magnitude of SI differences between the intact and both HSF and LSF conditions.**

To investigate the effect of SF filtering on category information, we use the SF modulation index (SMI). To measure the impact of SF on the information of face, body, and animate categories, SMI evaluates the SI values in HSF and LSF against the intact condition. SMI shows the proportion of information in the given category that HSF or LSF contents do not contain. Consequently, a larger SMI value for a given SF and category indicates that the neural population does not use that SF content to represent that category. In contrast, smaller values suggest that the neural population uses that SF content to represent the category. Figure 4d shows the SMI value for three categories. Left to right, the first two bar plots illustrate the SMI for the animate category in LSF and HSF, respectively. As mentioned, the animate group has no significant SI value in LSF, so removing HSF content considerably degrades animate category discrimination. However, the SMI value shows no significant difference between intact and HSF (animate SMI value for HSF = −0.01 ± 0.02 and p = 0.18). Face information loss in LSF is slightly greater than in HSF (SMI for LSF = 0.26 ± 0.03 and HSF = 0.19 ± 0.02; SMI in LSF vs. HSF p = 0.03). The statistical test implies that the IT neuronal population represents the face category in both LSF and HSF with roughly similar information.

Furthermore, the body discrimination in HSF has been maintained and improved, while in LSF, the information drops significantly (SMI value for LSF = 0.54 ± 0.01 with p < 0.001 and HSF = −0.06 ± 0.02 with p < 0.01; SMI value in HSF vs. LSF p < 0.001). This observation suggests that the body is well represented in HSF. Face and body observations demonstrate an interaction between SF and category, even at the same level of abstraction (mid-level here).
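
For illustration, one plausible form of the SMI is the difference between the intact and filtered SI values normalized by their sum. This form is an assumption: the section only says that filtered SI values are evaluated against the intact condition. Plugging in the peak SI values for body reported above gives numbers close to the reported SMI values, which is suggestive but does not confirm the definition.

```python
def sf_modulation_index(si_intact, si_filtered):
    """Assumed SMI: (SI_intact - SI_filtered) / (SI_intact + SI_filtered).

    Positive values: information lost under SF filtering.
    Negative values: discrimination improved under SF filtering.
    """
    return (si_intact - si_filtered) / (si_intact + si_filtered)


# Peak SI values for body taken from the text: intact 1.12, LSF 0.32, HSF 1.26.
body_smi_lsf = sf_modulation_index(1.12, 0.32)  # about 0.56 (reported SMI: 0.54)
body_smi_hsf = sf_modulation_index(1.12, 1.26)  # about -0.06 (reported SMI: -0.06)
```

The small mismatch for LSF would be expected if the reported SMI is averaged over resamples rather than computed from peak values, but that is also a guess.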

The unsupervised data-driven approach confirms the interaction between SF and abstraction level

Hierarchical clustering analysis is an unsupervised method that assumes some categorical structure for the data, yet it does not imply any unique grouping into categories³¹. Clustering scores can be viewed as an unsupervised grouping representation. If the clustering score for a given category is high, then the representation of the samples within that category will be more similar than that of the samples in other categories. Accordingly, each category could be analyzed separately based on the clustering scores.

Hierarchical cluster trees have been computed for the IT response patterns, as shown in Fig. 5a; furthermore, they have been computed in the early and late phases of the responses for each SF. In the early phase for intact, 80–110 ms after stimulus onset, the colored bullet points on the clustering tree show that face and inanimate categories started to be clustered. Furthermore, the body category was clustered at the late phase, i.e., 155–185 ms, and the face and inanimate categories were also clustered at that time interval. A similar pattern can be observed in HSF and LSF. To quantify these observations, each row in Fig. 5b describes the clustering score for each category calculated between 70 and 320 ms after the stimulus onset for intact, LSF, and HSF contents, respectively. Furthermore, clustering scores for all categories in the late and early phases are reported in Table 1. According to Fig. 5b, the clustering score of the face is significantly higher than that of the body in intact stimuli (clustering score of face = 0.096 ± 0.005 and body = 0.086 ± 0.005; face vs. body p = 0.02). These patterns are preserved in both HSF and LSF (clustering score in LSF for face = 0.055 ± 0.004 and body = 0.048 ± 0.002; face vs. body p = 0.03; in HSF face = 0.068 ± 0.004 and body = 0.052 ± 0.003; face vs. body p < 0.001). Furthermore, the clustering score for body and face (i.e., mid-level) is significantly higher than other categories, consistent with the previous supervised approach (clustering score for intact: animate = 0.021 ± 0.009, inanimate = 0.056 ± 0.003, identity1 = 0.014 ± 0.002, identity2 = 0.018 ± 0.005; face/body vs. animate/inanimate/identity1/identity2 p < 0.001; for LSF: animate = 0.010 ± 0.001, inanimate = 0.009 ± 0.002, identity1 = 0.032 ± 0.002, identity2 = not significant; face/body vs. animate/inanimate/identity1/identity2 p < 0.001; for HSF: animate = 0.035 ± 0.001, inanimate = 0.047 ± 0.001, identity1 = 0.053 ± 0.002, identity2 = 0.013 ± 0.001; face vs. animate/inanimate/identity1/identity2 p < 0.001 and body vs. animate/identity2 p < 0.01, body vs. inanimate p = 0.09, body vs. identity1 p = 0.43). The clustering score of identity1 is significantly higher in LSF and HSF than in intact (intact = 0.014 ± 0.002, LSF = 0.032 ± 0.002, HSF = 0.053 ± 0.002; HSF vs. LSF p < 0.001, LSF vs. intact p < 0.001). Similar to the supervised method, the clustering score of the super-ordinate categories (animate and inanimate) in LSF is significantly lower than in intact and HSF (intact: animate = 0.021 ± 0.009, inanimate = 0.056 ± 0.003, LSF: animate = 0.010 ± 0.001, inanimate = 0.009 ± 0.002, and HSF: animate = 0.035 ± 0.001, inanimate = 0.047 ± 0.001; intact/HSF vs. LSF p value for both categories < 0.001). According to these observations, an interaction between SF and abstraction level could improve or degrade a category’s discriminability in neural representation. The time windows for the early and late phases (as shown in Figs. 5 and 6 and Table 1) were chosen according to the onset and peak times of each frequency level to more effectively highlight any differences.

**Fig. 5: Hierarchical clustering and clustering scores of inferior temporal (IT) responses for Intact, LSF, and HSF.**

Table 1 Degrees of the match between tree nodes and categories, reconstructed from responses of inferior temporal (IT) cells in the early and late phase of neural responses for Intact, Low Spatial Frequency (SF) and High SF stimuli

**Fig. 6: Representational dissimilarity matrices (RDMs) and the time course of correlations with the model for monkeys’ inferior temporal (IT) cortex responses at three spatial frequencies (SFs) and different time intervals.**

Representational dissimilarity matrices (RDMs) are constructed by comparing all stimuli to one another. As a multivariate method related to population vector analysis, RDM analysis can reveal information about distributed representation patterns across brain regions. By applying it, we can examine the primary representational organization of information within the brain activity patterns, providing a framework to test assumptions about how that information is structured. The underlying premise is that stimuli with more similar representations are harder to decode. Figure 6a shows the empirical RDMs averaged across stimuli for intact, LSF, and HSF in three intervals. Each cell in the matrix describes the dissimilarity measured by correlation distance (i.e., 1 – r, where r is the Pearson correlation coefficient) between the IT activation patterns for one pair of stimuli. In this figure, in the first row for the intact stimuli, the 80–110 ms interval is representative of onset latency; whenever one stimulus is a face and the other belongs to the non-face category, the dissimilarity is considerable. In the second interval, 155–185 ms, the within-dissimilarity for face and body categories decreases, and the between-dissimilarity becomes more prominent; this pattern is also observed among the body versus inanimate categories. Also, face identities have small within-dissimilarity and considerable between-dissimilarity.
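
The correlation-distance RDM described above (each cell is 1 – r between the population response patterns of two stimuli) can be sketched as follows. The function names and the toy response vectors are hypothetical; time-windowing and averaging steps used in the study are not reproduced.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_distance_rdm(patterns):
    """Symmetric matrix of 1 - r for every pair of stimulus response patterns."""
    n = len(patterns)
    rdm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - pearson_r(patterns[i], patterns[j])
            rdm[i][j] = rdm[j][i] = d
    return rdm


# Hypothetical responses of four neurons to three stimuli.
population = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
rdm = correlation_distance_rdm(population)
```

Entries range from 0 (identical response profiles up to scale and offset) to 2 (perfectly anti-correlated profiles).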

The second row shows the RDMs for LSF stimuli, in which, during 85–115 ms after stimulus onset, the within-dissimilarity of the inanimate group decreases. In addition, at 165–195 ms, the within-dissimilarity of faces and non-faces is reduced, and the dissimilarity between them grows. Moreover, the between-dissimilarity of faces and bodies increases, and their within-dissimilarity decreases. Finally, the third row shows the RDMs for HSF stimuli. During the 95–125 ms interval, the within-dissimilarity between the human face and body is lower than for other stimuli. In addition, during 175–205 ms, the within-dissimilarity for bodies and faces diminishes, and the between-dissimilarity increases.

We can observe the separation of face versus inanimate. The RDMs illustrate the representation patterns for several abstraction categories at various SFs. The distinctness and dissimilarity of categories at mid-level abstraction can also be seen by inspecting the RDMs of the late-phase responses at both high and low SFs. Furthermore, increasing category separation based on dissimilarity can be observed by comparing each late-phase RDM with the corresponding early-phase one at each SF. Nevertheless, in these RDMs, the contrast between categories at high- and low-level abstractions is unclear.

Figure 6b shows the correlation with a model, i.e., the Pearson correlation of the empirical RDMs with the reference RDM for each hierarchy level at each SF (see “Methods”); the correlation value directly relates to the separability of categories. Although the mid-level (face-body) correlation is maintained in both HSF and LSF, the correlation values are lower than for intact stimuli. However, the correlations with the model at the super-ordinate level (animate vs. inanimate) and sub-ordinate level (face identity) are preserved in HSF and eliminated in LSF.
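
A model correlation of this kind can be sketched by correlating the off-diagonal entries of an empirical RDM with those of a reference RDM. The binary reference used here (0 within a category, 1 between categories) is an assumption for illustration; the actual reference RDMs are defined in the Methods. The function names and the toy matrix are hypothetical.

```python
from math import sqrt


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def binary_model_rdm(labels):
    """Assumed reference RDM: 0 for same-category pairs, 1 otherwise."""
    return [[0.0 if a == b else 1.0 for b in labels] for a in labels]


def rdm_model_correlation(empirical_rdm, labels):
    """Pearson correlation between empirical and reference RDM entries."""
    x = upper_triangle(empirical_rdm)
    y = upper_triangle(binary_model_rdm(labels))
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical empirical RDM: low dissimilarity within, high between categories.
toy_rdm = [
    [0.0, 0.2, 0.9, 0.9],
    [0.2, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
model_r = rdm_model_correlation(toy_rdm, ["face", "face", "body", "body"])
```

Only the upper triangle is used because an RDM is symmetric and its diagonal is zero by construction, so including those entries would inflate the correlation.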

Behavioral evidence supports the effect of SF on hierarchical object representation

To complement the neurophysiological findings, we conducted a psychophysical task to investigate how SF impacts object recognition at different levels of abstraction. Our goal was to examine whether our findings on the role of SF in the hierarchical organization of object categories and their representation in the IT cortex would also manifest in the behavioral responses of human subjects. In the task, participants were presented with a series of SF-filtered images of objects from different categories at varying levels of abstraction. We utilized the same stimulus set as our neurophysiological experiment except for the faces, which are substituted with six new faces of six different identities in both male and female genders (see “Methods”). We used a forced two-choice paradigm, where participants were asked whether the presented stimulus belonged to a specific category or identity (for identities, we used the names assigned to them in the training phase). Each trial consists of the question for 500 ms, 25 ms of stimulus presentation, 500 ms of the mask, 500 ms of the blank page, and the yes/no answer. Participants are asked to answer as accurately and quickly as possible, and their reaction time and answers are collected. Figure 7a shows the task process and added face stimuli.

Figure 7b shows the average accuracy of participants separated by level of abstraction and SF content. Consistent with our findings, the accuracy of participants at the mid-level of abstraction is significantly higher than at the other two levels in intact, LSF, and HSF stimuli (intact: mid = 0.92 ± 0.02 > super-ordinate = 0.89 ± 0.01 (p = 0.03) and > sub-ordinate = 0.78 ± 0.03 (p = 0.001); LSF: mid = 0.84 ± 0.03 > super-ordinate = 0.73 ± 0.02 (p = 5e-4) and > sub-ordinate = 0.58 ± 0.02 (p = 8e-5); HSF: mid = 0.91 ± 0.02 > super-ordinate = 0.87 ± 0.02 (p = 0.03) and > sub-ordinate = 0.70 ± 0.02 (p = 6e-5)). Considering the super-ordinate level, there is no significant difference in categorization accuracy between intact and HSF (intact = 0.89 ± 0.01, HSF = 0.87 ± 0.02, and p = 0.21), while both are significantly higher than LSF (LSF = 0.73 ± 0.02 < HSF (p = 6e-5) and < intact (p = 5e-5)). Finally, at the sub-ordinate level, removing either the LSF or the HSF components significantly degrades the categorization performance (intact = 0.78 ± 0.03 > LSF = 0.58 ± 0.02 (p = 1e-4) and > HSF = 0.70 ± 0.02 (p = 6e-5)). These findings are consistent with our neurophysiological observations, where HSF is necessary for the super-ordinate level, while sub-ordinate representation relies on both LSF and HSF.

Reaction times separated for correct and wrong answers are depicted in Fig. 7c (left: correct, right: wrong). Reaction times for the correct answers show the mid-level temporal advantage: in all SF contents, the reaction time for the mid-level is significantly shorter than for the other two levels (intact: mid = 0.39 ± 0.03 < super-ordinate = 0.53 ± 0.04 (p = 2e-4) and < sub-ordinate = 0.72 ± 0.08 (p = 1e-4); LSF: mid = 0.42 ± 0.04 < super-ordinate = 0.63 ± 0.05 (p = 2e-4) and < sub-ordinate = 0.80 ± 0.09 (p = 6e-5); HSF: mid = 0.45 ± 0.04 < super-ordinate = 0.54 ± 0.04 (p = 0.003) and < sub-ordinate = 0.77 ± 0.09 (p = 0.005)). Furthermore, the reaction time for the super-ordinate category is significantly shorter than for the sub-ordinate (intact: p = 0.001, LSF: p = 0.009, HSF: p = 0.01). In the reaction time of wrong answers, we see no similar trend. Considering the reaction time at each level of abstraction in both correct and wrong answers, no significant impact of SF is observed on the reaction times except for the super-ordinate category, where the reaction time in the LSF is significantly higher than that of intact and HSF.

To overcome the low within-category variance raised by the limited number of stimuli, we repeated the psychophysical study with a more diverse stimulus set. The new stimulus set consists of 10 categories in total, five animate (bird, mammal, reptile, insect, and fish) and five inanimate (fruit, chair, clock, car, and house) categories at the super-ordinate level. All 10 categories are utilized for mid-level categorization. In total, there exist 150 stimuli (50 in each SF level). Finally, the sub-ordinate level is the same as in the previous experiment with 10 new faces (See “Methods and Materials”). The average accuracy and reaction times of the participants are illustrated in Fig. 8. The average accuracy of participants supports our previous findings, where the categorization accuracy at the mid-level is significantly higher than at the super- and sub-ordinate levels (intact: mid = 0.95 ± 0.01 > super-ordinate = 0.80 ± 0.02 (p = 2 × 10⁻⁵) and > sub-ordinate = 0.87 ± 0.02 (p = 6 × 10⁻⁴), LSF: mid = 0.86 ± 0.02 > super-ordinate = 0.70 ± 0.01 (p = 6 × 10⁻⁵) and > sub-ordinate = 0.52 ± 0.02 (p = 3 × 10⁻⁵), HSF: mid = 0.94 ± 0.01 > super-ordinate = 0.80 ± 0.01 (p = 4 × 10⁻⁵) and > sub-ordinate = 0.68 ± 0.02 (p = 3 × 10⁻⁵)). At the super-ordinate level, there is no significant difference between intact and HSF (p = 0.70), while the accuracy in LSF is significantly lower than intact (p = 3 × 10⁻⁵) and HSF (p = 5 × 10⁻⁵). At the sub-ordinate level, both LSF (p = 2 × 10⁻⁵) and HSF (p = 8 × 10⁻⁵) conditions significantly degraded the categorization accuracy of participants compared to intact stimuli. Therefore, the new experiment provides support for the generalization of our findings. Unlike the categorization accuracy, the reaction times differ from the previous experiment, and we observed a significant impact of SF on the reaction time at the super- and sub-ordinate levels of abstraction (one-way ANOVA p value for super-ordinate: 0.01, mid-level: 0.05, and sub-ordinate: 0.02).
Furthermore, the reaction times are in line with the categorization accuracy, where higher accuracies have lower reaction times.

**Fig. 8: Psychophysical study with a diverse stimulus set.**

Discussion

In this paper, we studied the effect of SF on visual object category representation at three levels of abstraction by analyzing neural responses in the IT cortex and STS of macaque monkeys. To the best of our knowledge, this study is the first attempt to investigate the effect of SF on the hierarchical representation of categories in neuronal space. Our dataset contains mid-level categories (face and body), two super-ordinate categories (animate and inanimate), and a sub-ordinate level (i.e., the identity of faces) (Fig. 1). We found that the mid-level information (i.e., face vs. body) is present in both LSF and HSF. However, the presence of HSF is necessary for representing the super-ordinate level in IT neurons. In addition, the identity information (the sub-ordinate level) was absent in both LSF and HSF contents (Figs. 2, 3). Given the presence of mid-SF information in both HSF and LSF bands, our study suggests the potential importance of these frequency bands in mid-level categorization. However, it is essential to clarify that our work specifically focuses on the analysis of HSF and LSF content, and we do not directly examine mid-SF content in this study. Thus, the information of mid-level categories does not directly depend on HSF or LSF contents. In contrast, sub-ordinate level representation in the IT cortex required the full range of SF content, since any SF filtering degraded the identity information. Our results are consistent for the two monkeys (see Supplementary Fig. 2) and across recording locations within the IT and STS (see Supplementary Fig. 3). Finally, we verified our observations with two human psychophysics tasks, one with the same stimulus set and one with a larger number of stimuli. In the psychophysics tasks, we observed that the performance of the mid-level categorization is significantly higher than that of the super- and sub-ordinate levels.
The super-ordinate level categorization was only affected by the LSF filtering (removing HSF components), while both LSF and HSF filtering decreased the performance in the sub-ordinate level. Furthermore, employing the reaction times of the correct answers, we observed the mid-level temporal advantage in intact, LSF, and HSF in one task.

Object categorization and its neural correlates in the ventral visual pathway have been widely studied^22,24,31,33. Studies show the selective response of IT cells to specific categories at various levels of abstraction^23,24,34,35. However, the processing order of the various abstraction levels is still being debated. The response of IT neurons to human and monkey faces shows that monkey-human discrimination happens faster than discrimination of face identities^25,26. On the one hand, extensive psychophysical and single-neuron recording studies show a faster representation of mid-level categories than of super- or sub-ordinate levels⁴. On the other hand, this claim is challenged by several studies showing faster perception at the super-ordinate level^36,37,38,39.

SF can affect categorization performance. For example, Rotshtein et al.⁹ suggest that house-flower and face-house categorization are easier in HSF, while flower-face and gender categorizations are easier in LSF. Our results confirm the studies that emphasize the role of HSF contents in super-ordinate categorization^2,20. Nevertheless, Ashtiani et al. showed that low-frequency information is sufficient for the super-ordinate level²¹. This contradiction could be due to the utilized categories and the specific paradigm design, which relies on very fast presentation and a block-based experiment⁴. Furthermore, in object recognition tasks, specifically in animal detection in the work of Ashtiani et al.²¹, subjects could rely on different parts of objects for effective categorization, which could be the source of the contradiction. Our findings also show that the amount of information loss due to HSF filtering (removal of LSF components) is the same as that due to LSF filtering in face categorization at the mid-level; however, HSF filtering (removal of LSF components) of the body stimuli at the mid-level of abstraction preserves the amount of information compared to intact stimuli. This observation supports the evidence that suggests a special neural mechanism for face representation in the IT cortex²¹. The effect of SF filtering on face information is also compatible with psychophysical studies in which middle-frequency bands are more critical for face perception than LSF and HSF^40,41,42,43. According to these studies, LSF or HSF filtering degrades the amount of information in a face object, similar to our findings, where both LSF and HSF contents carry less information for the face category than the intact faces (Fig. 4).

Craddock et al. investigated the effect of SF on categorization using EEG recording²⁰. They used two levels of abstraction: (i) a gender classification task as a mid-level categorization and (ii) a living vs. non-living classification task as a super-ordinate categorization task. They found that HSF content removal impairs both mid- and super-ordinate-level categorizations. However, no significant interaction between task and SF has been observed^20,44. Unlike EEG studies, psychophysical studies show the impact of SF on categorization in various hierarchical levels. Our IT-spiking activity study also supports the psychophysical results. This discrepancy could have originated from differences in the recorded signals in EEG and extracellular techniques. EEG signals combine synaptic inputs, neuronal outputs, synchrony, and spatial alignment in neuronal population^45,46,47. More profound knowledge about the mapping between EEG and spiking activity is needed to understand this discrepancy.

There exist several confounding factors that our results are immune to. First, all stimuli (intact or filtered) were corrected in contrast and illumination so that the results cannot be attributed to basic stimulus characteristics (See “Methods and Materials”). Second, as we move from the super-ordinate to the mid- to the sub-ordinate level of abstraction, the within-category heterogeneity decreases. Therefore, the effect of SF on the hierarchical representation could not be due to the stimulus diversity or within-level dissimilarity of stimuli. Third, our observations could not be attributed to the number of stimuli per category, since the number of stimuli per involved category in each experiment was equalized by random sampling of stimuli without replacement (See “Materials and Methods”). Fourth, both supervised and unsupervised methods were used to confirm the observations. We employed SVM and SI as two supervised methods for investigating the amount of information in neural responses. As an unsupervised method, hierarchical clustering was used. Therefore, the observations could not be shaped by the specific characteristics of the data analysis method. Ruling out these confounds increases the reliability of our results about the effect of SF on the hierarchical representation of categories in the IT cortex.

The recording area is uniformly distributed across the IT cortex from posterior IT to anterior IT. It also includes the superior temporal sulcus. Our analysis is based on the visually responsive neurons across all stimuli (See “Materials and Methods”). Since we have face, body, and inanimate selective neurons (47 face, 82 body, 25 natural, and 36 man-made selective neurons, responses are averaged from 70 ms to 170 ms after stimulus onset), the recording area could include the face patches but is not limited to these patches. Since the exact location of face or body patches needs functional magnetic resonance imaging data, we cannot determine the exact recording locations relative to face or body patches.

To better understand the role of category-specific neural tuning in our findings, we analyzed the distribution of stimulus category preferences among individual neurons: 47 neurons preferred faces, 72 preferred bodies, 25 favored natural objects, and 24 favored man-made objects. This diverse distribution suggests that our population-level results are not solely driven by a preference for face stimuli but rather reflect a broader spectrum of category-specific tuning across neurons. We then focused on face- and body-selective neuron subpopulations to examine how SF filtering influences category encoding at different levels (see Supplementary Fig. 4). In face-selective neurons, we observed a small but significant decoding of animate-inanimate distinctions in the LSF condition, although this effect was weaker than in the intact and HSF conditions. Identity-level decoding in face-selective neurons was only evident in the intact condition, underscoring the necessity for both LSF and HSF information for fine-grained categorization. In contrast, body-selective neurons did not show significant identity decoding across any SF condition. Therefore, while mid-level distinctions (e.g., face vs. body) remained robust across all SF conditions, superordinate categorizations relied more heavily on HSF, and identity information was present only in face-selective neurons, requiring both LSF and HSF for effective encoding.

The number of stimuli per category-SF condition was small because SF and hierarchical representation were studied simultaneously with a limited number of stimuli. Therefore, we only investigated one category pair per abstraction level. A small stimulus set impacts both the super- and sub-ordinate levels of abstraction, and their generalization is not as powerful as that of the mid-level. However, as illustrated in Fig. 3 and the human psychophysics task with a larger number of stimuli, the generalization of super-, mid-, and sub-ordinate categorization is high enough for the analysis. More importantly, while the stimulus set does not support the full spectrum of abstraction levels, we are sure that the abstraction level of the mid-level categories (face vs. body) lies between the super- and sub-ordinate levels. Furthermore, the gender characteristics of the two identity classes are also different. However, if the discrimination is based on gender, it is still finer than the face vs. body categorization. From the SF point of view, only two SF bands exist, i.e., LSF and HSF, and the middle SF band was absent in our stimulus set. A more balanced stimulus set with three levels of SF filtering is needed to fully understand the effect of SF filtering and the importance of each SF band on the hierarchical representation of categories in the ventral visual pathway.

One important consideration in our study is the use of pixel-based filtering for SF manipulations. While this approach ensures uniformity across stimuli, it introduces certain limitations that must be acknowledged. Specifically, because our filtering operates in pixel units, the real-world size of an object in the image (e.g., a face versus a body) is not directly taken into account. This means that objects of different physical sizes are treated equally in terms of spatial frequency content, potentially leading to variations in the level of detail preserved across stimuli, especially when comparing objects at different levels of abstraction. For instance, an LSF image of a body might resemble what a monkey would see at a distance of 50 m, while a low-spatial-frequency image of a face might preserve more details, similar to viewing a face from a shorter distance. This discrepancy arises because the filtering is applied in pixel units, not real-world distances such as millimeters or degrees of visual angle. While this is a common practice in visual neuroscience studies, it may result in conclusions that are influenced by the scale of the photographs used, rather than solely by the spatial frequency properties of the objects themselves. We note that our power spectrum analysis indicates that the overall distribution of SF components remains consistent across object categories, regardless of their real-world size.

In light of our findings, the broader implications of this study extend beyond the specifics of SF effects on category representation in the IT cortex. Firstly, our research offers valuable insights into the fundamental mechanisms of visual perception and cognition. Understanding how different levels of shape detail (coarse vs. fine) are processed in the brain can significantly advance our comprehension of visual processing, which has implications for various fields such as cognitive neuroscience and psychology. Moreover, the distinct roles of HSF and LSF in category representation can inform the development of more brain-like models for visual recognition in artificial intelligence and machine learning, potentially leading to improvements in technologies like facial recognition systems and automated image categorization.

In summary, we investigated the effect of SF on hierarchical object representation in the macaque IT cortex and found that super-ordinate representation is highly dependent on the HSF band rather than the LSF band. On the other hand, sub-ordinate categories need all SF contents. These findings suggest that shape boundaries are enough for coarse categories to be represented in the IT cortex, while the IT cortex needs all fine and coarse shape information to represent finer categories. The dependence of categorization on SF provides a mechanism to use various SF bands in hierarchical category perception and behavior.

Methods

Animals and recordings

We analyzed the responses of neurons in the IT cortex and STS of two male macaque monkeys (10 and 11 kg and 11 and 12 years old). All experimental procedures followed the National Institutes of Health Guide for the Care and Use of Laboratory Animals and the Society for Neuroscience Guidelines and Policies. The Institute for Research in Fundamental Sciences committee approved the protocols for both monkeys’ experimental, surgical, and behavioral procedures. We have complied with all relevant ethical regulations for animal use. To place a recording chamber in a subsequent surgery, magnetic resonance imaging and CT scans were performed to identify the prelunate gyrus and arcuate sulcus. All surgical procedures were performed under strict aseptic conditions and isoflurane anesthesia. A custom-made stainless-steel chamber was implanted into each animal before behavioral training. Titanium screws and dental acrylics were used to attach the chamber to the skull. A craniotomy was performed for both monkeys within the 30 × 70 mm chamber (5 mm to 30 mm A/P and 0 mm to 23 mm M/L).

During the experiment, animals were seated in custom-made primate chairs, with their heads restrained and a tube delivering juice rewards inserted into their mouths. Eye position was monitored and stored at 2 kHz using an infrared optical eye tracking system (EyeLink 1000 Plus Eye Tracker, SR Research Ltd, Ottawa, CA). It was mounted in front of the monkey, and eye movements were captured using the EyeLink PM-910 Illuminator Module and EyeLink 1000 Plus Camera (SR Research Ltd, Ottawa, CA). Custom software written in MATLAB using the MonkeyLogic toolbox controlled stimulus presentation and juice delivery. We presented visual stimuli to the animal on a 24-in LED-lit monitor (Asus VG248QE: 1920 × 1080, 144 Hz) set at 65.5 cm from its eyes. The actual time the stimulus appeared on the monitor was recorded using a photodiode (OSRAM Opto Semiconductors, Sunnyvale, CA).

An electrode, securely attached to a recording chamber, was positioned within the craniotomy area using the Narishige two-axis platform, facilitating continuous adjustment of electrode positioning. To establish contact with, or minimally penetrate, the dura, a 28-gauge guide tube was introduced via a manual oil hydraulic micromanipulator from Narishige, Tokyo, Japan. For extracellular recording of neural activity in both monkeys, varnish-coated tungsten microelectrodes (FHC, Bowdoinham, ME) with an impedance between 0.2 and 1 MΩ (measured at 1 kHz) and a shank diameter ranging from 200 to 250 µm were inserted into the brain. Single-electrode recordings were conducted using a pre-amplifier and amplifier (Resana, Tehran, Iran), with filtering parameters set between 300 Hz and 5 kHz for spikes and 0.1 Hz and 9 kHz for local field potentials. Continuous data were digitized and stored at a sampling rate of 40 kHz for subsequent offline spike sorting and data analysis. Identification of IT was based on patterns of gray and white matter, its stereotaxic location, position relative to nearby sulci, and response properties of encountered units.

Stimulus set and task paradigm

The stimulus set consisted of 81 grayscale photographs of various objects in three SFs (HSF: 27, LSF: 27, intact: 27) centered on a gray background. The images used in the study had a size of 500 × 500 pixels (5° × 5°). Specifically, human faces were represented with a width ranging from 330 to 350 pixels (3.3°–3.5°) and a height of 500 pixels (5°). Human body images had a width ranging from 160 to 200 pixels (1.6°–2°) and a height of 500 pixels (5°). Animal body and face images were standardized to 500 × 500 pixels (5° × 5°), except for one animal face with a height of 300 pixels (3°). Additionally, other images in the set had a minimum size of 175 pixels (1.75°) in width and 200 pixels (2°) in height. Images were displayed at the center of a monitor and were scaled to fit in a 5° window.

There exist two identities (one male and one female) and three faces per identity that form the sub-ordinate categories. At the mid-level, there are nine face stimuli (six human faces from the two mentioned identities and three animal faces), six bodies (three human and three animal bodies), and six natural and six man-made stimuli. At the super-ordinate level, face and body categories form the animate category, and natural and man-made categories are combined to form the inanimate category. To present a stimulus set that could be reliably recorded from each neuron, we utilized an RSVP paradigm. Each recording session consists of 5 blocks. In each block, we show all stimuli in a pseudo-random order. The stimulus duration and interstimulus interval were 50 ms and 450 ms, respectively. The monkeys were required to maintain fixation within a window of 2° at the center of the screen. They were rewarded with juice every 1.5 to 2 s for maintaining fixation.

Human psychophysics

We designed two psychophysics tasks to evaluate our neurophysiological findings. The first task uses the same stimulus set as our neurophysiological experiment, except for the faces, to verify our findings in human perception. We replaced six face stimuli with six new faces of different identities and assigned a name to each of the six identities (three males and three females). We also created a mask version of each stimulus by scrambling the pixels of each stimulus. The responses from 21 human subjects (12 males and nine females) were collected. All participants signed a consent form at the beginning of the experiment, and the study protocol was approved by the Institute for Research in Fundamental Sciences committee. All ethical regulations relevant to human research participants were followed. The task consists of a training phase for identities, followed by the main phase. In the training phase, the participant observes each face stimulus in intact form for an arbitrary time to learn the name assigned to each face. The main phase starts right after training and consists of a forced two-choice (yes/no) categorization task. Each trial starts with the question for 500 ms, followed by the stimulus for 25 ms, then the mask presentation for 500 ms, 500 ms of a blank page, and finally, the answer options (yes/no) appear on the screen. The trial ends when the participant presses the right (for yes) or left (for no) key. The questions depend on the category. For natural and man-made categories, the question is whether the stimulus belongs to the animate or inanimate category (super-ordinate level). For the body, in addition to the super-ordinate level question, whether the stimulus belongs to the face or body category is also asked (mid-level). Finally, for face stimuli, in addition to the two aforementioned question types, we have a question about the identity of the face.
The second experiment uses the same protocol as the first experiment, with more diverse categories. The second experiment consisted of 10 categories organized into five animate (bird, mammal, reptile, insect, and fish) and five inanimate (fruit, chair, clock, car, and house) categories. In the super-ordinate and mid-level, there are 20 (20 in each SF content) stimuli uniformly distributed in categories. In the sub-ordinate, we included the faces of 10 individuals (five males and five females, different from the first task). The rest of the details are as in the first task.

Spatial frequency filtering of stimulus

Each stimulus has three versions of intact, HSF, and LSF regarding SF. A band-pass 2D Butterworth filter is designed to construct each stimulus by multiplying a high-pass Butterworth filter with a low-pass one. The high-pass and low-pass filters are constructed in the frequency domain using the following formulas.

$$\begin{array}{cc}{H}_{H}\left(u,v\right)=\frac{1}{1+{\left(\frac{\sqrt{{u}^{2}+{v}^{2}}}{{f}_{c}}\right)}^{2d}} & {H}_{L}\left(u,v\right)=\frac{1}{1+{\left(\frac{{f}_{c}}{\sqrt{{u}^{2}+{v}^{2}}}\right)}^{2d}}\end{array}$$

(1)

$$H={H}_{H}\times {H}_{L}$$

(2)

where H_H and H_L are high- and low-pass filters in the frequency domain, u and v are frequency indices, f_c is the cut-off frequency, d is the filter order, and H is the final band-pass filter employed to construct the stimulus set. For LSF images, the cut-off frequencies of the low-pass and high-pass filters are 1 and 10 cycles image⁻¹, respectively. To construct HSF images, cut-off frequencies of 18 and 75 cycles image⁻¹ are used for the low- and high-pass filters. To equalize the luminance value across the stimuli, each image’s average gray level of pixels is shifted to the middle of the range as follows.

$$I\left(i,j\right)=I\left(i,j\right)-\frac{1}{{N}_{I}}{\sum}_{i,j}I\left(i,j\right)+128$$

(3)

where I(i, j) is the gray level of the pixel located at the i’th row and the j’th column and N_I is the total number of pixels in the image. To equalize the contrast, all image pixels are standardized by the STD of all pixels in that image and multiplied by a fixed factor as follows.

$$I\left(i,j\right)=\frac{I(i,j)}{{\sigma }_{I}}\times 8$$

(4)

where σ_I is the STD of all pixels in the image.
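As an illustration, Eqs. 1 to 4 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code used to build the stimuli: the function names are hypothetical, the filter order d is not reported in the text (d = 2 below is only an example value), and the population standard deviation is assumed for σ_I. With Eq. 1 as written, H_H attenuates frequencies above its cut-off and H_L attenuates frequencies below its cut-off, so the product passes the band between the H_L cut-off and the H_H cut-off (1 to 10 cycles image⁻¹ for LSF, 18 to 75 cycles image⁻¹ for HSF).

```python
import math


def butterworth_gains(u, v, fc, d):
    """Return (H_H, H_L) from Eq. 1 at integer frequency index (u, v)."""
    radius = math.hypot(u, v)
    if radius == 0.0:
        # DC term: radius / fc is 0 and fc / radius diverges.
        return 1.0, 0.0
    h_h = 1.0 / (1.0 + (radius / fc) ** (2 * d))
    h_l = 1.0 / (1.0 + (fc / radius) ** (2 * d))
    return h_h, h_l


def bandpass_gain(u, v, fc_h, fc_l, d):
    """Eq. 2: product of H_H (cut-off fc_h) and H_L (cut-off fc_l)."""
    h_h, _ = butterworth_gains(u, v, fc_h, d)
    _, h_l = butterworth_gains(u, v, fc_l, d)
    return h_h * h_l


def equalize_luminance(image, target=128.0):
    """Eq. 3: shift the mean gray level of the image to `target`."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[p - mean + target for p in row] for row in image]


def equalize_contrast(image, factor=8.0):
    """Eq. 4: divide each pixel by the image STD and multiply by `factor`."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    # Population STD is an assumption; the text does not specify the estimator.
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    return [[p / std * factor for p in row] for row in image]


# LSF band with an assumed order d = 2: gain at 5 cycles per image.
lsf_gain_at_5 = bandpass_gain(3, 4, fc_h=10.0, fc_l=1.0, d=2)
```

In a full pipeline the gain would be evaluated on the 2D Fourier grid of each image and multiplied with its spectrum before the inverse transform; that step is omitted here to keep the sketch free of external dependencies.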

The temporal dynamic calculation for category information in the IT population

Responses of 379 neurons (261 Monkey 1 and 118 Monkey 2) were recorded as the monkeys viewed a rapid presentation of different natural and artificial visual stimuli. Then, the spiking activities were extracted employing the ROSS toolbox⁴⁸. At each time point, we represented each stimulus by a vector whose elements are the average firing rates of the recorded single neurons. For each time point and a given neuron, the average firing rate of that neuron in a 50 ms window around the time point is calculated. Therefore, each stimulus could be represented as a point in R^N space at each time point, where N is the number of recorded neurons. So, each stimulus was represented in the population of N neurons.

$${S}_{i}(t)=[{r}_{1}(t),{r}_{2}(t),\ldots ,{r}_{N}(t)]$$

(5)

where S_i(t) is the stimulus representation in time t and r_n(t) is the average response of the n’th neuron in a 50 ms window around t.
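The construction of Eq. 5 can be sketched as follows. This is a minimal sketch assuming spike times are given in milliseconds relative to stimulus onset for a single presentation; the function names are hypothetical, and the treatment of window edges and the averaging across repeated presentations are not specified in the text.

```python
def windowed_rate(spike_times_ms, t_ms, window_ms=50.0):
    """Firing rate (spikes/s) in a window of `window_ms` centred on t_ms."""
    half = window_ms / 2.0
    count = sum(1 for s in spike_times_ms if t_ms - half <= s < t_ms + half)
    return count / (window_ms / 1000.0)


def population_vector(spikes_per_neuron, t_ms, window_ms=50.0):
    """Eq. 5: S_i(t) = [r_1(t), ..., r_N(t)] for one stimulus."""
    return [windowed_rate(spikes, t_ms, window_ms) for spikes in spikes_per_neuron]


# Two hypothetical neurons, evaluated 100 ms after stimulus onset.
example = population_vector([[90.0, 100.0, 110.0, 200.0], [300.0]], t_ms=100.0)
```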

The advantages of population representation were studied in many theoretical and experimental works^49,50,51,52 where the signal correlation in the population of neural data increases coding performance for object discrimination. We normalized each neuron’s responses using the z-score procedure by subtracting the mean and dividing by the standard deviation across trials. Furthermore, we only included the responsive neurons in the analysis to achieve reliable results. A neuron is responsive if its firing rate from 70 to 170 ms after stimulus onset is significantly greater than that from 50 ms before to 50 ms after stimulus onset across all stimuli. We used a non-parametric two-tailed Wilcoxon signed-rank test with a significance level of 0.05 (false discovery rate corrected with Benjamini/Hochberg method⁵³) to find responsive neurons. We found 330 responsive neurons (45 were recorded from the STS) in total (223 Monkey 1 and 107 Monkey 2). Throughout the experiments, all population analyses are conducted on all 330 responsive neurons.
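The normalization and the multiple-comparison step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the per-neuron p values are taken as given (the Wilcoxon signed-rank test itself is not implemented here), the population standard deviation is assumed for the z-score, and the function names are hypothetical.

```python
import math


def zscore(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p value: True where the null is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p values for four neurons; True marks a responsive neuron.
responsive = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
```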

Low dimensional representation

We embedded the data into a low-dimensional space with linear and nonlinear approaches. Principal component analysis (PCA) was applied as a linear method to illustrate the separation of IT responses for different categories in a two-dimensional space. PCA utilizes the eigenvectors of the covariance matrix of the samples to transform the data from the high-dimensional to the lower-dimensional neural space. We calculated the principal components of the neural responses for an early and a late interval, at 80 ms to 110 ms and 155 ms to 185 ms after stimulus onset. Then, the first two components corresponding to the highest eigenvalues are considered a 2D representation of the high-dimensional neural responses. The PCA algorithm is applied to all stimuli simultaneously; thus, PC dimensions are the same for different category comparisons. The explained variance of the first two dimensions is 38%.

Category information using separability index and classification accuracy

The discrimination of two categories (e.g., face and body) according to the population responses is adapted from our previous work⁴. SI is defined based on the scatter matrix within and between categories samples. The ratio of the norm of the between-category and within-category scatter matrices was defined as SI. Here, we used the Frobenius norm for scatter matrices. Employing SI as a category separation measure in neuronal populations has several advantages. It could be used for high dimensional data (330 responsive neurons in this study). It takes both the variance and covariance of categories into account. It also could be employed for multi-class scenarios. This metric is computed for the high and low dimensional neural responses using a 50 ms sliding time window with a 1 ms stride. Then it is smoothed employing a Gaussian window with σ = 10 ms. An SVM classifier with a linear kernel⁵⁴ is trained using the same sliding windows as in SI for classification accuracy. Bootstrap sampling is used for the evaluation of the SVM classifier. Training samples are selected with replacement, and the remaining is considered as the test samples. The statistical analysis of both SI and SVM are carried out empirically, as stated in the next section. In both SI and SVM, when the number of classes is not equal, in each run, the samples of the larger classes are sampled randomly to equalize the number of samples per class.
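One possible reading of the SI definition is sketched below. This is a minimal sketch assuming the standard Fisher-style within- and between-category scatter matrices; the exact weighting used in the earlier work⁴ is not reproduced in the text, and all function names are hypothetical.

```python
import math


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def add_outer(acc, vec, weight=1.0):
    """Accumulate weight * vec vec^T into the square matrix `acc`."""
    for i, a in enumerate(vec):
        for j, b in enumerate(vec):
            acc[i][j] += weight * a * b


def frobenius(mat):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(val * val for row in mat for val in row))


def separability_index(classes):
    """classes: list of categories, each a list of population response vectors."""
    dim = len(classes[0][0])
    grand_mean = mean_vector([x for samples in classes for x in samples])
    s_within = [[0.0] * dim for _ in range(dim)]
    s_between = [[0.0] * dim for _ in range(dim)]
    for samples in classes:
        mu = mean_vector(samples)
        for x in samples:
            add_outer(s_within, [xi - mi for xi, mi in zip(x, mu)])
        shift = [mi - gi for mi, gi in zip(mu, grand_mean)]
        add_outer(s_between, shift, weight=len(samples))
    # SI: ratio of the Frobenius norms of the two scatter matrices.
    return frobenius(s_between) / frobenius(s_within)


# Two hypothetical categories in a two-neuron space.
si = separability_index([[[0.0, 0.0], [0.0, 2.0]], [[4.0, 0.0], [4.0, 2.0]]])
```

Moving the two category means further apart increases the index, while increasing the spread within each category decreases it.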

Statistics and reproducibility

Unless stated otherwise, all statistical analysis is based on the bootstrap method described here. To calculate standard deviation, confidence intervals, and p values, a bootstrapping process⁵⁵ is employed. The confidence intervals are calculated empirically using quantiles across the bootstrap runs. Therefore, the 95% confidence interval starts from 0.025 quantiles to 0.975 quantiles. All the calculations were repeated 1000 times on a random selection of stimuli with a bootstrap method. Then p values are calculated based on the confidence interval of a given index. The p value for a given index for a specific category is calculated as $\frac{r}{n}$, where r is the number of values lower (higher) than zero (or other value if stated), and n is the total number of runs. To compare an index for two categories, first, the index values are subtracted and again compared with zero, as mentioned before.
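The bootstrap statistics described above can be sketched as follows. This is a minimal sketch: the function names, the linear-interpolation quantile rule, and the fixed random seed are assumptions, and `index_fn` stands for any index (e.g., SI or classification accuracy) computed on a resampled stimulus list.

```python
import math
import random


def bootstrap_index(stimuli, index_fn, n_runs=1000, seed=0):
    """Recompute `index_fn` on `n_runs` resamples drawn with replacement."""
    rng = random.Random(seed)
    return [index_fn([rng.choice(stimuli) for _ in stimuli]) for _ in range(n_runs)]


def quantile(sorted_vals, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_vals) - 1)
    low = int(math.floor(pos))
    high = min(low + 1, len(sorted_vals) - 1)
    frac = pos - low
    return sorted_vals[low] * (1.0 - frac) + sorted_vals[high] * frac


def empirical_ci(values, lower_q=0.025, upper_q=0.975):
    """95% confidence interval from the 0.025 and 0.975 quantiles by default."""
    ordered = sorted(values)
    return quantile(ordered, lower_q), quantile(ordered, upper_q)


def bootstrap_p_value(values, reference=0.0, tail="lower"):
    """p = r / n, with r the number of runs below (or above) `reference`."""
    if tail == "lower":
        r = sum(1 for v in values if v < reference)
    else:
        r = sum(1 for v in values if v > reference)
    return r / len(values)
```

To compare an index between two categories, the per-run difference of the two indices would be passed to `bootstrap_p_value` with `reference=0.0`.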

Neuron response modulation index and SF modulation index

To measure the effect of SF on neuron responses and information, we defined the RMI and SMI, which compare the amount of information loss (neuron response change) caused by SF filtering. To calculate SMI (similarly RMI) in each LSF or HSF, the SI difference (similarly firing rate difference for RMI) between intact and LSF or HSF is measured and normalized to the sum of corresponding SI values. Mathematically, SMI_LSF and SMI_HSF are defined as follows (the same method is applied for the RMI in Fig. 2a).

$$\begin{array}{cc}{{{{\rm{SMI}}}}}_{{{{\rm{LSF}}}}}=\frac{{{{{\rm{SI}}}}}_{{{{\rm{Intact}}}}}-{{{{\rm{SI}}}}}_{{{{\rm{LSF}}}}}}{{{{{\rm{SI}}}}}_{{{{\rm{Intact}}}}}+{{{{\rm{SI}}}}}_{{{{\rm{LSF}}}}}} & {{{{\rm{SMI}}}}}_{{{{\rm{HSF}}}}}=\frac{{{{{\rm{SI}}}}}_{{{{\rm{Intact}}}}}-{{{{\rm{SI}}}}}_{{{{\rm{HSF}}}}}}{{{{{\rm{SI}}}}}_{{{{\rm{Intact}}}}}+{{{{\rm{SI}}}}}_{{{{\rm{HSF}}}}}}\end{array}$$

(6)

Higher values of SMI for LSF (HSF) given specific category information, e.g., face vs. body, show the information loss in the LSF (HSF) band and, equivalently, the importance of HSF (LSF) for discrimination between face and body. On the other hand, a lower absolute value of SMI shows that SF does not affect the information significantly.
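Eq. 6 can be evaluated with a one-line function. This is a small sketch with a hypothetical function name and made-up SI values, intended only to illustrate the range of the index.

```python
def modulation_index(intact, filtered):
    """Eq. 6: (intact - filtered) / (intact + filtered)."""
    return (intact - filtered) / (intact + filtered)


# Hypothetical SI values: filtering halves the information, so SMI = 1/3.
smi_loss = modulation_index(1.0, 0.5)
# No change after filtering gives SMI = 0.
smi_none = modulation_index(2.0, 2.0)
```

For non-negative inputs the index lies between -1 and 1, reaching 1 when all information (or response) is lost after filtering.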

Clustering

We applied the hierarchical clustering method in the early (80–110 ms after stimulus onset), late (155–185 ms after stimulus onset) phases, and in the interval of 70–170 ms after stimulus onset in each SF band. Here we employ agglomerative hierarchical clustering to compute the tree structure⁵⁶. The advantage of this method is that it is an unsupervised analysis with no prior assumption on data representations. The likeness score between the category and the tree is estimated by applying the average of the two ratios²⁴:

$$r_{1}=\frac{\text{number of category members under the node}}{\text{total members of the category}}, \qquad r_{2}=\frac{\text{number of category members under the node}}{\text{total stimuli under the node}}$$

The clustering score is the average of r₁ and r₂.
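
The two ratios can be illustrated with a small sketch. It assumes that the tree has already been computed and is given as nested tuples, and that the score of a category is taken at the best-matching node of the tree; the text does not state how the node is chosen, so the maximum over nodes is an assumption here, as are the function names and the toy tree.

```python
def leaves(node):
    """Return the set of stimulus labels under a nested-tuple tree node."""
    if not isinstance(node, tuple):
        return {node}
    out = set()
    for child in node:
        out |= leaves(child)
    return out


def all_nodes(node):
    """Yield every node of the tree, including the root and the leaves."""
    yield node
    if isinstance(node, tuple):
        for child in node:
            yield from all_nodes(child)


def node_score(category, node):
    """Average of r1 and r2 for one category and one node."""
    under = leaves(node)
    shared = len(category & under)
    r1 = shared / len(category)  # fraction of the category under the node
    r2 = shared / len(under)     # fraction of the node that is the category
    return (r1 + r2) / 2


def clustering_score(category, tree):
    """Assumption: score the category at its best-matching node."""
    return max(node_score(category, n) for n in all_nodes(tree))


# Hypothetical tree over five stimuli, for illustration only.
tree = ((("face1", "face2"), "body1"), ("car1", "car2"))
print(clustering_score({"face1", "face2"}, tree))
print(clustering_score({"face1", "car1"}, tree))
```

A category whose members fill exactly one node scores 1, while a category scattered across the tree scores lower.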

Representational dissimilarity matrices

The RDM is computed from all pairwise combinations of stimuli. The distance between the activation patterns of two stimuli is calculated as 1 − r, where r is the Pearson correlation coefficient⁵⁷. All RDMs are rank-normalized between 0 and 100.
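
A minimal sketch of this computation is given below. It assumes that each stimulus is described by a vector of population responses, that at least three stimuli are present, and that ties in dissimilarity are broken by order of appearance; the function names and the toy patterns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm(patterns):
    """Symmetric matrix of 1 - r over all stimulus pairs (zero diagonal)."""
    n = len(patterns)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = 1.0 - pearson(patterns[i], patterns[j])
    return d


def rank_normalize(d):
    """Replace each off-diagonal dissimilarity by its rank, scaled to 0-100."""
    n = len(d)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    order = sorted(pairs, key=lambda p: d[p[0]][p[1]])
    out = [[0.0] * n for _ in range(n)]
    for rank, (i, j) in enumerate(order):
        out[i][j] = out[j][i] = 100.0 * rank / (len(order) - 1)
    return out


# Hypothetical responses of four neurons to three stimuli.
patterns = [[1, 2, 3, 4], [1, 2, 4, 3], [4, 3, 2, 1]]
dissimilarity = rdm(patterns)
ranked = rank_normalize(dissimilarity)
print(ranked)
```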

Correlation with the model

The correlation of the empirical RDMs with three theoretical models was calculated at each time point: a stimulus animacy model, a model that separates face from body stimuli, and a model based on face identity⁵⁸. Each model predicts the relative dissimilarity of IT activation patterns for each stimulus pair⁵⁶. The correlation between a model and the empirical RDMs reveals the extent to which the “representational structure” defined by that model is present in the IT activation patterns⁵⁸. The empirical RDM is tested for its ability to explain the reference RDM for each category. We define the theoretical model RDM for each abstraction level (animate-inanimate, face-body, and identity) as a matrix in which between-category pairs are one and within-category pairs are zero. The correlation with a model is then determined by the correlation between the reference RDM and the theoretical model RDM for each category. It is computed for the RDMs at each time step within a 50 ms window³¹,⁵⁸.
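
The construction of a theoretical model RDM and its comparison with an empirical RDM can be sketched as follows. The text does not name the correlation measure, so the Pearson correlation over the upper triangle used here is an assumption, as are the function names, the category labels, and the toy empirical RDM.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def model_rdm(labels):
    """1 for between-category pairs, 0 for within-category pairs."""
    n = len(labels)
    return [[0.0 if labels[i] == labels[j] else 1.0 for j in range(n)]
            for i in range(n)]


def upper_triangle(m):
    """Off-diagonal entries above the diagonal, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def model_correlation(empirical, labels):
    """Assumption: Pearson correlation between the two upper triangles."""
    return pearson(upper_triangle(empirical),
                   upper_triangle(model_rdm(labels)))


# Hypothetical rank-normalized empirical RDM for four stimuli.
labels = ["face", "face", "body", "body"]
empirical = [[0, 10, 80, 90],
             [10, 0, 70, 100],
             [80, 70, 0, 20],
             [90, 100, 20, 0]]
print(model_correlation(empirical, labels))
```

In a time-resolved analysis, one would apply `model_correlation` to the empirical RDM of each time window in turn.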

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The data that support the findings of this study are available from the Institute for Research in Fundamental Sciences (IPM), but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. The data are, however, available from the corresponding author on reasonable request. Furthermore, the underlying numerical data for all bar graphs presented in the figures of this article are available in Supplementary Data 1, which contains the exact values used to generate the corresponding bar graphs.

Code availability

The code used in this study is available from the corresponding author upon reasonable request. Custom software is written in MATLAB (2015a) using the MonkeyLogic toolbox to control stimulus presentation and juice delivery for both monkeys. For spike sorting, we used ROSS open-source software (MATLAB version)⁴⁸, available at https://github.com/ramintoosi/ROSS. Additional analyses were conducted using custom scripts written in MATLAB R2021b and Python.

References

Tanaka, J. W. & Taylor, M. Object categories and expertise: Is the basic level in the eye of the beholder? Cogn. Psychol. 23, 457–482 (1991).
Collin, C. A. & McMullen, P. A. Subordinate-level categorization relies on high spatial frequencies to a greater degree than basic-level categorization. Percept. Psychophys. 67, 354–364 (2005).
Rogers, T. T. & Patterson, K. Object categorization: reversals and explanations of the basic-level advantage. J. Exp. Psychol. Gen. 136, 451 (2007).
Dehaqani, M.-R. A. et al. Temporal dynamics of visual category representation in the macaque inferior temporal cortex. J. Neurophysiol. 116, 587–601 (2016).
Schyns, P. G. & Oliva, A. From blobs to boundary edges: Evidence for time-and spatial-scale-dependent scene recognition. Psychol. Sci. 5, 195–200 (1994).
Macé, M., Joubert, O. & Fabre-Thorpe, M. Entry level at the superordinate level in visual categorization. In 9th International Conference on Cognitive and Neural Systems, Vol. 52, 2007–2018 (ICCNS, Boston, MA, 2005).
Kauffmann, L., Bourgin, J., Guyader, N. & Peyrin, C. The neural bases of the semantic interference of spatial frequency-based information in scenes. J. Cogn. Neurosci. 27, 2394–2405 (2015).
Kauffmann, L., Ramanoël, S. & Peyrin, C. The neural bases of spatial frequency processing during scene perception. Front. Integr. Neurosci. 8, 37 (2014).
Rotshtein, P., Schofield, A., Funes, M. J. & Humphreys, G. W. Effects of spatial frequency bands on perceptual decision: It is not the stimuli but the comparison. J. Vis. 10, 25–25 (2010).
He, C., Hung, S.-C. & Cheung, O. S. Roles of category, shape, and spatial frequency in shaping animal and tool selectivity in the occipitotemporal cortex. J. Neurosci. 40, 5644–5657 (2020).
Toosi, R. et al. The spatial frequency representation predicts category coding in the inferior temporal cortex. https://doi.org/10.7554/elife.93589 (2024).
Cheung, O. S. & Bar, M. The resilience of object predictions: early recognition across viewpoints and exemplars. Psychon. Bull. Rev. 21, 682–688 (2014).
Kihara, K. & Takeda, Y. Time course of the integration of spatial frequency-based information in natural scenes. Vis. Res. 50, 2158–2162 (2010).
Caplette, L., West, G., Gomot, M., Gosselin, F. & Wicker, B. Affective and contextual values modulate spatial frequency use in object recognition. Front. Psychol. 5, 512 (2014).
Fiorentini, A., Maffei, L. & Sandini, G. The role of high spatial frequencies in face perception. Perception 12, 195–201 (1983).
Hayes, T., Morrone, M. C. & Burr, D. C. Recognition of positive and negative bandpass-filtered images. Perception 15, 595–602 (1986).
Näsänen, R. Spatial frequency bandwidth used in the recognition of facial images. Vis. Res. 39, 3824–3833 (1999).
Goffaux, V. & Rossion, B. Faces are “spatial”–holistic face perception is supported by low spatial frequencies. J. Exp. Psychol. Hum. Percept. Perform. 32, 1023 (2006).
Collin, C. A., Therrien, M., Martin, C. & Rainville, S. Spatial frequency thresholds for face recognition when comparison faces are filtered and unfiltered. Percept. Psychophys. 68, 879–889 (2006).
Craddock, M., Martinovic, J. & Müller, M. M. Task and spatial frequency modulations of object processing: an EEG study. PLoS ONE 8, e70293 (2013).
Ashtiani, M. N., Kheradpisheh, S. R., Masquelier, T. & Ganjtabesh, M. Object categorization in finer levels relies more on higher spatial frequencies and takes longer. Front. Psychol. 8, 1261 (2017).
Bruce, C., Desimone, R. & Gross, C. G. Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. J. Neurophysiol. 46, 369–384 (1981).
Fujita, I., Tanaka, K., Ito, M. & Cheng, K. Columns for visual features of objects in monkey inferotemporal cortex. Nature 360, 343–346 (1992).
Kiani, R., Esteky, H., Mirpour, K. & Tanaka, K. Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. J. Neurophysiol. 97, 4296–4309 (2007).
Matsumoto, N., Okada, M., Sugase-Miyamoto, Y., Yamane, S. & Kawano, K. Population dynamics of face-responsive neurons in the inferior temporal cortex. Cereb. Cortex 15, 1103–1112 (2005).
Sugase, Y., Yamane, S., Ueno, S. & Kawano, K. Global and fine information coded by single neurons in the temporal visual cortex. Nature 400, 869–873 (1999).
Edwards, R., Xiao, D., Keysers, C., Foldiak, P. & Perrett, D. Color sensitivity of cells responsive to complex stimuli in the temporal cortex. J. Neurophysiol. 90, 1245–1256 (2003).
Földiák, P., Xiao, D., Keysers, C., Edwards, R. & Perrett, D. I. Rapid serial visual presentation for the determination of neural selectivity in area STSa. Prog. Brain Res. 144, 107–116 (2004).
Keysers, C., Xiao, D.-K., Földiák, P. & Perrett, D. I. The speed of sight. J. Cogn. Neurosci. 13, 90–101 (2001).
Chao, L. L., Haxby, J. V. & Martin, A. Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nat. Neurosci. 2, 913–919 (1999).
Kriegeskorte, N. et al. Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126–1141 (2008).
Martin, A., Wiggs, C. L., Ungerleider, L. G. & Haxby, J. V. Neural correlates of category-specific knowledge. Nature 379, 649–652 (1996).
Tanaka, K. Columns for complex visual object features in the inferotemporal cortex: clustering of cells with similar but slightly different stimulus selectivities. Cereb. Cortex 13, 90–99 (2003).
Desimone, R., Albright, T. D., Gross, C. G. & Bruce, C. Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci. 4, 2051–2062 (1984).
Thorpe, S., Fize, D. & Marlot, C. Speed of processing in the human visual system. Nature 381, 520–522 (1996).
Fabre-Thorpe, M., Delorme, A., Marlot, C. & Thorpe, S. A limit to the speed of processing in ultra-rapid visual categorization of novel natural scenes. J. Cogn. Neurosci. 13, 171–180 (2001).
Macé, M. J.-M., Joubert, O. R., Nespoulous, J.-L. & Fabre-Thorpe, M. The time-course of visual categorizations: you spot the animal faster than the bird. PLoS ONE 4, e5927 (2009).
Poncet, M. & Fabre-Thorpe, M. Stimulus duration and diversity do not reverse the advantage for superordinate-level representations: the animal is seen before the bird. Eur. J. Neurosci. 39, 1508–1516 (2014).
Wu, C.-T., Crouzet, S. M., Thorpe, S. J. & Fabre-Thorpe, M. At 120 msec you can spot the animal but you don’t yet know it’s a dog. J. Cogn. Neurosci. 27, 141–149 (2015).
Boutet, I., Collin, C. & Faubert, J. Configural face encoding and spatial frequency information. Percept. Psychophys. 65, 1078–1093 (2003).
Willenbockel, V. et al. Does face inversion change spatial frequency tuning? J. Exp. Psychol. Hum. Percept. Perform. 36, 122 (2010).
Tardif, J. et al. Culture shapes spatial frequency tuning for face identification. J. Exp. Psychol. Hum. Percept. Perform. 43, 294 (2017).
Estéphan, A. et al. Time course of cultural differences in spatial frequency use for face identification. Sci. Rep. 8, 1–14 (2018).
Craddock, M., Martinovic, J. & Müller, M. M. Early and late effects of objecthood and spatial frequency on event-related potentials and gamma band activity. BMC Neurosci. 16, 1–12 (2015).
Ikeda, H., Leyba, L., Bartolo, A., Wang, Y. & Okada, Y. C. Synchronized spikes of thalamocortical axonal terminals and cortical neurons are detectable outside the pig brain with MEG. J. Neurophysiol. 87, 626–630 (2002).
Murakami, S. & Okada, Y. Contributions of principal neocortical neurons to magnetoencephalography and electroencephalography signals. J. Physiol. 575, 925–936 (2006).
Okada, Y. C., Wu, J. & Kyuhou, S. Genesis of MEG signals in a mammalian CNS structure. Electroencephalogr. Clin. Neurophysiol. 103, 474–485 (1997).
Toosi, R., Akhaee, M. A. & Dehaqani, M.-R. A. An automatic spike sorting algorithm based on adaptive spike detection and a mixture of skew-t distributions. Sci. Rep. 11, https://doi.org/10.1038/s41598-021-93088-w (2021).
Averbeck, B. B., Latham, P. E. & Pouget, A. Neural correlations, population coding and computation. Nat. Rev. Neurosci. 7, 358–366 (2006).
Adibi, M., McDonald, J. S., Clifford, C. W. & Arabzadeh, E. Population decoding in rat barrel cortex: optimizing the linear readout of correlated population responses. PLoS Comput. Biol. 10, e1003415 (2014).
Abbott, L. F. & Dayan, P. The effect of correlated variability on the accuracy of a population code. Neural Comput. 11, 91–101 (1999).
Dehaqani, M.-R. A., Vahabie, A.-H., Parsa, M., Noudoost, B. & Soltani, A. Selective changes in noise correlations contribute to an enhanced representation of saccadic targets in prefrontal neuronal ensembles. Cereb. Cortex 28, 3046–3063 (2018).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodol.) 57, 289–300 (1995).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Efron, B. & Tibshirani, R. J. An Introduction to the Bootstrap (CRC Press, 1994).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Walther, A. et al. Reliability of dissimilarity measures for multi-voxel pattern analysis. Neuroimage 137, 188–200 (2016).
Grootswagers, T., Robinson, A. K. & Carlson, T. A. The representational dynamics of visual objects in rapid serial visual processing streams. NeuroImage 188, 668–679 (2019).

Acknowledgements

This work was supported, in part, by a Silk Road Agreement grant from the Iranian National Science Foundation (INSF), grant number 97002234.

Author information

These authors contributed equally: Esmaeil Farhang, Ramin Toosi.

Authors and Affiliations

School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Esmaeil Farhang, Ramin Toosi, Narges Kheirkhah, Abdol-Hossein Vahabie & Mohammad-Reza A. Dehaqani
School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
Ramin Toosi, Behnam Karami, Roxana Koushki, Jalaledin Noroozi, Ehsan Rezayat, Abdol-Hossein Vahabie & Mohammad-Reza A. Dehaqani
National Institutes of Health (NIH), Bethesda, MD, USA
Behnam Karami
Donders Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
Behnam Karami
Department of Brain and Cognitive Sciences, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
Farideh Shakerian & Mohammad-Reza A. Dehaqani
Department of Cognitive Sciences, Faculty of Psychology and Education, University of Tehran, Tehran, Iran
Ehsan Rezayat & Abdol-Hossein Vahabie

Contributions

E.F. (Esmaeil Farhang), R.T. (Ramin Toosi), B.K. (Behnam Karami), N.Kh. (Narges Kheirkhah), and J.N. (Jalaledin Noroozi) contributed to the development of the methodology. Data collection was carried out by J.N., B.K., R.K. (Roxana Koushki), N.Kh., and E.R. (Ehsan Rezayat). Data analysis was performed by R.T. and E.F. Software development was conducted by R.T., E.F., E.R., J.N., and N.Kh. The original draft of the manuscript was written by R.T. and E.F. Manuscript review and editing were performed by M.-R.A.D. (Mohammad-Reza A. Dehaqani), E.F., R.T., F.S. (Farideh Shakerian), and J.N. Conceptualization of the project was led by F.S., E.R., A.-H.V. (Abdol-Hossein Vahabie), and M.-R.A.D., who also supervised the overall project.

Corresponding author

Correspondence to Mohammad-Reza A. Dehaqani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Ryan Morrill and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Jacqueline Gottlieb and Joao Valente.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Materials

Supplementary Data 1

Reporting Summary

LaTeX Supplementary File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Farhang, E., Toosi, R., Karami, B. et al. The impact of spatial frequency on hierarchical category representation in macaque temporal cortex. Commun Biol 8, 801 (2025). https://doi.org/10.1038/s42003-025-08230-5

Received: 16 May 2023
Accepted: 14 May 2025
Published: 25 May 2025
DOI: https://doi.org/10.1038/s42003-025-08230-5