Human and mouse brains differ in a number of aspects, the most obvious being their size. The human brain is characterized by its folded cerebral cortex, whose area is estimated1,2,3 to be 1500–2000 cm². Moreover, the human cerebral cortex is typically 1.5–3.5 mm thick3,4,5 and consists of 10–20 billion neurons6,7,8. Hence, the mean areal density of neurons in the human cerebral cortex is estimated to be approximately 10⁷ neurons/cm². In contrast, mice have small brains with a smooth cerebral cortex, typically 1 mm thick9,10 and approximately 2 cm² in area. The mean areal density of neurons in the mouse cerebral cortex11 is reported to be 9.3 × 10⁶ neurons/cm². This value nearly coincides with that of the human brain, even though the cortical thickness differs by a factor of 2 to 3 between the two species.
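The areal-density figures quoted above follow from simple arithmetic on the cited neuron counts and cortical areas; as a quick check (values taken from the text):

```python
def areal_density(n_neurons, area_cm2):
    """Mean areal density of cortical neurons (neurons per cm^2)."""
    return n_neurons / area_cm2

# Human cortex: 10-20 billion neurons over 1500-2000 cm^2.
human_low = areal_density(10e9, 2000)    # lower bound, 5.0e6 /cm^2
human_high = areal_density(20e9, 1500)   # upper bound, ~1.3e7 /cm^2

# The reported mouse value falls inside this range, even though
# the mouse cortex is 2-3 times thinner than the human cortex.
mouse_reported = 9.3e6
```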

Since the brain is composed of cells, these quantitative and spatial differences are ascribable to the cellular organization of the brain and should consequently result in different connectivity patterns in the neuronal network. Microscopic comparison of the human and mouse neuronal networks can therefore provide clues about how to artificially reproduce their functional differences. Comparative studies on the microscopic or cellular constitution of human and mouse brains have been reported from a wide variety of perspectives and have revealed similarities and differences between the two species12,13,14,15,16,17,18. However, it is not fully understood how those cellular characteristics translate into functional differences between the human and mouse brain.

We have recently reported nanometer-scale three-dimensional studies of human cerebral tissues of the anterior cingulate cortex and the superior temporal gyrus of schizophrenia and control cases19,20,21. The results indicated that (1) the neurites of schizophrenia cases are significantly thinner and more tortuous than those of controls and that (2) the neurite curvature in the anterior cingulate cortex correlates with the auditory hallucination score21. Since hallucinations have been reported to be associated with the macroscopic structure and connectivity of the anterior cingulate cortex22,23,24, microstructural differences of neurons observed in this area should be relevant to mental function. This suggests that the structural alteration of neurons in schizophrenia may influence information processing in biological systems.

Findings from studies on real neuronal networks have been used in designing artificial neural networks25. The most successful example is the convolutional layer26,27, which was inspired by studies on the visual cortex of mammalian brains28. In particular, the efficiency of image recognition was remarkably improved by incorporating the architecture of the visual cortex into the AI26,27. However, functions specific to humans cannot be reproduced from analyses of commonalities between humans and other animals. Neuronal networks in brain areas other than the sensorimotor cortex should be investigated in order to find the network characteristics responsible for higher brain functions. We recently reported an implementation of the findings on the anterior cingulate cortex of schizophrenia cases in an image-recognition AI29. The resultant schizophrenia-mimicking AI outperformed the standard AI under certain conditions, indicating that the neuronal alteration may contribute functionally to brain performance. This finding suggests a possible linkage between neuronal characteristics and their computational roles.

In this study, we analyzed the three-dimensional structures of brain tissues of the mouse medial prefrontal cortex by using synchrotron radiation X-ray nanotomography (nano-CT). Three-dimensional images of brain tissues of nine mice were traced in order to build Cartesian-coordinate models of their neuronal networks. The obtained models were used to evaluate structural parameters, including the shape of neuronal somata and the spatial trajectory of neurites, which were then compared with the results of the human anterior cingulate cortex19,21. The observed structural characteristics of mouse neurons were then translated into connection constraints and implemented in generative AIs in order to investigate their computational consequences. We examined the results in light of the network characteristics as well as the statistical nature of the datasets used for training the generative AIs.

Structures of mouse brain tissues

Brain tissue structures of layer V of the medial prefrontal cortex of nine mice (M1–M9) were visualized with nano-CT (Supplementary Table 1), in the same manner as previously reported for the corresponding layer of the anterior cingulate cortex of eight human control cases19,21. Tomographic slices were reconstructed from the obtained X-ray images and stacked to reproduce three-dimensional structures (Fig. 1a, Supplementary Fig. 1). Then, the three-dimensional images of neuronal networks were traced by placing spherical nodes along the network to build their Cartesian-coordinate models (Fig. 1b), as previously reported for the human cases (Fig. 1c, d)19,20,21. The traced structures of the nine mice consisted of 109 neurons, 5,665 neuronal processes, and 65,541 dendritic spines in total (Supplementary Tables 2–4). The tissue structures of the eight human control cases consisted of 150 neurons, 4,597 neuronal processes, and 18,559 spines19,21. Differences between the human and mouse structures are discernible in Fig. 1, such as in the shape of neuronal somata and the spatial trajectory of neurites, indicating that human and mouse cerebral tissues can be distinguished simply by visualizing their structures.

Fig. 1
figure 1

Three-dimensional structures of cerebral tissues of mouse and human. Brain tissues were stained with the Golgi method and visualized with nano-CT. Scale bars: 10 μm. (a) Rendering of the mouse M8A structure (Supplementary Table 4) of the medial prefrontal cortex. Voxel values 40–800 were rendered in a perspective view by using the scatter HQ algorithm of the VG Studio software (Volume Graphics, Germany). The pial surface is toward the top. (b) Cartesian coordinate model of M8A drawn in a parallel view. Structures in the M8A image were traced in order to reconstruct them in Cartesian coordinate space by using the MCTrace software50. Tissue constituents are color-coded. Nodes composing the model are drawn as octagons. The red box indicates the neurite shown in Fig. 3a. (c) Rendering of the human N8A structure of the anterior cingulate cortex. Reproduced from ref. 21. (d) Cartesian coordinate model of N8A. The red box indicates the neurite shown in Fig. 3b.

Differences between human and mouse

The human and mouse brain tissues (Fig. 1) show differences in neuronal soma shape. The somata of mouse neurons are nearly spherical, while those of humans are vertically elongated and triangular. Another difference is in the neurite structures: mouse neurites are tortuous and thin, while human neurites are rather straight and thick (Fig. 1).

Figure 2 summarizes differences in soma shape. The most significant difference is in the neuronal soma length, which was defined as the length along spherical nodes having a diameter larger than half the diameter of the cell soma (Fig. 2a). This measure corresponds to the longitudinal size of the neuronal soma. The mean soma length for mouse is less than 60% of that for human (Fig. 2b; p = 1.10 × 10⁻⁶, two-sided Welch's t-test, Holm–Bonferroni corrected, same hereafter unless otherwise noted). The mouse soma width, which was defined as the diameter of the largest spherical node in the neuron, is approximately 85% of the human soma width (Fig. 2b; p = 0.025). The shape difference is discernible in three-dimensional renderings of the neuronal somata (Fig. 2e). The mouse soma is small and spherical, and hence compatible with the volumetrically confined cortex of the mouse brain. The human soma is long and wide, consistent with the thick and extensive cortex of the human brain.

The soma length difference was further examined by classifying neurons into pyramidal neurons and interneurons. The difference was highly significant for pyramidal neurons (Fig. 2c; p = 6.3 × 10⁻⁸). In contrast, the difference in interneuron soma length was insignificant (Fig. 2d), though interneurons are less abundant and were identified in only five mouse and three human cases. These results indicate that the structural difference between human and mouse somata observed in this study is ascribable to pyramidal neurons.

Fig. 2
figure 2

Differences in soma between human and mouse. (a) The soma width was defined as the diameter of the largest spherical node. The soma length was defined as the length along nodes having a diameter larger than half the soma width (red arrow). (b) Scatter plot of mean soma length and mean soma width. Mouse data are indicated with diamonds and human data with triangles. (c) Pyramidal neurons showed a significant difference in mean soma length (***p = 6.3 × 10⁻⁸, two-sided Welch's t-test, Holm–Bonferroni corrected). (d) Interneurons showed no significant difference in mean soma length. (e) Renderings of pyramidal somata. Mouse individuals (M1–M9) and human cases (N1–N8) are indicated with labels. Human renderings are reproduced from refs. 19 and 21. Image width × height: 25 μm × 40 μm.

Structural differences in the three-dimensional neuronal network were also examined according to the method reported for the human cases19,20,21. Typical neurite structures are shown in Fig. 3a and b. Major differences appear in the thickness and tortuosity of the neurites. Tortuosity can be represented with the curvature, which is the reciprocal of the radius of curvature of the spatial trajectory. The mean neurite curvature for mice is over 1.8 times as large as that for humans (Fig. 3c; p = 0.0040). The neurite thickness of mice is less than 60% of that of humans (Fig. 3d; p = 0.0111). These results indicate that mouse neurites are significantly thinner and more tortuous than human neurites. The neurite thickness and curvature showed a reciprocal correlation (Fig. 3e), which was also observed for human neurons19,21.
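Curvature here is the reciprocal of the radius of the local spatial trajectory. One way to estimate it from a node-based model is from triplets of consecutive nodes, as the reciprocal of the circumradius of the triangle they form; the following is an illustrative estimator, not necessarily the exact procedure used in the cited analyses:

```python
import numpy as np

def node_curvature(p1, p2, p3):
    """Curvature (1/R) of the circle through three 3D node positions,
    where R is the circumradius of triangle p1-p2-p3: R = abc / (4K)."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    a = np.linalg.norm(p2 - p3)
    b = np.linalg.norm(p1 - p3)
    c = np.linalg.norm(p1 - p2)
    area = 0.5 * np.linalg.norm(np.cross(p2 - p1, p3 - p1))  # triangle area K
    if area == 0.0:
        return 0.0  # collinear nodes: a straight segment has zero curvature
    return 4.0 * area / (a * b * c)
```

Applied along a traced neurite, the mean of such values gives a per-neurite curvature comparable in spirit to the parameter plotted in Fig. 3c.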

The differences between human and mouse dendritic spines were rather moderate (Fig. 3f–i). The spine curvature for mice was approximately 25% larger than that of the human controls (Fig. 3f; p = 0.0063), but no significant differences were found for the mean spine diameter (Fig. 3g) or length (Fig. 3h). The mean spine density of mice was 2.6 times as high as that of humans (Fig. 3i; p = 0.0070), though the staining efficiency of the Golgi method used in this study should have affected the density estimates.

Fig. 3
figure 3

Differences in neurite and spine. (a) Mouse neurite indicated with the red box in Fig. 1b. Traced structures (green) are superposed on the observed image (gray). The image is contoured at three times the standard deviation (3 σ) of the voxel intensity with a grid spacing of 96.2 nm. (b) Human neurite indicated with the red box in Fig. 1d. The image is contoured at 3 σ with a grid of 97.4 nm. (c) Difference in neurite curvature (**p = 0.0040, two-sided Welch’s t-test, Holm–Bonferroni corrected, same hereafter). (d) Neurite thickness radius (*p = 0.0111). (e) Scatter plot of neurite curvature and thickness illustrates their reciprocal correlation. Mouse data are indicated with diamonds and human data with triangles. (f) Spine curvature (**p = 0.0063). (g) Spine diameter and (h) spine length showed no significant difference. (i) Spine density (**p = 0.0070).

Implementation in generative AIs

Major structural differences between human and mouse were found in (1) the shape of the pyramidal soma and in (2) the thickness and curvature of the neurites. These differences should be related to each other, and this issue is addressed in the discussion section below. The thin and tortuous neurites in the mouse brain should suppress connections between distant neurons according to cable theory30.
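The cable-theory argument can be made concrete through the passive length constant λ = √(d·Rm/(4·Ri)), which shrinks with neurite diameter d, so thinner neurites attenuate steady-state signals over shorter distances. A sketch with illustrative membrane parameters (the values of Rm and Ri are assumptions, not measurements from this study):

```python
import math

def length_constant_um(d_um, r_m=20000.0, r_i=150.0):
    """Passive cable length constant in micrometers.

    d_um: neurite diameter (um).
    r_m:  specific membrane resistance (ohm * cm^2), illustrative value.
    r_i:  axial resistivity (ohm * cm), illustrative value.
    lambda = sqrt(d * r_m / (4 * r_i)), with d converted to cm.
    """
    d_cm = d_um * 1e-4
    return math.sqrt(d_cm * r_m / (4.0 * r_i)) * 1e4  # cm -> um

# Halving the diameter shrinks lambda by a factor of sqrt(2), so a
# signal traveling a fixed distance in a thinner neurite decays more.
```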

We incorporated these findings into the generator of a deep convolutional generative adversarial network (DC-GAN)31 and examined its performance in image generation tasks. The structural characteristics of mouse neurons were implemented with reference to the schizophrenia-mimicking layer, in which inter-node connections are suppressed depending on the distance between nodes29. In this study, the inter-node distance was defined on the basis of a two-dimensional arrangement of nodes (Fig. 4a) to reproduce the laminar organization of neurons in the cerebral cortex. Connections outside the window cone in Fig. 4a were kept at zero in the weight matrix in order to confine connections within the cone.
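The windowing can be sketched as a fixed binary mask multiplied into the weight matrix: nodes of adjacent layers are laid out on a plane, and a weight survives only when the two-dimensional distance between its input and output nodes falls within the window radius. A minimal illustration, assuming both layers share the same plane coordinates (the published implementation may differ in detail):

```python
import numpy as np

def window_mask(h, w, radius):
    """0/1 mask of shape (h*w, h*w) for a weight matrix between two
    layers whose nodes are arranged on the same h x w plane.
    Entries whose inter-node 2D distance exceeds radius are zero, so
    the corresponding connections stay suppressed throughout training."""
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    return (dist <= radius).astype(float)

# Multiplying this mask into the weight matrix at every forward pass
# keeps out-of-window connections at zero; the fraction of nonzero
# entries corresponds to the parameter %usage.
```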

This mouse-mimetic convolutional layer was used as the hidden layers of the DC-GAN generator (Supplementary Table 5), while other configurations of the model were kept the same as in the original report31. The reduction in the number of weights in the generator is summarized in Supplementary Table 6. The GAN discriminator is composed of four standard convolutional layers and a fully-connected top layer (Supplementary Table 7). GANs consisting of these generators and discriminators were trained from scratch by using photo datasets of cat faces (Animal Faces HQ)32, cheese (Cheese Pics), human faces (CelebA)33, birds (Birds 525 Species), and automobiles (60,000+ Images of Cars). After the training, each GAN was used to generate fake photos, which were scored with the Fréchet inception distance (FID)34.
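The FID34 fits multivariate Gaussians to Inception-v3 features of real and generated images and reports the Fréchet distance between them: ||μ1 − μ2||² + Tr(Σ1 + Σ2 − 2(Σ1Σ2)^1/2). A compact sketch of the distance itself, operating on precomputed feature means and covariances (the matrix square root is taken via the equivalent symmetric form so that plain eigendecompositions suffice):

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2).
    Uses Tr((s1 s2)^1/2) = Tr((s1^1/2 s2 s1^1/2)^1/2); the inner matrix
    is symmetric PSD, so an ordinary eigendecomposition can root it."""
    def sqrtm_psd(a):
        w, v = np.linalg.eigh(a)
        return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

    s1 = np.asarray(sigma1, dtype=float)
    s2 = np.asarray(sigma2, dtype=float)
    s1h = sqrtm_psd(s1)
    covmean = sqrtm_psd(s1h @ s2 @ s1h)
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    return float(diff @ diff + np.trace(s1 + s2 - 2.0 * covmean))
```

In actual FID evaluation, the means and covariances come from Inception features of many thousands of images; identical distributions give a score of zero, and lower is better.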

The obtained results indicated that the mouse-mimetic GAN outperformed the standard GAN in the image generation task using the cat faces and cheese datasets, but underperformed for human faces and birds (Fig. 4b, c, Supplementary Fig. 2, Supplementary Table 8). Examples of cat face photos generated without mouse-mimetic layers showed slight mode collapse (Supplementary Fig. 3). Indeed, the FID scores for cat faces and cheese decreased as the parameter usage in the mouse-mimetic layers was reduced (Fig. 4b). Linear regression of the FID scores against the parameter usage ratio showed positive slopes (β = 0.121, p = 0.026 for cat faces between 20 and 100%; β = 0.38, p = 1.81 × 10⁻⁶ for cheese between 35 and 100%, Holm–Bonferroni corrected). These results indicate that the mouse-mimetic GAN outperformed the standard GAN on the cat faces and cheese datasets. The best FID score for cat faces was obtained when only 35% of the weights were used in the mouse-mimetic layers. Whereas the standard GAN failed to converge in seven out of ten training runs using the cat faces dataset (Supplementary Table 8), the mouse-mimetic GAN with a parameter usage of 35% or less failed to converge in only one out of 20 training runs. The time course of the FID scores (Supplementary Fig. 4) indicated that the incorporation of the mouse-mimetic layers stabilized the GAN. The best FID score for automobiles was also obtained at 35% parameter usage, similar to the results for the cat faces and cheese datasets (Supplementary Fig. 2). We also implemented mouse-mimetic layers in the discriminator (Supplementary Table 7) and trained it with the cat faces dataset, though all ten training runs failed to converge (Supplementary Table 8), indicating that the mouse-mimetic design is effective only in the generator. Contrary to the results for the cat faces, cheese, and automobiles datasets, the FID scores for human faces and birds increased as the parameter usage was reduced (Fig. 4c; linear regression between 35 and 100%: β = −0.164, p = 3.4 × 10⁻⁹ for human faces; β = −0.21, p = 9.2 × 10⁻⁶ for birds, Holm–Bonferroni corrected). These results indicate that the mouse-mimetic GAN underperformed the standard GAN in generating images of human faces and birds.
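The β values above are ordinary least-squares slopes of the FID score regressed on the parameter usage ratio (a positive slope means the score worsens as usage grows, i.e., the reduced network performs better). The slope itself reduces to a one-line computation; the accompanying p-values additionally require a t-test on the slope, which is omitted here. The numbers below are hypothetical, for illustration only:

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope (beta) of y regressed on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

# Hypothetical illustration: FID rising with %usage gives beta > 0.
usage = [20, 40, 60, 80, 100]
fid = [30.0, 31.5, 33.2, 35.1, 36.8]
beta = ols_slope(usage, fid)
```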

Fig. 4
figure 4

Image generation using mouse-mimetic AIs. (a) Schematic representation of the convolutional layer. The standard convolutional layer is fully connected along the channel dimensions, while the mouse-mimetic layer is partially connected. Nodes in the mouse-mimetic layer are arranged in a two-dimensional manner to reproduce the laminar organization of neurons. An example of the connection window is shown with a cone. The degree of connection is defined by the radius of the cone base circle, which is expressed as a fraction of the total width of the node plane. The base radius was calculated from the %usage of weight parameters. (b) Fréchet inception distance (FID) scores34 of cat faces (circles) and cheese (triangles) photos generated using a generative adversarial network (GAN), plotted against the percent usage of weights in the mouse-mimetic layers and against the fractional radius of the window. 100% usage of parameters corresponds to the standard network. The training and evaluation were repeated for ten runs for each %usage. A total of 12 runs using the cat faces dataset did not converge and had FID scores greater than 200 (Supplementary Table 8). Lines indicate the mean FID scores of the converged runs. Red symbols indicate the best FID scores. (c) FID scores of human faces (circles) and bird (triangles) photos generated using the GAN. All runs converged, and their FID scores are plotted. Lines indicate mean FID scores. Red symbols indicate the best FID scores. (d) FID scores of photos generated using denoising diffusion implicit models (DDIMs). The parameter usage was set to 44% in the mouse-mimetic DDIM (marked 'M') and 100% in the control standard DDIM (marked 'C'). Labels indicate the datasets used for training (AFHQ cat: cat faces; CelebA: human faces; Birds 525: birds; Cheese Pics: cheese). Differences in the FID scores between the mouse-mimetic and standard DDIMs were examined using a two-sided Welch's t-test and corrected with the Holm–Bonferroni method (AFHQ cat: ***p = 0.000131; Birds 525 Species: ***p = 0.00089; Cheese Pics: ***p = 0.00088). (e) Examples of cat face and cheese images generated from the best runs of the mouse-mimetic and standard DDIMs. (f) Statistics of the datasets used for training. Values are normalized to those of the CelebA human faces dataset. Circles indicate statistics for the Animal Faces HQ cat dataset, closed triangles those for Cheese Pics, closed diamonds those for 60,000+ Images of Cars, and open triangles those for Birds 525 Species. Color channels are color-coded.

Next, we examined a conditional GAN35 model by using a merged dataset composed of the cat face, human face, bird, and cheese datasets. Approximately 5,000 images from each dataset were extracted by sharding and used for training the conditional GAN, whose generator and discriminator were the same as those in Supplementary Tables 5 and 7, except for the addition of image class inputs. The parameter usage was set to 34.2% in the mouse-mimetic layers. The FID scores of the photos generated for the human-face and bird image classes increased as a result of incorporating mouse-mimetic layers (Supplementary Fig. 5; Supplementary Table 9; human face: p = 1.10 × 10⁻⁵; bird: p = 0.0114), indicating that the mouse-mimetic GAN underperformed the standard GAN. In contrast, the FID scores for cat faces and cheese showed larger variances and no significant difference between the mouse-mimetic and standard GANs (Supplementary Fig. 5; Supplementary Table 9). The best FID score for cheese was obtained using the mouse-mimetic conditional GAN.

We also implemented mouse-mimetic convolutional layers in the U-Net36 of a denoising diffusion implicit model (DDIM)37 and trained the network on the cat face, human face, bird, and cheese datasets individually. The mouse-mimetic DDIM, in which only 44% of the weights of the U-Net were used (Supplementary Table 10), outperformed the control standard DDIM in the image generation task using the cat faces and cheese datasets (cat faces: p = 0.000131; cheese: p = 0.00088; Fig. 4d, Supplementary Table 11). In contrast, the mouse-mimetic DDIM underperformed the standard DDIM for birds (p = 0.00089) and showed no significant difference for human faces (Fig. 4d, Supplementary Table 11). Examples of the generated images are shown in Fig. 4e and Supplementary Fig. 6. The cat photos generated by the standard DDIM appear to be dominated by tabby cats, while those of the mouse-mimetic DDIM are rather diverse. The mouse-mimetic DDIM was slower to converge in terms of the FID score, but was more stable and less prone to overfitting than the standard DDIM (Supplementary Fig. 7). These results indicate that (1) the number of parameters of the U-Net in the DDIM can be reduced to less than half that of the original U-Net by introducing mouse-mimetic layers and that (2) this mouse-mimetic parameter reduction can improve DDIM performance under certain conditions.

Since the results of the GAN and the DDIM are similar regardless of the network architecture and the image generation algorithm, the performance differences should be due to differences between the datasets. We examined the statistics of the five datasets and found differences between their image entropy distributions (Fig. 4f; Supplementary Table 12). Though the datasets are nearly the same in terms of mean image entropy, the standard deviation of the image entropy differs between them. The frequency distributions of the image entropy of the Animal Faces HQ cat faces and the 60,000+ Images of Cars datasets showed sharp peaks (Supplementary Fig. 8a, e), indicating that the images of these datasets have similar intensity histograms. In contrast, the Cheese Pics dataset showed a tail toward the low-entropy side (Supplementary Fig. 8b), indicating that the dataset contains images of uniform intensity, such as those of cheese blocks. The CelebA human faces and Birds 525 Species datasets showed intermediate profiles (Supplementary Fig. 8c, d). The results of AKAZE local feature matching38 indicated that the cheese dataset differs from the other datasets in terms of the standard deviation of the matching distance (Fig. 4f). The number of detected feature points also differed between the datasets, though the cheese and bird datasets showed similar values (Fig. 4f). Another difference between the datasets was in the number of images constituting each dataset (Supplementary Table 12). We prepared a partial dataset consisting of 5199 images from the CelebA dataset and trained the above unconditional GAN on it. The obtained FID scores did not show a positive slope, and the best FID was obtained by using the standard GAN (Supplementary Fig. 12), as in the case of the complete CelebA dataset. These results indicate that the dataset size itself does not determine the relationship between the FID score and the parameter usage ratio.
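Image entropy as used here is the Shannon entropy of an image's intensity histogram: images dominated by uniform regions (e.g., cheese blocks) concentrate their histogram in a few bins and score low. A minimal per-channel sketch (the 256-bin choice for 8-bit intensities is an assumption about the binning):

```python
import numpy as np

def image_entropy_bits(channel, bins=256):
    """Shannon entropy (bits) of the intensity histogram of one
    image channel with 8-bit pixel values in [0, 256)."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

# A perfectly uniform image has zero entropy; an image whose pixels
# spread evenly over all 256 intensities approaches 8 bits.
```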

Discussion

The results of this study indicated that the neurons of humans and mice differ in a number of structural parameters, including soma length (Fig. 2) and neurite thickness (Fig. 3d). A difference in neuronal soma size between humans and mice has also been reported for hippocampal CA1 neurons14. The soma downsizing and neurite thinning allow neurons to be integrated within the limited thickness and area of the mouse cerebral cortex and should therefore result from adaptation to the limited volume of the mouse brain. In contrast to the mouse brain, the human brain occupies a large volume and bears a thick and folded cerebral cortex1,2,3, which can accommodate long and thick neurons.

Since the mean stem-dendrite diameter is proportional to the neuronal soma diameter39, the differences in soma size and neurite diameter should be related to each other. The soma size also correlates with the extent of the dendritic arbor40. These results, along with ours, suggest that a neuron with a large soma has thick dendrites innervating widely, while a neuron with a small soma has thin dendrites confined to the vicinity of the soma. It has been reported that a neuregulin-4 knock-out mouse showed a reduced soma size of pyramidal neurons in the motor cortex and exhibited defects in motor performance41. A mouse model of Rett syndrome exhibited reduced dendritic arborization42 and a reduced soma size43. Therefore, the soma downsizing and neurite thinning in the mouse brain should correlate with each other and affect the performance of the neuronal network.

Differences between the dendritic spines were observed in their curvature and density (Fig. 3f–i). Since the spine curvature reciprocally correlates with the spine thickness20 and the spine thickness correlates with the neurite thickness20, the tortuous spines in the mouse brain should affect the neuronal network concomitantly with the neurite thinning. The mouse spine density determined with our method was 2.6 times as high as that of humans (Fig. 3i). Though the staining efficiency of the Golgi impregnation used in our studies should have affected the density estimation, a similar difference in synapse density between humans and mice was reported in a dense connectomic study17. The two- to three-fold higher spine density in mice can compensate for the mouse cortex being 2 to 3 times thinner than the human cortex, allowing neurons to form a comparable number of spines per unit cortical area. We suggest that the dendritic spines of humans and mice are regulated to keep their areal density constant.
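The compensation argument is arithmetic: the number of spines available per unit cortical area scales as the spine density times the cortical thickness (treating the reported density ratio as volumetric, which is an assumption of this sketch). A quick check with the ratios from the text:

```python
def areal_spine_ratio(density_ratio, thickness_ratio):
    """Mouse-to-human ratio of spines per unit cortical area,
    approximated as (spine density ratio) x (cortical thickness ratio)."""
    return density_ratio * thickness_ratio

# 2.6-fold spine density against a 2- to 3-fold thinner cortex:
low = areal_spine_ratio(2.6, 1.0 / 3.0)   # ~0.87
high = areal_spine_ratio(2.6, 1.0 / 2.0)  # 1.3
# Both ratios are close to 1, i.e., a comparable areal spine count.
```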

This study compared the three-dimensional structures of neurons between the medial prefrontal cortex of mice and the anterior cingulate cortex of humans. The comparison revealed multiple differences, as described above. It has been reported that the human anterior cingulate cortex subserves emotional and cognitive functions44,45. Moreover, neurons in the mouse medial prefrontal cortex receive long-range inputs from brain regions involved in cognition, motivation, and emotion46. Therefore, the neuronal differences observed in this study should be relevant to cognitive and emotional brain functions in humans and mice, including functions for finding food and fleeing from predators.

The structural comparison of human and mouse neurons suggested that neuronal connections are spatially confined in the mouse brain network. We incorporated this finding into the DC-GAN and the DDIM by masking the weight matrix depending on the two-dimensional distance between nodes. The resultant FID scores for human faces and birds were worsened by the parameter reduction for both the GAN and the DDIM (Fig. 4c, d); the constraints in the weight matrix should have degraded the performance of these generative AIs. In contrast, the FID scores for cat faces and cheese were improved by the parameter reduction (Fig. 4b, d); the sparse weight matrix of the mouse-mimetic layer can eliminate extra degrees of freedom and thus may have improved the performance. These results suggest that the number of parameters appropriate for generating cat face and cheese photos is lower than that for human faces and birds. The datasets used in this study showed differences in image entropy, which can affect the number of parameters required for the image generation task. We suggest that the statistics of the input dataset, including the image entropy, determine the number of degrees of freedom suited for generating images from that dataset.

The results for the mouse-mimetic networks indicated that the number of parameters can be adjusted by using mouse-mimetic layers without changing the network architecture or dimensions. Since the shape of the weight matrix of the mouse-mimetic layer is the same as that of the original layer, any convolutional layer can be replaced with its mouse-mimetic version. Although the parameter %usage needs to be optimized to obtain the best result, we suggest 30–50% as a first choice for examining whether the mouse-mimetic layer works in the target application. Since it has been reported that the brain-wide circuitry associated with higher intelligence is organized in a sparse manner47, parameter reduction should also be a strategy for improving performance in biological systems. The results of this study indicated that the number of parameters used for the image generation task can be halved by introducing mouse-mimetic layers. Although the computation time for training the mouse-mimetic AI is the same as that of its standard version in the present implementation, the computational cost could be reduced by incorporating the sparse architecture of the mouse-mimetic layer into the library code of the neural network calculation.

The ages of the mice used in this study were in the young adult range, while the human cases ranged in age from their early 40s to their 70s21. This age difference can affect the structure of the neuronal network, though a significant correlation between age and a structural parameter was observed only for the standard deviation of the neurite curvature in the human study21. We suggest that age-related effects on human neuronal structure are rather limited compared with the differences between human and mouse. Another limitation of this study is the difference in the tissue fixation method. The mouse tissues were fixed by perfusion fixation, while the human tissues were fixed by immersion in a fixative solution. It has been reported that perfusion fixation results in equal or greater subjective histological quality compared with immersion fixation, though their quantitative difference is unclear48. A study on monkey brains49 indicated that the soma size was smaller in immersion-fixed tissue than in perfusion-fixed tissue. Therefore, the soma size difference between human and mouse may be larger than the difference observed in our structural analysis.

In this study, we analyzed three-dimensional structures of mouse brain tissues and compared them with those of human brain tissues. The neurons in the mouse brain showed correspondingly downsized somata and thinner neurites, allowing the neurons to be integrated in the limited space of the mouse brain. This finding was applied to generative AIs to examine its computational consequences. The mouse-mimetic AIs outperformed the conventional AIs in image generation tasks using the cat faces and cheese datasets. The preferences of the mouse-mimetic AIs coincided with the impressions commonly associated with mice, though the biological implication of this remains to be clarified. The structure of mouse neurons has presumably adapted to the environment that mice live in, optimizing their brain functions for survival. We suggest that the relationship between neuronal structure and brain function be further investigated by implementing other biological findings in artificial neural networks.

Methods

Human cerebral data

Human data were obtained in our previous studies19,21. Post-mortem human cerebral tissues were collected with written informed consent from the legal next of kin using protocols approved by the Clinical Study Reviewing Board of Tokai University School of Medicine (application no. 07R-018). The human studies were conducted according to the Declaration of Helsinki. The method used for the structural analysis of human tissues19,21 is the same as the method applied to the mouse samples described below.

Mouse cerebral tissue

Nine male C57BL/6J mice were housed in a conventional environment with ad libitum feeding under a 12 h/12 h light–dark cycle. All mouse experiments were performed with approval from the Institutional Animal Care and Use Committee of the Tokyo Metropolitan Institute of Medical Science (protocol code: 18028) in accordance with relevant guidelines and regulations. The mice, at 16 weeks of age, were deeply anesthetized with an overdose of sodium pentobarbital and subjected to perfusion fixation using phosphate-buffered saline containing 4% formaldehyde. Brains were dissected and stained by Golgi impregnation. The left medial prefrontal cortices were taken from the stained tissues and embedded in epoxy resin. The staining and embedding procedures were performed as reported previously19. This study is reported in accordance with the ARRIVE guidelines.

Synchrotron radiation X-ray microtomography and nanotomography

The resin-embedded samples were first subjected to simple projection microtomography at the BL20XU beamline51 of the SPring-8 synchrotron radiation facility to visualize the structure of the entire tissue. The data collection conditions are summarized in Supplementary Table 1. Absorption-contrast X-ray images were collected using a visible-light conversion type X-ray imaging detector consisting of a phosphor screen, optical lenses, and a complementary metal-oxide-semiconductor (CMOS) camera (ORCA-Flash4.0, Hamamatsu Photonics, Japan), as reported previously19. Three-dimensional structures of the samples were reconstructed with the convolution back-projection method in order to determine layer positions. An example of the resulting structure is shown in Supplementary Fig. 10.

Tissue volumes corresponding to layer V were then visualized with synchrotron radiation X-ray nanotomography (nano-CT) equipped with Fresnel zone plate optics52. The nano-CT experiments were performed as reported previously19,20,21 at the BL37XU53 and BL47XU54 beamlines of SPring-8, and at the 32-ID beamline55,56 of the Advanced Photon Source of Argonne National Laboratory. The data collection conditions are summarized in Supplementary Table 1. The photon flux at the sample position in the BL47XU experiment was determined to be 1.0 × 10¹⁵ photons/mm²/s by using Al2O3:C dosimeters (Nagase-Landauer, Japan). Photon fluxes at the BL37XU and 32-ID beamlines were reported previously21. Spatial resolutions were estimated from the Fourier domain plot57 or by using three-dimensional test patterns58.

Tissue structure analysis

The obtained datasets were processed with the convolution back-projection method in the RecView software59 (RRID: SCR_016531; https://mizutanilab.github.io/) to reconstruct three-dimensional images (Fig. 1a), as reported previously19,21. The image reconstruction was conducted by RS and KS. The obtained image datasets were shuffled with human datasets from our previous study21 and provided to RM without any data attributes. RM built Cartesian coordinate models of the tissue structures from the three-dimensional images using the MCTrace software50 (RRID: SCR_016532; https://mizutanilab.github.io/). The models were built by tracing structures in the three-dimensional images, as reported previously19,20,21. Each model was built automatically and then edited manually. The resultant models consisted of consecutive spherical nodes with diameters corresponding to the widths of the traced structures. After the model building of each dataset was finished, the coordinate files of the structural models were locked down. RM then reported the number of neurite segments to RS, who aggregated the numbers in order to equalize the amount of analysis across mouse individuals and between human and mouse. Datasets to be analyzed further were determined by RS and provided to RM without any data attributes or aggregation results.

These model-building procedures were repeated for three batches of nano-CT datasets. The first two batches consisted of eight mouse datasets (M1A, M1B, M1C, M2A, M2B, M3A, M3B, and M4A) and 12 human datasets from our previous study21. Information about these datasets was disclosed to RM after the model building of these two batches was finished, in order to report the human study21. The third batch consisted of 13 mouse datasets, two dummy mouse datasets unrelated to this study, and eight human datasets; the dummy mouse and human datasets were included in order to shuffle the datasets. After the model building of the entire batch was finished, the dataset collection date was first disclosed to RM to correct the voxel width, which had been tentatively defined. After the voxel width was corrected, the coordinate files were locked down and all attributes of the datasets were disclosed so that each dataset could be assigned to a mouse individual.

The human datasets and the two dummy mouse datasets unrelated to this study were excluded from the subsequent analysis. The remaining 21 datasets were subjected to structural analysis. Structural parameters were calculated from the Cartesian coordinate models by using the MCTrace software, as reported previously19,21. The soma width was defined as the diameter of the largest spherical node composing the neuron. The soma length was defined as the length along spherical nodes having diameters larger than half the soma width (Fig. 2a). Statistics of the obtained structural parameters are summarized in Supplementary Tables 2–4.
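The two soma definitions above can be illustrated with a short sketch (a hypothetical illustration, not the MCTrace implementation; nodes are assumed to be (x, y, z, diameter) tuples of the traced model):

```python
import math

def soma_width(nodes):
    # Soma width: diameter of the largest spherical node in the model.
    return max(d for (_, _, _, d) in nodes)

def soma_length(nodes):
    # Soma length: path length along consecutive nodes whose diameter
    # exceeds half the soma width.
    half = soma_width(nodes) / 2.0
    big = [n for n in nodes if n[3] > half]
    return sum(math.dist(a[:3], b[:3]) for a, b in zip(big, big[1:]))
```

For example, a chain of nodes with diameters (1.0, 4.0, 3.0, 0.5) has a soma width of 4.0, and only the two nodes wider than 2.0 contribute to the soma length.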

Photo image datasets

The photo image datasets used for training the generative AIs were taken from hyperlinks on the Datasets page of the kaggle.com website (https://www.kaggle.com/datasets). The CelebA dataset33 was used as the dataset of human face photos. A partial CelebA dataset consisting of the first 5199 images in numerical filename order was also prepared to examine the effect of the number of images per dataset. The train-cat folder of the Animal Faces HQ dataset32 was used as the dataset of cat face photos. The Birds 525 Species dataset was used as the dataset of bird photos. The Cheese Pics dataset was used as the source of cheese photos; since it contains photos of humans, buildings, packages, and cooked foods, we examined the contents of all 1824 folders of this dataset and chose 165 folders (Supplementary Table 13) that mainly contain cheese photos. The 60,000 + Images of Cars dataset was used as the automobile photo dataset. All images of these datasets were scaled to 64 × 64 pixels, with pixel values scaled to the 0–1 range, prior to training.
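A preprocessing step of this kind can be sketched as follows (a minimal sketch assuming a center-crop followed by nearest-neighbour downsampling; the actual resizing method used in our pipeline is not specified here):

```python
import numpy as np

def preprocess(img_u8, size=64):
    # Center-crop the uint8 image to a square, downsample to size x size
    # by nearest-neighbour index sampling, and scale values into 0-1.
    h, w = img_u8.shape[:2]
    s = min(h, w)
    top, left = (h - s) // 2, (w - s) // 2
    crop = img_u8[top:top + s, left:left + s]
    idx = np.arange(size) * s // size  # nearest-neighbour source rows/cols
    small = crop[idx][:, idx]
    return small.astype(np.float32) / 255.0
```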

The CelebA dataset is widely used in image generation studies (e.g., Refs.34,37) and can be regarded as a control dataset against the Animal Faces HQ cat face dataset. The Birds 525 Species dataset consists of photos of the entire bodies of birds; an image of an entire body differs from that of a face, regardless of animal species. Hence, the Birds dataset was chosen as an example of natural objects other than faces. Cheese is artificially prepared and differs in appearance from natural objects such as faces and birds. Hence, the Cheese Pics dataset was chosen as representative of artificial objects. The 60,000 + Images of Cars dataset was chosen as a further example of artificial objects for the unconditional GAN calculation.

The analysis of the dataset statistics and the AKAZE local feature matching38 were conducted using the OpenCV-4.10.0 library (https://opencv.org/releases/). The AKAZE feature matching was performed using the first 2000 images of each dataset. Prior to feature point detection, the 64 × 64 pixel images were converted to grayscale and resized to 256 × 256 pixels.

Generative adversarial network

The structural analysis of the mouse cerebral tissue indicated that mouse neurites are thin and tortuous compared with those of humans. This suggests that neuronal connections are constrained by the distance between neurons. We incorporated this finding into an artificial neural network by masking the weights with a window matrix, which is multiplied with the weight matrix in an element-by-element manner29. In this study, the inter-node distance was defined by assuming a two-dimensional arrangement of nodes (node position x, y) in order to mimic the laminar organization of neurons in the cerebral cortex. The inter-node distance was defined along the channel dimensions (Fig. 4a). Fractional coordinates (x / total nodes along the x dimension, y / total nodes along the y dimension) were used to calculate the Euclidean distance between nodes. Elements of the window matrix were set to 1 if the fractional distance between a node pair was less than a predefined threshold, and to 0 if the distance was equal to or larger than the threshold.
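The window construction described above can be sketched as follows (a simplified sketch assuming a flattened nx-by-ny node grid; the function name and node ordering are ours, not from the published code):

```python
import numpy as np

def window_matrix(nx, ny, threshold):
    # Fractional coordinates (x / nx, y / ny) of every node on the grid,
    # flattened to shape (nx * ny, 2).
    fx = np.arange(nx) / nx
    fy = np.arange(ny) / ny
    coords = np.stack(np.meshgrid(fx, fy, indexing="ij"),
                      axis=-1).reshape(-1, 2)
    # Pairwise Euclidean distances between all node pairs.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    # 1 where the pair is closer than the threshold, 0 otherwise.
    return (dist < threshold).astype(np.float32)

# The window is multiplied element-by-element with the weight matrix,
# zeroing connections between distant node pairs:
#   masked_w = w * window_matrix(nx, ny, threshold)
```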

The mouse-mimetic convolutional layers with the two-dimensional window were implemented in the generator of the DC-GAN31 (Supplementary Table 5). The discriminator of the GAN was composed of four standard convolutional layers and a fully connected top layer (Supplementary Table 7). Batch normalization60 was applied to the hidden layers of the generator, and spectral normalization61 to the hidden layers of the discriminator. The ReLU activation function was used in the hidden layers of the generator, and leaky ReLU in those of the discriminator.

The conditional GAN35 was trained using the same configuration as the unconditional GAN described above, except for the addition of image class inputs. The CelebA, Birds 525 Species, and Cheese Pics datasets were sharded in order to extract approximately 5,000 images for each image class. The extracted images of these three image classes, along with the entire Animal Faces HQ cat face dataset, were merged and shuffled to compose a class-labeled dataset used for training the conditional GAN model.

These GAN models were trained from scratch using the Adam algorithm62 with a batch size of 32, a learning rate of 1 × 10⁻⁴, and a β1 of 0.5 for both the generator and the discriminator. The discriminator was trained once per cycle. The number of training epochs was set to 25 for the CelebA dataset, corresponding to approximately 160,000 training steps. The number of training epochs for the other datasets was determined so as to set the total number of steps to approximately 80,000. The Fréchet inception distance (FID)34 was calculated from 51,200 images by using the code provided at https://github.com/jleinonen/keras-fid. The training and evaluation were repeated for 10 runs. Examples of the FID score progress during training are shown in Supplementary Fig. 4. The calculations were performed using Tensorflow 2.16.1 and Keras 3.3.3 running on a g4dn.xlarge instance (NVIDIA T4 Tensor Core GPU with an Intel Xeon Cascade Lake P-8259CL processor operated at 2.5 GHz) of Amazon Web Services. Training for 80,000 steps typically took 3 h. Python codes are available from our GitHub repository (https://mizutanilab.github.io).
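The epoch counts follow from the dataset sizes: CelebA contains 202,599 images, so 25 epochs at a batch size of 32 amount to roughly 158,000 steps, i.e. approximately 160,000. A minimal sketch of this arithmetic (the helper name is ours, not from the published code):

```python
def epochs_for_steps(dataset_size, batch_size, target_steps):
    # Number of whole epochs giving approximately target_steps
    # optimizer steps, at one step per batch.
    steps_per_epoch = dataset_size // batch_size
    return max(1, round(target_steps / steps_per_epoch))

# CelebA (202,599 images) at batch size 32:
# epochs_for_steps(202599, 32, 160000) gives 25 epochs.
```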

Denoising diffusion implicit model

The mouse-mimetic convolutional layers were implemented in the U-Net36 of the DDIM37. For the CelebA dataset, the U-Net was downsized to [64, 128, 256, 512] channels from the [128, 256, 256, 256, 512] channels of the original implementation37 to halve the computation time required for training. The mouse-mimetic parameter reduction was applied to the convolutional layers of the residual blocks with channel dimensions of 256 and 512. The parameter usage (%) is summarized in Supplementary Table 10. Attention layers were introduced at the 16 × 16 feature map resolution. Group normalization63 and the swish activation function64 were used in accordance with the GitHub repository of the original DDIM report37.

The DDIM was trained from scratch using the Adam algorithm62 with a batch size of 64, a learning rate of 2 × 10⁻⁴, and a β1 of 0.9. The number of training epochs was determined so as to set the total number of steps to approximately 500,000. The FID was calculated from 51,200 images generated by 20 diffusion steps. The training and evaluation were repeated for 5 runs using the same environment as for the GAN. Examples of the FID score progress during training are shown in Supplementary Fig. 7. The DDIM training for 500,000 steps typically took about 70 h. Python codes are available from our GitHub repository (https://mizutanilab.github.io).

Statistical tests

The statistical tests of the structural parameters and FID scores were performed using the R 4.4.0 software, as reported previously19,20,21,29. Significance was defined as p < 0.05. Differences in the means of the structural parameters between human and mouse were examined using a two-sided Welch's t-test. The relationship between the mean FID score and the mean parameter usage ratio was examined by linear regression analysis. Differences in the means of the FID scores were also examined using a two-sided Welch's t-test. The p-values of multiple tests were corrected with the Holm–Bonferroni method.
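The Holm–Bonferroni step-down correction can be illustrated with a short sketch (written in Python for consistency with our other codes, although the actual analysis was run in R):

```python
def holm_bonferroni(pvals, alpha=0.05):
    # Holm-Bonferroni step-down procedure: sort p-values in ascending
    # order and compare the i-th smallest against alpha / (m - i),
    # stopping at the first failure.
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    significant = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            significant[i] = True
        else:
            break  # all larger p-values also fail
    return significant
```

For example, with p-values (0.01, 0.04, 0.03) at alpha = 0.05, only the smallest (0.01 ≤ 0.05/3) remains significant after correction.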