Fig. 4
From: Structural differences between human and mouse neurons and their implementation in generative AIs

Image generation using mouse-mimetic AIs. (a) Schematic representation of the convolutional layer. The standard convolutional layer is fully connected along the channel dimensions, while the mouse-mimetic layer is only partially connected. Nodes in the mouse-mimetic layer are arranged in a two-dimensional plane to reproduce the laminar organization of neurons. An example of the connection window is shown as a cone. The degree of connection is defined by the radius of the cone's base circle, expressed as a fraction of the total width of the node plane. The base radius was calculated from the %usage of weight parameters. (b) Fréchet inception distance (FID) scores (ref. 34) of cat face (circles) and cheese (triangles) photos generated using a generative adversarial network (GAN), plotted against the percent usage of weights in the mouse-mimetic layers and against the fractional radius of the window. A parameter usage of 100% corresponds to the standard network. Training and evaluation were repeated for ten runs at each %usage. A total of 12 runs using the cat faces dataset did not converge and had FID scores greater than 200 (Supplementary Table 8). Lines indicate the mean FID scores of the converged runs. Red symbols indicate the best FID scores. (c) FID scores of human face (circles) and bird (triangles) photos generated using the GAN. All runs converged, and their FID scores are plotted. Lines indicate mean FID scores. Red symbols indicate the best FID scores. (d) FID scores of photos generated using denoising diffusion implicit models (DDIMs). The parameter usage was set to 44% in the mouse-mimetic DDIM (marked ‘M’) and 100% in the control standard DDIM (marked ‘C’). Labels indicate the datasets used for training (AFHQ cat: cat faces; CelebA: human faces; Birds 525: birds; Cheese Pics: cheese). Differences in FID scores between the mouse-mimetic and standard DDIMs were examined using a two-sided Welch’s t-test and corrected with the Holm–Bonferroni method (AFHQ cat: ***p = 0.000131; Birds 525 Species: ***p = 0.00089; Cheese Pics: ***p = 0.00088). (e) Examples of cat face and cheese images generated from the best runs of the mouse-mimetic and standard DDIMs. (f) Statistics of the datasets used for training. Values are normalized to those of the CelebA human faces dataset. Circles indicate statistics for the Animal Faces HQ cat dataset, closed triangles those for Cheese Pics, closed diamonds those for 60,000+ Images of Cars, and open triangles those for Birds 525 Species. Color channels are color-coded.
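
The partially connected layer in panel (a) can be sketched as a masked convolution: channels are laid out on a two-dimensional plane, and each output channel connects only to input channels whose plane positions fall inside a circular window whose radius is a fraction of the plane width (the base of the cone in the schematic). The code below is a minimal PyTorch illustration of this idea, not the authors' implementation; the class name MouseMimeticConv2d, the square grid layout, and the weight-masking strategy are assumptions introduced here.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_grid(n_channels: int) -> torch.Tensor:
    # Place n_channels on a near-square 2D grid; return (n, 2) coordinates
    # normalized to [0, 1] along each axis.
    side = math.ceil(math.sqrt(n_channels))
    idx = torch.arange(n_channels)
    coords = torch.stack([idx % side, idx // side], dim=1).float()
    return coords / max(side - 1, 1)


class MouseMimeticConv2d(nn.Module):
    # Conv2d whose channel connectivity is restricted to a circular window
    # (hypothetical sketch of the partially connected layer in Fig. 4a;
    # assumes groups=1).

    def __init__(self, in_ch, out_ch, kernel_size, frac_radius=0.4, **kw):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)
        in_xy, out_xy = channel_grid(in_ch), channel_grid(out_ch)
        # Pairwise distances between output- and input-channel positions,
        # measured as a fraction of the node-plane width.
        dist = torch.cdist(out_xy, in_xy)          # shape (out_ch, in_ch)
        mask = (dist <= frac_radius).float()       # 1 = connected, 0 = pruned
        # Broadcast the channel mask over the spatial kernel dimensions.
        self.register_buffer("mask", mask[:, :, None, None])

    def weight_usage(self) -> float:
        # Fraction of channel-to-channel connections kept (cf. %usage in Fig. 4b).
        return self.mask.mean().item()

    def forward(self, x):
        return F.conv2d(x, self.conv.weight * self.mask, self.conv.bias,
                        self.conv.stride, self.conv.padding,
                        self.conv.dilation, self.conv.groups)

In this sketch the fractional radius plays the role of the cone-base radius in panel (a), and weight_usage() reports the resulting fraction of weight parameters in use; the paper works in the opposite direction, computing the base radius from a target %usage.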
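
For the FID scores reported in panels (b)–(d), a generic evaluation could look like the snippet below, which uses torchmetrics' FrechetInceptionDistance; the tensors are placeholders, and the paper's actual evaluation pipeline (image counts, resolution, feature-extractor settings) is not specified here.

import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# normalize=True means images are float tensors in [0, 1], shape (N, 3, H, W).
fid = FrechetInceptionDistance(feature=2048, normalize=True)

real_images = torch.rand(64, 3, 299, 299)   # placeholder for dataset photos
fake_images = torch.rand(64, 3, 299, 299)   # placeholder for generated images

fid.update(real_images, real=True)
fid.update(fake_images, real=False)
print(f"FID: {fid.compute().item():.2f}")   # lower scores indicate better image quality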
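
The statistical comparison in panel (d), a two-sided Welch's t-test per dataset followed by Holm–Bonferroni correction, can be reproduced in outline with SciPy and statsmodels. The FID values below are placeholders, not the reported data.

from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

fid_runs = {
    # dataset: ([mouse-mimetic DDIM runs], [standard DDIM runs]); placeholder values
    "AFHQ cat":    ([12.1, 12.4, 11.9], [13.5, 13.8, 13.6]),
    "CelebA":      ([10.2, 10.5, 10.1], [10.3, 10.4, 10.2]),
    "Birds 525":   ([15.0, 15.3, 14.8], [16.2, 16.5, 16.1]),
    "Cheese Pics": ([20.1, 20.4, 19.9], [21.5, 21.8, 21.3]),
}

datasets, pvals = [], []
for name, (mouse, standard) in fid_runs.items():
    # equal_var=False selects Welch's t-test (unequal variances), two-sided by default.
    _, p = ttest_ind(mouse, standard, equal_var=False)
    datasets.append(name)
    pvals.append(p)

# Holm-Bonferroni correction across the four dataset comparisons.
reject, p_corrected, _, _ = multipletests(pvals, alpha=0.05, method="holm")
for name, p, sig in zip(datasets, p_corrected, reject):
    print(f"{name}: corrected p = {p:.5f}, significant = {sig}")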