Extended Data Fig. 3: Analysis of Nicheformer attention to contextual and gene tokens.
From: Nicheformer: a foundation model for single-cell and spatial omics

A) Attention matrices extracted from the last transformer block of Nicheformer. They show a similar pattern in which almost all attention is paid to the metadata tokens. B) Average attention paid, per layer, to the metadata tokens. A clear trend emerges: the last layers of the model pay, by a large margin, the most attention to the metadata tokens. The analysis is performed on both male and female mouse brain datasets to show that the pattern is consistent. C) Box plots of the distribution of attention paid to contextual tokens (orange) and gene tokens (blue) in the last layers of Nicheformer. The p-values come from Mann-Whitney U tests assessing whether the distributions of attention paid to contextual and gene tokens differ significantly. To control the false discovery rate (FDR), we adjusted the p-values with the Benjamini-Hochberg procedure. D) Box plots of the distribution of attention paid to gene tokens in three groups of layers: early (layers 1 to 5), middle (layers 6 to 9) and late (layers 10 to 12). The p-values come from Mann-Whitney U tests assessing whether the distributions of attention paid to contextual and gene tokens differ significantly. To control the FDR, we adjusted the p-values with the Benjamini-Hochberg procedure.
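
The statistical procedure named in panels C and D can be illustrated with a minimal sketch. It assumes SciPy's `mannwhitneyu` with a two-sided alternative (the caption does not state the sidedness) and a hand-written Benjamini-Hochberg adjustment. The helper `benjamini_hochberg`, the layer numbers and the synthetic attention values are hypothetical placeholders for illustration, not the authors' code or data.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Synthetic stand-in data (assumption): per-cell attention mass on
# contextual vs. gene tokens for three hypothetical late layers.
rng = np.random.default_rng(0)
layers = [10, 11, 12]
contextual = {layer: rng.beta(8, 2, size=200) for layer in layers}
gene = {layer: rng.beta(2, 8, size=200) for layer in layers}

# One Mann-Whitney U test per layer, then adjust all p-values together.
raw_p = [
    mannwhitneyu(contextual[layer], gene[layer], alternative="two-sided").pvalue
    for layer in layers
]
adj_p = benjamini_hochberg(raw_p)

for layer, p_raw, p_adj in zip(layers, raw_p, adj_p):
    print(f"layer {layer}: raw p = {p_raw:.3g}, BH-adjusted p = {p_adj:.3g}")
```

Adjusting all p-values from one panel together, rather than one at a time, is what lets the procedure control the FDR across the whole family of comparisons.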