Fig. 1: Schematic diagram of the CytoCommunity algorithm. | Nature Methods

From: Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes

Given single-cell spatial maps with cell phenotype annotation and cell spatial coordinates, TCN identification is formulated as a community detection problem on graphs. a, The algorithm includes a soft TCN assignment module and a TCN ensemble module. First, a k-NN-based cellular spatial graph is constructed using cell spatial coordinates. Each node represents a cell and its m-dimensional attribute vector (blue) encodes the cell phenotype. m, number of cell phenotypes; n, number of cells. A basic GNN is applied to this cellular spatial graph to obtain a d-dimensional embedding vector (green) for each node. The embedding dimension d is specified by the user. A fully connected neural network is used to transform cell node embeddings into soft TCN assignments (yellow vectors) of nodes, representing the probabilities of cells belonging to each of c TCNs. The number of TCNs, c, is specified by the user. The graph MinCut-based loss function (LMinCut) is used to learn the optimal soft TCN assignments of all nodes. This loss function can be used alone for an unsupervised learning task. In a supervised learning task, differentiable graph pooling, graph convolution and two fully connected layers with the cross-entropy loss function LCE (for sample classification, bordered by a dashed rectangular box) are added on top of the soft TCN assignment module. The overall supervised loss function is a linear combination of LMinCut and LCE with a weight parameter β. In the TCN ensemble module, the soft TCN assignment module can be run multiple times to generate multiple optimal soft TCN assignment matrices. Hard assignment is conducted for each of them, and an ensemble procedure is performed on those hard TCN assignments using a majority vote strategy to determine the final robust TCNs. b, For an unsupervised learning task, CytoCommunity identified TCNs for each tissue sample individually.
c, For a supervised learning task, using a dataset of tissue samples associated with different conditions as the input, CytoCommunity enabled de novo identification of condition-specific TCNs under the supervision of sample labels.
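
The steps described for panel a can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The GNN and fully connected layers are omitted, so the soft assignment matrix is supplied by hand. The loss follows the commonly used MinCutPool formulation (a normalized cut term plus an orthogonality term), which is an assumption about the exact form of LMinCut. The majority vote assumes that TCN labels are already aligned across runs, which a real pipeline would have to ensure. The names knn_graph, mincut_loss and majority_vote are hypothetical.

```python
import numpy as np


def knn_graph(coords, k):
    """Symmetric 0/1 adjacency (n x n) linking each cell to its k nearest cells."""
    n = coords.shape[0]
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrize


def mincut_loss(adj, s):
    """MinCut-style loss for a soft assignment matrix s (n x c).

    Assumed form: -Tr(S^T A S) / Tr(S^T D S) plus an orthogonality penalty.
    """
    deg = np.diag(adj.sum(axis=1))
    cut = -np.trace(s.T @ adj @ s) / np.trace(s.T @ deg @ s)
    ss = s.T @ s
    c = s.shape[1]
    ortho = np.linalg.norm(ss / np.linalg.norm(ss) - np.eye(c) / np.sqrt(c))
    return cut + ortho


def majority_vote(hard_runs):
    """Per-cell majority label over runs; hard_runs has shape (runs, n).

    Assumes labels mean the same TCN in every run (already aligned).
    """
    runs = np.asarray(hard_runs)
    c = runs.max() + 1
    counts = np.stack([(runs == j).sum(axis=0) for j in range(c)], axis=1)
    return counts.argmax(axis=1)


# Toy data: two well-separated groups of three cells each.
coords = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], float)
adj = knn_graph(coords, k=2)

# Hand-written soft assignments standing in for the network output (c = 2).
s_good = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], float)
s_bad = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1]], float)
print(mincut_loss(adj, s_good), mincut_loss(adj, s_bad))

# Hard assignment per run (argmax over TCNs), then the ensemble vote.
hard = s_good.argmax(axis=1)
print(majority_vote([hard, [0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 1, 0]]))
```

In this toy case the assignment that follows the spatial groups gets the lower loss, and the vote recovers the two groups even though two of the three runs each mislabel one cell.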
