Figure 1
From: Northstar enables automatic classification of known and novel cell types from tumor samples

Northstar concept and scalability. (A) Northstar’s input: the gene expression table of the tumor dataset and the cell atlas. Annotated cell type averages are depicted by colored stars, unannotated new cells by green circles. (B) Similarity graph between atlas and new dataset. (C) Clustering the graph assigns cells to known cell types (stars) or new clusters (pink and purple, bottom left and right). Cell types themselves do not split or merge. (D) Typical code used to run northstar. (E) Number of cell types with at least 20 cells in Tabula Muris (FACS data, pink) and Tabula Muris Senis (10×/droplet data, grey), subsampled to different sizes (refs. 2, 11). (F) Memory needed to store the Tabula Muris Senis atlas, subsampled to different sizes as in E, as a full atlas and using the two approaches within northstar. Subsample assumes 20 cells per cell type. Memory for the new dataset to be annotated should be added to this footprint independently of the classification algorithm.
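
The scaling in panel F can be illustrated with a back-of-the-envelope estimate. The sketch below is not northstar code: it assumes a dense gene-by-column matrix with 4-byte values, and the function names and the example numbers (genes, cells, cell types) are hypothetical, chosen only to show how the three footprints relate. It also assumes the two compressed representations are one average column per cell type and a fixed subsample of 20 cells per cell type, as the caption suggests.

```python
def dense_bytes(n_genes, n_columns, bytes_per_value=4):
    """Footprint of a dense matrix: one value per gene per column.

    Assumption: 4-byte (float32) values; real storage may differ.
    """
    return n_genes * n_columns * bytes_per_value


def atlas_footprints(n_genes, n_cells, n_cell_types, cells_per_type=20):
    """Compare three ways of holding an atlas in memory (illustrative only)."""
    return {
        # Every atlas cell is kept as its own column.
        "full": dense_bytes(n_genes, n_cells),
        # A fixed number of cells is kept per cell type.
        "subsample": dense_bytes(n_genes, n_cell_types * cells_per_type),
        # One averaged expression profile is kept per cell type.
        "averages": dense_bytes(n_genes, n_cell_types),
    }


# Hypothetical sizes, not taken from the figure.
footprints = atlas_footprints(n_genes=20_000, n_cells=100_000, n_cell_types=50)
for name, size in footprints.items():
    print(f"{name}: {size / 1e6:.1f} MB")
```

Under these assumptions the full atlas grows with the number of cells, while the two compressed forms grow only with the number of cell types, which is why their footprint stays nearly flat as the atlas is subsampled to larger sizes.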