Fig. 2: Comparing four segmentation strategies.

a Graphic describing the segmentation strategies performed in CellProfiler. In the pixel expansion strategy, identified nuclei are expanded by a selected number of pixels to create a cell mask. The propagation strategy uses both the identified nuclei and a membrane probability map generated by pixel classification in Ilastik to create a cell mask, using the propagation thresholding parameter in CellProfiler. In the sequential segmentation pipeline, each step uses custom size threshold settings for nucleus detection, as well as cell type-specific markers to generate the membrane probability maps in Ilastik. A propagation step, as described above, is subsequently used to find cell boundaries in steps 1, 2, 4, 5 and 7. At every step, the identified objects are subtracted from the total remaining nuclei. Any cells remaining at step 9 are segmented using one-pixel expansion. Steps 3, 6, 8 and 10 describe the segmentation of tissue domains, based on probability maps in Ilastik. b Representative false colour images of F4/80 (green), CD4 (yellow) and αSMA (red) markers merged with a nuclear stain (blue) and cropped (top row). Cell mask outlines (white) overlaid onto these markers were generated by the three-pixel expansion strategy (second row), the propagation strategy (third row), the one-pixel expansion strategy (fourth row) or sequential segmentation (bottom row). Image processing for visualisation: outliers were removed and a median filter of 0.5-pixel radius was applied in Fiji. c Heatmap showing enrichment of markers as a result of each segmentation strategy. Enrichment was defined as the relative expression of each marker listed on the x-axis in the top 500 cells for that marker, compared to the expression of the same marker in the rest of the cells in the dataset.
d Heatmap depicting the relative amount of noise in each segmentation strategy, assessed as the expression of the key identifying marker relative to markers that would not be expressed on the same cell but may be found in its direct proximity within the tissue, and whose presence would therefore indicate signal bleed from adjacent cells (“signal/noise”). e Percentage of cells from the manually annotated dataset that were matched with cells in the top 500 for each marker, compared between segmentation strategies. Size of annotated datasets: CD103+ DCs, n = 65; CD4+ T cells, n = 92; CD8+ T cells, n = 70; F4/80+ macrophages, n = 108; αSMA+ fibroblasts, n = 130. f Graphic showing the result of the domain segmentation performed as part of the sequential segmentation method: red = normal tissue, purple = tumour, green = interface, cyan = structural domain. Violin plot: quantification of two key markers, PECAM and CD44, used as the basis for the domain segmentation. px pixel.
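
The enrichment measure in panel c can be illustrated with a minimal sketch. The caption says only that expression in the top 500 cells is compared with expression in the remaining cells; the ratio of mean intensities used here is an assumption, and the function name `top_n_enrichment` is hypothetical rather than taken from the authors' pipeline.

```python
import numpy as np


def top_n_enrichment(expression, n_top=500):
    """Ratio of mean marker expression in the n_top brightest cells
    to the mean expression in all remaining cells.

    Assumption: "relative expression" is read as a ratio of means;
    the caption does not state the exact statistic.
    """
    values = np.asarray(expression, dtype=float)
    if not 0 < n_top < values.size:
        raise ValueError("n_top must be between 1 and the number of cells - 1")

    # Rank cells by this marker's per-cell intensity, highest first.
    ranked = np.sort(values)[::-1]
    top_mean = ranked[:n_top].mean()
    rest_mean = ranked[n_top:].mean()

    # Guard against division by zero when the remaining cells have no signal.
    if rest_mean == 0:
        return float("inf")
    return top_mean / rest_mean
```

Applied to the per-cell mean intensities of one marker under each segmentation strategy, a higher value would indicate that the strategy concentrates that marker's signal in fewer, better-defined cells.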