Fig. 6: Evaluation of PHLEX segmentation performance compared to standard approaches.
From: Deep cell phenotyping and spatial analysis of multiplexed imaging with TRACERx-PHLEX

a Performance of deep-imcyto nuclear segmentation, scored by the instance segmentation similarity metric (Al-Kofahi et al.), Dice, precision and recall, compared with other publicly available methods: Mesmer, Cellpose, the Stardist “versatile” model and Stardist trained on DSBowl 2018 data, as well as a Stardist model retrained with the TRACERx nuclear IMC segmentation dataset (TRACERx NISD). Each score is calculated per image, and the test dataset covers 6453 nuclei across n = 16 TRACERx NISD images, which were not included in the training of any of the models. Significance values indicate the results of a two-tailed Mann–Whitney U test. b Heatmap summary of the mean segmentation performance for each metric shown in Supplementary Fig. 8. The upper panel shows scores for which higher values indicate better performance. The lower two panels show scores for which lower values indicate better performance. *Bijective cardinality was normalised by the total possible number of correct detections in the test dataset. c Qualitative comparison between the deep-imcyto simple segmentation workflow (1 pixel dilation) and Mesmer, as well as the MCCS procedure run in deep-imcyto’s CellProfiler mode. All methods perform well at identifying cellular material; however, MCCS captures challenging cell morphologies and identifies non-nucleated stromal cell content (αSMA, putative fibroblasts, in yellow and CD31, endothelial cells, in teal). Five example tiles from five different tissue cores (three tumour, one benign tumour-adjacent, one lymph node) from the TRACERx 100 study (T cells & Stroma antibody panel) are shown. All tiles are 256 × 256 µm, scale bar = 75 µm. Box plots in (a) show lower and upper quartile values, and whiskers extend up to 1.5 × IQR above and below the quartiles. Source data are provided as a Source Data file.
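As an illustration of the per-image scores and box-plot whiskers described for panel (a), the following is a minimal sketch, not the authors' evaluation code. It assumes pixel-level scoring on binary masks (the paper's precision and recall may instead be computed per detected nucleus) and assumes the inclusive quartile convention; the function names `pixel_scores` and `whisker_limits` are hypothetical.

```python
from statistics import quantiles


def pixel_scores(pred, truth):
    """Return (dice, precision, recall) for two equally sized binary masks.

    Assumption: masks are nested lists of 0/1 values and scoring is pixel-level.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # predicted foreground, truly foreground
            elif p:
                fp += 1  # predicted foreground, truly background
            elif t:
                fn += 1  # missed foreground pixel
    # Two empty masks agree perfectly, so Dice is defined as 1.0 in that case.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return dice, precision, recall


def whisker_limits(scores):
    """Return the (lower, upper) limits at 1.5 x IQR beyond the quartiles.

    Whiskers in a box plot extend to the most extreme data points that
    still fall inside these limits.
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Toy example: one 2 x 3 prediction scored against its ground truth.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
dice, precision, recall = pixel_scores(pred, truth)
```

Computing `pixel_scores` once per test image yields one value per image and metric, which is the kind of per-image distribution a box plot such as panel (a) summarises.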