UNSEG: unsupervised segmentation of cells and their nuclei in complex tissue samples

Kochetov, Bogdan; Bell, Phoenix D.; Garcia, Paulo S.; Shalaby, Akram S.; Raphael, Rebecca; Raymond, Benjamin; Leibowitz, Brian J.; Schoedel, Karen; Brand, Rhonda M.; Brand, Randall E.; Yu, Jian; Zhang, Lin; Diergaarde, Brenda; Schoen, Robert E.; Singhi, Aatur; Uttam, Shikhar

doi:10.1038/s42003-024-06714-4


Article
Open access
Published: 30 August 2024

Communications Biology volume 7, Article number: 1062 (2024)

Abstract

Multiplexed imaging technologies have made it possible to interrogate complex tissue microenvironments at sub-cellular resolution within their native spatial context. However, proper quantification of this complexity requires the ability to easily and accurately segment cells into their sub-cellular compartments. Within the supervised learning paradigm, deep learning-based segmentation methods demonstrating human level performance have emerged. However, limited work has been done in developing such generalist methods within the unsupervised context. Here we present an easy-to-use unsupervised segmentation (UNSEG) method that achieves deep learning level performance without requiring any training data via leveraging a Bayesian-like framework, and nucleus and cell membrane markers. We show that UNSEG is internally consistent and better at generalizing to the complexity of tissue morphology than current deep learning methods, allowing it to unambiguously identify the cytoplasmic compartment of a cell, and localize molecules to their correct sub-cellular compartment. We also introduce a perturbed watershed algorithm for stably and automatically segmenting a cluster of cell nuclei into individual nuclei that increases the accuracy of classical watershed. Finally, we demonstrate the efficacy of UNSEG on a high-quality annotated gastrointestinal tissue dataset we have generated, on publicly available datasets, and in a range of practical scenarios.

Introduction

Recent innovations in highly multiplexed immunofluorescence imaging^{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15} have substantially increased the range of antigens that can be spatially profiled in a tissue sample, from 3–5 targets to ~60 (see ref. ¹⁶). Segmentation is a required step for quantitatively associating their spatial expressions with individual cells. Since 2012, when AlexNet¹⁷, a deep convolutional neural network (CNN), outperformed other methods in the ImageNet classification challenge, there has been a paradigm shift towards using CNN-based deep learning (DL) frameworks¹⁸ trained on curated datasets for cell and nucleus segmentation tasks^{19,20,21,22,23,24,25,26,27,28}. Among them, Cellpose²⁵—a DL method based on a U-Net architecture utilizing gradient flow representation of cells—and Mesmer²⁶—a DL method based on ResNet50 architecture—have demonstrated human-level performance in the highly multiplexed imaging context. However, due to their dependence on stochastic gradient descent and back-propagation-based optimization during the training step, it remains difficult to identify the contribution of each neuron to the eventual segmentation outcome, and as a consequence explain the source of errors in segmentation when they occur²⁹. As a result, improving performance of these black-box DL models requires rewiring the input–output mapping via training on additional datasets³⁰. However, in complex tissue samples with considerable heterogeneity and ambiguity in cellular organization, it is unclear whether retraining alone will consistently improve results across all samples, or if multiple DL models need to be constructed and used through a trial and error approach, with the hope that their performance will optimally generalize. Curation of accurately annotated datasets of sufficient quality that capture the tissue microenvironment diversity also remains a critical challenge.

In contrast to DL approaches, most unsupervised cell segmentation methods^{31,32,33,34,35,36,37,38,39,40,41,42,43} do not require training data, are explainable, and therefore, where needed, can be optimized for individual images. However, to the best of our knowledge, no unsupervised segmentation method capable of approaching DL method performance has been reported in the literature to date. Here, we present a new unsupervised segmentation algorithm (UNSEG) capable of performing sub-cellular segmentation of tissue sample images with accuracy on par with state-of-the-art DL segmentation approaches such as Cellpose and Mesmer. UNSEG achieves this performance in two stages. In the first stage, UNSEG quantifies the intrinsic contrast provided by any nucleus and cell membrane-specific markers at the local and global scale, and jointly exploits it to assign each pixel to the nucleus, cell membrane, or background class. This pixel assignment is implemented with the help of a Bayesian-like framework that computes a priori distributions and an image contrast-based likelihood function to estimate the posterior probabilities of each pixel belonging to the nucleus, cell membrane, or background class. UNSEG uses the posterior probabilities to assign the pixel to the correct compartment. In the second stage, it parses the semantic pixel assignments into topologically consistent nuclei and cells. Towards this goal, UNSEG introduces a perturbed watershed algorithm to correctly partition a nucleus cluster into individual nuclei. The final outputs of UNSEG are the nucleus and cell segmentations corresponding to the input image.

We have curated a labeled gastrointestinal tissue (GIT) dataset comprising diverse images of gastrointestinal tissue to benchmark UNSEG performance. We anticipate that this dataset will also be useful to DL researchers and the broader research community and help ameliorate the shortage of annotated imaging datasets³⁰. We have also tested UNSEG performance on public datasets, with images drawn from diverse tissue types and diseases beyond the gastrointestinal system, that have been labeled with different nucleus and cell membrane markers and acquired at different magnifications and resolutions. In addition, we demonstrate the applicability of UNSEG in a variety of real-world cases that include weakly expressing markers, non-specific markers, different nucleus markers, and multiplexed ion beam imaging (MIBI). In the context of these diverse scenarios, we also discuss how quantification of segmentation accuracy can potentially be biased depending on how the segmentation mask deviates from the ground truth. Finally, we note that since UNSEG does not require any training data to segment tissue images, it can be used to generate high-quality segmentations of unlabeled tissue images, which make up the majority of the data in real-world settings, as optimized initial estimates for improving DL models within unsupervised and semi-supervised settings. UNSEG, therefore, is an easy-to-use method for unsupervised sub-cellular segmentation of images of complex tissue samples that does not require extensive setup and performs on par with state-of-the-art DL methods. It also has the potential to improve the state-of-the-art in deep learning.

Results

UNSEG principle and design

Segmenting cells and nuclei in 2D images of tissue samples is challenging because of their complex morphology, ambiguous overlaps, and heterogeneity in the spatial distribution of nucleus and cell membrane markers within each cell. In the morphological context, although cells and their nuclei exhibit an overall convex topology, they locally deviate from it to varying degrees depending on cell type, particularly in tumors with irregularly shaped cancer cells. In addition, depending on the tissue, many cells are clumped in clusters where their shapes and overlaps are difficult to parse. Cells in tissues also exhibit uneven intra-cellular distribution of marker expression. Together, these degrees of complexity make it difficult to consistently segment cells and nuclei using unsupervised segmentation approaches such as classical watershed^{31,32,38}, shape and intensity priors^{36,37,39,40,41}, and tracking of diffused gradient flow^{33,34}, which have primarily been developed for segmenting cells in culture that lack the tissue-associated heterogeneity in cellular morphology, expression, and overlap. The UNSEG framework overcomes these limitations by jointly exploiting the expression-based topology and distribution of markers specific to nuclei and cell membranes (Fig. 1). Such markers are also used in the supervised context of DL methods, such as Cellpose and Mesmer.

UNSEG combines a priori probability of each image pixel belonging to a nucleus or cell membrane (Fig. 1a) with a contrast-based likelihood function (Fig. 1b), to compute a posteriori semantic segmentation of image pixels into nucleus and cell membrane (Fig. 1c). UNSEG performs this segmentation both at the global level of the entire image, and at the local level in a neighborhood around each pixel (Fig. 1c). The local segmentation captures the local heterogeneity in nucleus and cellular morphology, while the global segmentation ensures that the overall topological structure of the nuclei and cell membranes is preserved across the entire image. The final step of UNSEG utilizes these local and global nucleus and cell semantic masks to obtain instance segmentation of individual nuclei (Fig. 1d) and cells (Fig. 1e). This step includes partitioning nucleus clusters into individual nuclei based on convexity analysis, perturbed watershed and its ancillary function we refer to as virtual cuts. The latter two are briefly described below. The details of each step are described in “Methods”.
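The combination of a prior with a likelihood described above can be illustrated with a minimal sketch. The class names follow the text, but the numeric priors and likelihoods are made-up values for a single hypothetical pixel, and the functions `posterior` and `assign_pixel` are illustrative names; the actual distributions and likelihood function UNSEG uses are those defined in “Methods”, not the ones shown here.

```python
# Illustrative sketch only: the numbers below are invented for one hypothetical
# pixel and are NOT the priors or likelihoods that UNSEG derives in "Methods".
CLASSES = ("nucleus", "membrane", "background")


def posterior(prior, likelihood):
    """Combine per-class prior and likelihood into a normalized posterior."""
    unnormalized = {c: prior[c] * likelihood[c] for c in CLASSES}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return {c: v / total for c, v in unnormalized.items()}


def assign_pixel(prior, likelihood):
    """Return the class with the highest posterior probability."""
    post = posterior(prior, likelihood)
    return max(CLASSES, key=lambda c: post[c])


# Hypothetical pixel: the prior mildly favors background, but the
# contrast-based likelihood strongly favors nucleus.
prior = {"nucleus": 0.3, "membrane": 0.2, "background": 0.5}
likelihood = {"nucleus": 0.8, "membrane": 0.1, "background": 0.1}

label = assign_pixel(prior, likelihood)
print(label, posterior(prior, likelihood))
```

In this toy case the likelihood outweighs the prior and the pixel is assigned to the nucleus class. Per the text, UNSEG evaluates such assignments at both the global and the local scale; the sketch shows only a single evaluation.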

Perturbed watershed

Classical watershed-based segmentation^{44,45} identifies individual nuclei in a cluster as watersheds, with each watershed basin representing a nucleus in the cluster. However, heterogeneity in the spatial distribution of nucleus marker can make it difficult to uniquely identify the individual basins. Cellpose overcomes this problem in the supervised context by developing a gradient flow field representation of each nucleus whose ground truth is annotated by a human user²⁵. This representation provides a stable and unique representation of nucleus basins. In the unsupervised context, we have developed a perturbed watershed approach (Fig. 2 and “Methods”), where the initial watershed-based segmentation (Fig. 2i) of the nucleus cluster into individual nuclei is perturbed (Fig. 2j–m) based on an adaptive distance-transform estimate (Fig. 2h) computed from the global nucleus cluster (Fig. 2d), and local topology of the cell membrane network (Fig. 2e). Nuclei that are correctly segmented remain stable to the perturbations, while spuriously segmented nuclei collapse to a point-like object with area not exceeding a few pixels. When applied recursively, perturbed watershed partitions the nucleus cluster into individual nuclei. An example of a two-nuclei cluster is shown in Fig. 2. Initial watershed partitions the cluster into three nuclei (Fig. 2i), one of which shrinks to a point object on perturbation of the watershed seed point. The perturbation is performed in four directions: up, down, left, and right. In this example, the unstable nucleus collapsed for three of those perturbations (up, down, and left), indicating that the seed point is unstable and the corresponding segmentation is a spurious nucleus. Therefore, it is removed and the correct watershed-based segmentation (Fig. 2n) is obtained using the two remaining stable seed points and the original distance transform (Fig. 2g). 
We note that the perturbed watershed algorithm does not make any assumptions specific to the fluorescence-based imaging modality. It is, in fact, agnostic to the imaging modality and can be used to improve classical watershed results, wherever the latter method is applicable.
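The seed stability test described above can be sketched as a simple vote over the four perturbation directions. This is only an illustration of the decision step: the function name `is_spurious_seed`, the area threshold `max_point_area`, and the vote count `min_collapses` are hypothetical choices, the areas are invented, and re-running watershed on the adaptive distance-transform estimate after each perturbation is outside the sketch.

```python
# Hypothetical decision rule. The area threshold and the number of collapses
# required are assumptions made for illustration, not values taken from UNSEG.
DIRECTIONS = ("up", "down", "left", "right")


def is_spurious_seed(perturbed_areas, max_point_area=5, min_collapses=3):
    """Flag a watershed seed as spurious.

    perturbed_areas maps each perturbation direction to the pixel area of the
    seed's region after the seed is shifted in that direction and watershed is
    re-run (the re-run itself is not part of this sketch).
    """
    collapses = sum(
        1 for d in DIRECTIONS if perturbed_areas[d] <= max_point_area
    )
    return collapses >= min_collapses


# Invented areas that mirror the two-nuclei example: one seed collapses to a
# point-like region for up, down, and left; the other stays stable throughout.
unstable = {"up": 2, "down": 1, "left": 3, "right": 410}
stable = {"up": 395, "down": 402, "left": 398, "right": 400}

print(is_spurious_seed(unstable), is_spurious_seed(stable))
```

A seed flagged this way would be dropped and watershed repeated with the remaining seeds, as in the example of Fig. 2.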

Virtual cuts

In some cases, mostly when a cell membrane marker is not present, the initial watershed segmentation step might undersegment the cluster. For such cases, we have developed the virtual cuts method, which utilizes the non-convex topology of the cluster to identify nucleus centroids that act as seed points for the watershed algorithm. See “Methods” for implementation details.

New dataset for benchmarking segmentation performance

As part of our UNSEG development, we have curated 75 tiff images of tissue sections from eight organs of the extended human gastrointestinal system—appendix, colon, esophagus, gallbladder, liver, pancreas, small intestine, and stomach. The immunofluorescence images were acquired via imaging of formalin-fixed paraffin-embedded (FFPE) tissue sections labeled using Hoechst and fluorescent-dye-conjugated Na⁺K⁺ATPase as respective markers for cell nuclei and membranes (see “Methods”). The image dimensions are 1000 × 1000. The images were acquired using a 0.95 numerical-aperture objective with 40× magnification, and have a pixel pitch of 0.16 μm/pixel. Our gastrointestinal tissue (GIT) dataset includes images of normal tissues as well as tissues related to chronic inflammation, cancer precursor lesions, and cancer. These images capture a wide range of tissue organization from samples with sparsely located cells to those with very high cell density. Figure 3 shows 12 representative images from the GIT dataset.

**Fig. 3: Gastrointestinal tissue (GIT) dataset.**

Expert pathologists independently annotated the 75 images resulting in ground truth with 16,201 nuclei and 16,217 cells. These annotations were performed manually, without any algorithmic aid, to truly reflect human performance. The detailed description of the dataset is presented in Supplementary Table 1 and Supplementary Fig. 1, while the nuclei and cell annotations of 12 representative images are shown in Supplementary Fig. 2. To annotate nuclei and cells in the 75 images, we developed Cellthon—a Python-based graphical user interface for annotating cells and their nuclei in tissue images.

We used the GIT dataset to benchmark UNSEG performance. Moreover, we anticipate that this dataset will also serve as a resource for researchers requiring annotated datasets for future algorithm development and testing³⁰.

UNSEG benchmarking using GIT and publicly available datasets

We used GIT and publicly available datasets to benchmark the segmentation performance of UNSEG with respect to Cellpose and Mesmer, the two state-of-the-art DL methods that have consistently demonstrated good performance in segmenting immunofluorescence imaging data, particularly in the context of highly multiplexed imaging^{25,26}. To perform the comparison with Cellpose, we used Cellpose version 2.1.0. In this version, we chose the nuclei and TN2 models from the Cellpose “model zoo” to segment nuclei and cells, respectively. Our choice was based on them giving the best segmentation results for the GIT dataset in comparison to all other Cellpose models. We used the Cellpose size calibration procedure to estimate the cell diameter for each of the 75 images in our dataset. For Mesmer, we used the model in DeepCell 0.12.6 and set the model parameter image_mpp to the pixel pitch in microns per pixel for our imaging dataset. Benchmarking was performed by computing the F₁ score (Eq. (7)) as a function of intersection over union (IoU) threshold⁴⁶. IoU quantifies the degree of overlap between an algorithm prediction and the annotated ground truth. It is bounded between 0 and 1, with one indicating perfect overlap. By computing the F₁ score over the IoU range, we obtain the F₁ accuracy curve for each method (see “Methods” for more details).
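An F₁ accuracy curve of the kind described above can be sketched as follows. The sketch assumes the standard definition F₁ = 2TP / (2TP + FP + FN), which is presumed, not confirmed, to match Eq. (7), and it uses a greedy one-to-one matching of predictions to ground-truth objects, which is a common convention and not necessarily the one used in the paper. Objects are represented as sets of pixel coordinates, and the helper `square` and the toy masks are invented for illustration.

```python
def iou(a, b):
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def f1_at_threshold(truth, pred, threshold):
    """F1 score where a prediction is correct if its IoU >= threshold.

    Matching is greedy and one-to-one, best-overlapping pairs first
    (an assumed convention).
    """
    pairs = sorted(
        ((iou(t, p), i, j) for i, t in enumerate(truth) for j, p in enumerate(pred)),
        reverse=True,
    )
    used_truth, used_pred, tp = set(), set(), 0
    for score, i, j in pairs:
        if score < threshold or score == 0:
            break  # remaining pairs overlap even less
        if i not in used_truth and j not in used_pred:
            used_truth.add(i)
            used_pred.add(j)
            tp += 1
    fp = len(pred) - tp   # predictions without a matching ground-truth object
    fn = len(truth) - tp  # ground-truth objects that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def f1_curve(truth, pred, thresholds):
    """F1 score evaluated at each IoU threshold."""
    return [f1_at_threshold(truth, pred, t) for t in thresholds]


def square(r0, c0, size):
    """Toy mask: a size x size square with top-left corner (r0, c0)."""
    return frozenset((r, c) for r in range(r0, r0 + size) for c in range(c0, c0 + size))


# Toy data: one exact match, one prediction shifted by a pixel, one spurious object.
truth = [square(0, 0, 4), square(10, 10, 4)]
pred = [square(0, 0, 4), square(10, 11, 4), square(20, 20, 2)]
curve = f1_curve(truth, pred, [0.5, 0.7, 0.9])
print(curve)  # [0.8, 0.4, 0.4]
```

The shifted prediction has an IoU of 12/20 = 0.6, so it counts as a match at the 0.5 threshold but not at 0.7 or 0.9, which is why the curve drops.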

Figure 4a shows UNSEG, Cellpose, and Mesmer segmentation results applied to four representative examples from our 75 image GIT dataset. Visual comparison shows similar performance between the different methods. One difference between UNSEG and the other two methods is that, although UNSEG does implement boundary smoothing, it does not enforce strict shape constraints. As a consequence, the shapes of UNSEG-based nucleus and cell segmentations are more irregular, but also more realistic and less synthetic in appearance, than those of Cellpose and Mesmer.

**Fig. 4: Comparison of UNSEG, Cellpose, and Mesmer on four example images from the GIT dataset.**

The F₁ curves for the four examples (Fig. 4b) demonstrate that UNSEG performance is similar to that of the DL methods trained on about a million cells. The ground truth annotations for these four examples are shown in Supplementary Fig. 2.

The similarity in their performance on the four example images generalizes to the entire GIT dataset. The results are shown in Fig. 5. The first row depicts the median F₁ curves corresponding to nucleus and cell segmentation by the three methods. The curves indicate that the three methods have similar segmentation performance. For cell segmentation, the median UNSEG performance is slightly below the other two methods, which is partly due to the conservative nature of UNSEG cell segmentation in resolving cell boundary ambiguity in cases where the tissue section captures partial cell membranes without their respective nuclei. In these cases, UNSEG does not always include their segmentation masks in the final results (also see the “F₁ score and accuracy” section below). Nevertheless, the pairwise comparison of the 95% F₁ confidence intervals of UNSEG with those of Cellpose and Mesmer (the second and third rows of Fig. 5, respectively) shows almost complete overlap, indicating overall similar performance. A more detailed version of Fig. 5 is presented in Supplementary Fig. 3. We note that we used the same UNSEG parameters to segment all 75 images in the GIT dataset and did not optimize them for every image, even though this ability is a strength of UNSEG and doing so would have boosted its performance. The rationale for eschewing this adjustment was to demonstrate that our probabilistic reinterpretation of the two-channel image through a Bayesian lens provides UNSEG with robustness and performance stability, and prevents it from being brittle and requiring continuous adjustment. We additionally note that this is unlike our characterization of Cellpose performance, where we adjusted its size parameter for every image. Therefore, our performance curves are biased towards Cellpose. The UNSEG parameter values we used for the GIT dataset are listed in Supplementary Table 2 and discussed in “Methods”.

**Fig. 5: Performance comparison of UNSEG, Cellpose, and Mesmer for the entire GIT dataset.**

Furthermore, we benchmarked the segmentation performance of UNSEG with respect to Cellpose and Mesmer using publicly available, multiplexed imaging tissue datasets acquired using CODEX, Vectra, and Zeiss imaging platforms^{47,48}. Supplementary Figs. 4–6, respectively, show the cell segmentation performance of UNSEG, Cellpose, and Mesmer on the CODEX, Vectra, and Zeiss datasets. The CODEX dataset comprises ten 400 × 400 images of lymph nodes and tonsils. For our benchmarking, we chose CD20 and CD45RO as cell membrane markers to demonstrate the ability of UNSEG to work with different cell membrane markers. These images were acquired using an objective with 20× magnification and an imaging sensor with a pixel pitch of 0.3774 μm/pixel^{47,48}. Supplementary Fig. 4a depicts an example image of a lymph node from the CODEX dataset, along with its ground truth cell annotation, the cell segmentation predicted by UNSEG, Cellpose, and Mesmer, and their corresponding F₁ score-based performance curves. Due to the high cell density, lymph node samples are typically difficult to segment. This example provides a clear visual and quantitative demonstration of UNSEG performing segmentation on par with Cellpose and Mesmer. Supplementary Fig. 4b further shows that the quality of UNSEG performance extends to the entire CODEX dataset.

Similarly, Supplementary Figs. 5 and 6 compare the performance of UNSEG cell segmentation with that of Cellpose and Mesmer for the Vectra and Zeiss datasets^{47,48}, respectively. The Vectra dataset includes 131 tissue images of size 400 × 400 from a range of pathologic diseases that include lung adenocarcinoma, extramammary Paget disease, pancreatic ductal adenocarcinoma, lung small cell carcinoma, colon adenocarcinoma, Hodgkin lymphoma, breast ductal carcinoma, serous ovarian carcinoma, squamous cell carcinoma, Merkel cell carcinoma, and squamous mucosa. The Zeiss dataset consists of nineteen tissue images of size 800 × 800, acquired from tissue sections of cutaneous T-cell lymphoma, pancreatic adenocarcinoma, basal cell carcinoma, and melanoma. Both the Vectra and Zeiss datasets were acquired using 20× magnification objectives; however, the pixel pitches of the imaging sensors were 0.5 μm/pixel and 0.325 μm/pixel, respectively^{47,48}. Although UNSEG performs stable and high-quality segmentation, faithfully capturing cell shapes, its F₁ score-based performance is upper bounded by Cellpose and Mesmer. This is partly due to the tendency of the annotated ground truth to have, on average, a smaller cell size than the Cellpose and Mesmer estimates, which tends to favor their F₁ scores (also see the “F₁ score and accuracy” section below). We found this to be particularly true for the Vectra dataset. For this dataset, it was also difficult to find cell membrane markers that were appropriately imaged across the different images. We therefore utilized pan-cytokeratin, a cytoplasmic marker, for cell segmentation. Since UNSEG has been developed to utilize nucleus and cell membrane markers for unsupervised segmentation, and not nucleus and cytoplasm markers, we did expect reduced performance. However, the quality of UNSEG segmentation remained remarkably robust, despite the expected reduction in UNSEG F₁ score values.

Applicability of UNSEG to different practical scenarios

We also tested UNSEG performance in multiple different practical scenarios.

1. Weakly expressing cell membrane marker: We identified a tissue image of human skin with dermatofibrosarcoma acquired from a publicly available CODEX dataset¹³, which is a different dataset from the one discussed above. This image has weakly expressing Na⁺K⁺ATPase as the cell membrane marker. Hoechst is the nucleus marker. The image size is 1440 × 1440 pixels. It was acquired using an objective with a 20× magnification and a sensor with a pixel pitch of 0.377 μm/pixel. As shown in Supplementary Fig. 7, UNSEG demonstrates stable and robust segmentation performance with a weakly expressing membrane marker. As this dataset lacked annotations, we did not compute the F₁ curve, but as the figure demonstrates, a visual, qualitative assessment of UNSEG segmentation compares favorably with Cellpose and Mesmer.
2. Using a non-specific cell membrane marker to segment cells: In Supplementary Fig. 5, using the Vectra dataset, we demonstrated that UNSEG is robust to using cytoplasmic markers for cell segmentation. To further test the wide applicability of UNSEG, we replaced weakly expressing Na⁺K⁺ATPase with Hyaluronan, which can localize not only to the cell membrane but also to the cytoplasm and the extracellular matrix. We used Hoechst as the nucleus marker. Supplementary Fig. 8 shows that UNSEG performs high-quality nucleus and cell segmentation, which also compares favorably with generalist methods like Cellpose and Mesmer.
3. DRAQ5 as the nucleus marker: We next switched Hoechst with DRAQ5 as the marker for the nucleus, while keeping Hyaluronan as the cell membrane marker. Supplementary Fig. 9 shows that UNSEG continues to provide high-quality segmentation.
4. Applying UNSEG to multiplexed ion beam imaging (MIBI): We also tested UNSEG sub-cellular segmentation performance on nuclei and cells in a placental tissue image acquired using MIBI, an alternative multiplexed imaging technology^{6,8}. The image was downloaded from the Human BioMolecular Atlas Program (HuBMAP) database⁴⁹. The image size is 2048 × 2048, with a pixel pitch of 0.391 μm/pixel. Due to the lack of clearly identified annotation, Supplementary Fig. 10 does not show the F₁ curves, but does provide a visual comparison of UNSEG, Cellpose, and Mesmer performance. As before, UNSEG performance continues to be on par with deep learning methods.

F₁ score and accuracy

F₁ is a well-established score for assessing segmentation accuracy. It simultaneously accounts for the proportion of correctly segmented objects and their pixel-wise matching with ground truth object profiles⁴⁶. However, as we show in Supplementary Fig. 11, F₁ score is biased depending on how the estimated segmentation mask deviates from the ground truth. Specifically, F₁ value is higher if the size of the estimated segmentation mask is larger than the ground truth, as compared to when it is smaller. In fact, as shown in Supplementary Fig. 11, the former upper bounds the latter. Both Cellpose and Mesmer, on average, have larger cell segmentation mask estimates when compared to UNSEG. This is a contributory factor towards the higher median F₁ scores for Cellpose and Mesmer, even when segmentation results from all three methods are reasonable. Supplementary Fig. 4 exemplifies this point. There, even though cell segmentation results from all three methods are reasonable, UNSEG has a slightly lower F₁ curve, due to it being conservative in estimating cell size, as is discussed above in the subsection on UNSEG benchmarking.
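One simple way to see why a larger mask can be favored is to compare the IoU of two nested masks that miss the ground-truth area by the same number of pixels in opposite directions. The sketch below assumes the prediction either fully contains or is fully contained in the ground truth; this is an illustrative setting and not necessarily the construction used in Supplementary Fig. 11. The areas are invented.

```python
def iou_nested(area_truth, area_pred):
    """IoU when one mask fully contains the other (nested masks).

    The intersection is the smaller mask and the union is the larger one.
    """
    return min(area_truth, area_pred) / max(area_truth, area_pred)


area_truth = 100  # hypothetical ground-truth area in pixels
delta = 20        # same absolute area error in both directions

iou_larger = iou_nested(area_truth, area_truth + delta)   # prediction too large
iou_smaller = iou_nested(area_truth, area_truth - delta)  # prediction too small

# Algebraically, for ground-truth area A and error d with 0 < d < A:
#   A / (A + d) - (A - d) / A = d**2 / (A * (A + d)) > 0
print(iou_larger, iou_smaller)
```

Here the enlarged mask scores 100/120 ≈ 0.83 and the shrunken mask scores 80/100 = 0.80, so at an IoU threshold between the two values only the enlarged mask would count as a correct match and contribute to F₁.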

UNSEG characteristics and use case

UNSEG employs an integrated approach to segmenting nuclei and cells that, by design, emphasizes internal consistency between each cell nucleus and its membrane. As a consequence, UNSEG guarantees that no segmented nucleus can be located beyond the boundaries of its cell. Such inconsistencies are often found in both Cellpose and Mesmer results, where nucleus and cell segmentations are performed independently. Figure 6a depicts a small intestine tissue section illustrating the internal inconsistency in nucleus and cell boundaries estimated by Cellpose and Mesmer for a pair of examples highlighted with dashed boxes. For the region marked as 1, the larger nucleus is located in two cells in the case of Cellpose, while two cells share the same nucleus in the case of Mesmer. For the region marked as 2, the nucleus extends beyond the boundary of its cell in the case of Cellpose. UNSEG avoids such discrepancies due to its joint segmentation of nuclei and cells. This joint processing ensures that UNSEG can unambiguously identify the cytoplasmic compartment of cells. The internal consistency among sub-cellular compartments is of particular importance in biological studies where correct sub-cellular localization of signaling pathway components is essential to study intra-cellular signaling. For example, tumor protein P53 can be sequestered in the cytoplasm, or localized in the nucleus, depending on DNA damage and other exogenous and endogenous stresses. However, in unstressed cells, it is expressed at low levels and localizes in both the cytoplasm and the nucleus⁵⁰. As another example, histone methyltransferase EZH2 localizes in the nuclei, where it regulates gene expression through its canonical histone lysine methyltransferase activity⁵¹. Supplementary Fig. 12 depicts an example of such a real use case, where UNSEG is used in a multiplexed imaging context to segment cells and their nuclei based on Hoechst and Na⁺K⁺ATPase. 
The UNSEG-based segmentation is used to localize intra-cellular P53 and EZH2 expression in a region of healthy colon tissue with densely located cells (see “Methods”). The internal consistency of UNSEG segmentation ensures that the user is correctly able to evaluate P53 expression in the nucleus and the cytoplasm, while ensuring that the canonical activity of EZH2 in the healthy tissue is not associated with the cytoplasm.
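The internal consistency property discussed above can be checked on the output of any method with a short sketch like the following. It treats each nucleus and cell as a set of pixel coordinates, reports nuclei that are not fully contained in exactly one cell, and derives the cytoplasm as the cell pixels outside the nucleus. The function names `check_consistency` and `cytoplasm`, the helper `block`, and the toy masks are hypothetical and for illustration only; they are not part of UNSEG.

```python
def check_consistency(nuclei, cells):
    """Return nuclei that are not fully contained in exactly one cell.

    nuclei and cells map object ids to sets of (row, col) pixels. The result
    maps each offending nucleus id to the list of cell ids it overlaps.
    """
    violations = {}
    for n_id, n_pixels in nuclei.items():
        overlapping = [c_id for c_id, c_pixels in cells.items() if n_pixels & c_pixels]
        contained = len(overlapping) == 1 and n_pixels <= cells[overlapping[0]]
        if not contained:
            violations[n_id] = overlapping
    return violations


def cytoplasm(cell_pixels, nucleus_pixels):
    """Cytoplasmic compartment: cell pixels that are not nucleus pixels."""
    return cell_pixels - nucleus_pixels


def block(r0, c0, height, width):
    """Toy mask: a height x width rectangle with top-left corner (r0, c0)."""
    return frozenset((r, c) for r in range(r0, r0 + height) for c in range(c0, c0 + width))


# Toy layout: two adjacent cells, one nucleus inside cell A, one straddling A and B.
cells = {"A": block(0, 0, 6, 6), "B": block(0, 6, 6, 6)}
nuclei = {"n1": block(1, 1, 3, 3), "n2": block(2, 5, 2, 2)}

violations = check_consistency(nuclei, cells)
cyto_A = cytoplasm(cells["A"], nuclei["n1"])
print(violations, len(cyto_A))  # {'n2': ['A', 'B']} 27
```

A nucleus that overlaps a single cell but extends past its boundary is also reported, with a one-element list. Once every nucleus passes the check, a marker intensity can be attributed separately to the nucleus and cytoplasm pixel sets of each cell.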

**Fig. 6: Characteristics of UNSEG method.**

As briefly mentioned earlier, UNSEG does not impose a strict shape constraint on the segmented nuclei and allows them to be locally non-convex. Consequently, in complex tissue sections it is, on average, better at preserving true nucleus shape than Cellpose and Mesmer, whose segmentations are usually more rounded and, in regions of the tissue with high cell density, appear like Voronoi partitions of the tissue region. Figure 6b shows an example of pancreas tissue with elongated cells that deviate from round shapes. As can be seen, the ability of UNSEG to combine knowledge of global tissue architecture and local topology with a relaxed shape constraint allows it to better capture elongated nucleus morphology when compared to Cellpose and Mesmer. This ability is highly relevant in the context of the use case mentioned above, where users such as cancer biologists are studying the tumor microenvironment, which might include a diversity of cell shapes associated with cancer, immune, and stromal cell populations.

The runtime complexity of UNSEG is a function of the number of cells and not the image size. Specifically, UNSEG runtime scales approximately linearly with the number of segmented cells in the image. This translates to linear scaling with respect to image area if the spatial distribution of cells is approximately uniform. However, for sparsely populated images, UNSEG runtime will be significantly sub-linear with respect to image area. Figure 6c shows the linear dependence on the number of segmented cells and on the image area, under the assumption of uniform cell distribution. The results were generated using an acquired colon tissue microarray (TMA) spot with approximately uniform cell distribution. The segmentation results for the whole TMA spot are presented in Supplementary Fig. 13.

Discussion

The importance of segmenting cells and their nuclei has gained renewed prominence due to the advent of multiplexed imaging technologies that have significantly enhanced the depth of information that can potentially be extracted from samples in a cell-specific manner. However, tissue sections have complex cell organizations and, unlike in computer vision tasks, segmenting individual cells is a difficult challenge even for human experts, resulting in inter-observer discordance. Such discordance usually grows as the number of cells requiring annotation grows. This, in turn, affects the quality of the ground truth used to train supervised learning models, and is a bottleneck for generating high-quality training data. The unsupervised approach provides a complementary paradigm for segmenting complex tissue images without requiring training data. Unsupervised methods are also more adaptable to individual images of varying complexity. However, to the best of our knowledge, until now no method within the unsupervised paradigm had demonstrated performance approaching that of supervised learning methods, particularly those based on deep learning. As a consequence, none of its advantages were relevant. UNSEG, for the first time, to the best of our knowledge, demonstrates that unsupervised cell and nuclei segmentation can achieve accuracy on par with the current state-of-the-art methods in deep learning. It also introduces the perturbed watershed algorithm, a standalone algorithm that extends the ability of the classical watershed algorithm to correctly segment nucleus clusters. Perturbed watershed is applicable in all cases where the classical version can be used. Finally, like the generalist DL methods, UNSEG is not brittle, and is applicable to a range of tissue types, disease pathologies, nucleus and cell membrane markers, and multiplexed imaging modalities. It achieves accuracy on par with these methods, along with the added benefit of guaranteeing segmentation consistency between a cell and its nucleus, and being faithful to their morphology. These latter benefits can potentially be helpful in accurate sub-cellular localization of mRNA transcripts in microscopy images generated using well-established protocols for fluorescence in-situ hybridization⁵² and its multiplexed counterparts^{53,54,55,56,57} when combined with nucleus and membrane fluorescence markers.

Segmentation fundamentally involves learning features and image representations that help the algorithm identify individual cells and their nuclei. Deep learning models extract these features and representations in a supervised manner. Interestingly, UNSEG performance reveals that there is intrinsic information latent in the topology of cells and nuclei within the tissue context of an individual image that is equivalent to training on one million cells²⁶. Importantly, this information can be acquired adaptively for every tissue image. Therefore, it is conceivable to develop adaptive DL methods that perform sub-cellular segmentation of individual unlabeled tissue images by leveraging UNSEG as a label generator to initialize internally consistent cell and nucleus labels that a DL method can optimize and improve using self- and semi-supervised learning paradigms. For example, in a self-supervised learning framework UNSEG could be used to optimally initialize joint learning of neural network parameters and k-means-based segmentation of cells and nuclei⁵⁸. Another application could be in a semi-supervised setting, where a small portion of the image is annotated, while the remainder is unlabeled. Here, UNSEG could be used to provide pseudo-label estimates of cell and nucleus segmentation for the unlabeled data, which can then be used to refine the DL model trained on labeled data^{59,60}. Finally, UNSEG could be used in the setting of learning with noisy labels, where the UNSEG-generated segmentation masks are noisy labels on which robust DL models can be trained⁶¹.

UNSEG performs sub-cellular segmentation based on nucleus and cell membrane compartment markers. However, its framework does not impose any constraint on the number of markers that can be used. For example, in multi-nucleated cells, UNSEG can be modified to incorporate an additional marker specific to the nuclear membrane to coherently segment multiple overlapping nuclei belonging to the same cell. Supplementary Fig. 14 depicts an example of a multi-nucleated cell, with Lamin A/C (shown in green) marking the nucleus membranes. As depicted in this figure, the modification of UNSEG utilizes the specificity of the extra marker to segment the nuclei and associate them with the same cell.

UNSEG is an easy-to-use method for sub-cellular segmentation of complex tissue images using multiplexed imaging technologies. It only uses well-known and robust Python libraries that require minimal setup and is accessible to researchers with varying computational backgrounds. In total, UNSEG has thirteen parameters (see Methods, Supplementary Table 2, and the code implementation), all with clear meaning and interpretation, and assigned default values for images having a pixel pitch of 0.16 μm/pixel. Among them, minimal area and convexity threshold are the two primary parameters (see “Methods” and Supplementary Table 2) that have the strongest effect on UNSEG execution. They can be adjusted by the user to optimize segmentation performance for individual images including relatively large images as shown in Supplementary Fig. 13. However, as we demonstrated using the GIT, CODEX, Vectra and Zeiss datasets, a single setting of these two parameters can also be used across an entire cohort of images with the same pixel pitch, without noticeably compromising segmentation quality. The user can also define the expected cell size via the dilation radius (u₀) parameter. UNSEG uses this parameter only for cells without cell membrane marker expression. This parameter does not affect execution of the core UNSEG method. Two other parameters, disk radius (r₀) and the kernel-size list (n₀) can also be customized to more accurately account for local background noise and pixel pitch. Examples based on such customization are shown in Supplementary Figs. 4–10. The remaining parameters only marginally affect UNSEG segmentation quality, but if needed, can be used to further fine-tune UNSEG performance. The default values and reasonable adjustment ranges for all parameters are listed in Supplementary Table 2. 
UNSEG, therefore, is a flexible framework that can also be extended to include additional markers to enhance cell segmentation and to extract localized expression of individual markers across the tissue sample. Finally, we re-emphasize that unlike segmentation of objects in computer vision-based situational awareness tasks, segmenting cells and their nuclei, particularly in the context of tissue samples, often results in subjective ground truth. By being able to capture intrinsic, marker-specific topological structure of cell compartments, UNSEG offers opportunities to further improve current state-of-the-art deep learning methods. To aid in this task, we have also generated a GIT dataset of 75 tissue images from eight organs of the human gastrointestinal system, along with their corresponding nucleus and cell annotations independently generated by expert pathologists.

Methods

Generation of GIT dataset and other images

For the GIT dataset, formalin-fixed paraffin-embedded (FFPE) tissue microarray (TMA) slides were obtained from Pantomics (Pantomics, DID381). Tissue TMA samples for Supplementary Figs. 12–14 were obtained from the Department of Pathology at University of Pittsburgh Medical Center Presbyterian Hospital. The slides went through the cyclic immunofluorescence antigen retrieval protocol¹⁰. The corresponding figure slides were stained in cycles with 1:200 dilution of Anti-Sodium Potassium ATPase antibody (Abcam ab198367, clone EP1845Y), 1:100 dilution of P53 antibody (Abcam ab270192, clone SP5), 1:50 dilution of EZH2 antibody (CST 45638, clone D2C9), and 1:100 dilution of Lamin A/C antibody (CST 8617, clone 4C11) overnight at 4 °C in the dark, followed by staining with Hoechst 33342 (CST 4082S) for 10 min at room temperature in the dark. TMA images were acquired using a 40×, 0.95 NA objective on a Nikon Ti2E microscope.

Seventy-five high-quality 1000 × 1000 regions were identified and extracted from the TMA images and saved as tiff images. Expert pathologists independently annotated these images. The annotations were done using Cellthon, a Python-based cell annotation graphical user interface (GUI) we created using the Tkinter toolkit⁶². Together these 75 images and their cell and nucleus annotations comprise the GIT dataset.

UNSEG algorithm

Input image

The input to our algorithm is a two-channel image. An example is illustrated in the “input” panel of Fig. 1 and Supplementary Fig. 15, as well as in Fig. 3 and Supplementary Fig. 13. Channel one, depicted in blue, and channel two, shown in red, are associated with nucleus and cell membrane marker expression, respectively. Each channel of the image is independently scaled to the range [0, 1], such that I_i: Ω → [0, 1]. Here I_i is the normalized image intensity of the ith channel, Ω is the image domain, and i = 1, 2 indexes the two channels.
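The per-channel scaling can be illustrated with a minimal sketch. It assumes min-max scaling and a channel-first array layout, neither of which is specified in the text; the function name `normalize_channels` is hypothetical and not part of the UNSEG code base.

```python
import numpy as np


def normalize_channels(image):
    """Scale each channel of a (2, H, W) array independently to [0, 1].

    Assumption: min-max scaling; channel 0 is the nucleus marker and
    channel 1 is the cell membrane marker.
    """
    image = np.asarray(image, dtype=float)
    out = np.empty_like(image)
    for i, channel in enumerate(image):
        lo, hi = channel.min(), channel.max()
        # A constant channel carries no contrast; map it to zeros (assumption).
        out[i] = (channel - lo) / (hi - lo) if hi > lo else 0.0
    return out
```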

The algorithm performs nucleus and cell segmentation utilizing a Bayesian framework: the posterior probability estimates of nucleus and cell masks are obtained from their a priori and likelihood estimates that UNSEG computes from the normalized two-channel image. These posterior estimates are then used to obtain the final nucleus and cell segmentations. UNSEG implements this framework through four processing stages detailed below and illustrated in Fig. 1 and Supplementary Fig. 15.

Processing stage 1: computing a priori nucleus and cell membrane masks

In Stage 1, we compute a priori estimates of the image foreground for each channel. The estimates are computed at the global and local scale as described below.

A priori probability: Each channel, I_i(x, y), i = 1, 2, is first pre-processed using a combination of a Gaussian filter⁶³ and multi-level Otsu thresholding^{63,64,65}. The standard deviation of the Gaussian filter kernel, σ, is a parameter of the algorithm that allows the user to control the degree of smoothing. This and other algorithm parameters are summarized in Supplementary Table 2. Our default setting is σ = 3. A three-level Otsu threshold is next applied to the smoothed image, and the lowest threshold level is selected to obtain the initial estimate of the channel foreground.
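A minimal sketch of this pre-processing step, assuming scikit-image is used for both the Gaussian filter and the multi-level Otsu thresholds. The function name `initial_foreground` is hypothetical, and the actual UNSEG implementation may differ in details such as boundary handling.

```python
import numpy as np
from skimage.filters import gaussian, threshold_multiotsu


def initial_foreground(channel, sigma=3.0):
    """Initial foreground estimate for one normalized channel."""
    smoothed = gaussian(channel, sigma=sigma)
    # Three classes yield two thresholds, sorted in increasing order.
    thresholds = threshold_multiotsu(smoothed, classes=3)
    # Keep every pixel above the lowest threshold.
    return smoothed > thresholds[0]
```

Selecting the lowest of the two thresholds keeps dim but genuine marker signal in the foreground and discards only the darkest intensity class.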

We use the initial, per-channel foreground estimate to compute the cumulative distribution function (CDF), ${{{{\mathcal{F}}}}}_{i}$, of I_i using the intensity values, I_i(x, y), of pixels (x, y) within this estimate. Two examples of CDFs are presented in Supplementary Fig. 15. Using the monotonically non-decreasing property of the CDF, we map I_i to its cumulative probabilistic representation ${P}_{i}^{e}$, where ${P}_{i}^{e}(x,y)={{{{\mathcal{F}}}}}_{i}\left({I}_{i}(x,y)\right)$. We define ${P}_{i}^{e}(x,y)$ to be the a priori probability of the pixel being the nucleus (i = 1) or cell membrane (i = 2). We note that this definition quantifies the intuition that the stronger the marker intensity at a particular pixel, the higher its a priori probability. Examples of a priori probabilities for nuclei (${P}_{1}^{e}$) and cell membranes (${P}_{2}^{e}$) are presented in Fig. 1 and Supplementary Fig. 15.
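The CDF mapping can be sketched with an empirical CDF, as below. This is an illustration under the assumption that the CDF is estimated directly from the sorted foreground intensities (no histogram binning); the function name `a_priori_probability` is hypothetical.

```python
import numpy as np


def a_priori_probability(channel, foreground):
    """Map intensities to P^e via the empirical CDF of foreground pixels.

    Assumes `foreground` is a non-empty boolean mask.
    """
    values = np.sort(channel[foreground])
    # Number of foreground intensities <= I(x, y), for every pixel.
    ranks = np.searchsorted(values, channel, side="right")
    return ranks / values.size
```

Because the CDF is non-decreasing, brighter pixels never receive a lower a priori probability than dimmer ones.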

A priori global mask: We compute the a priori global mask ${M}_{i}^{g}(x,y)$, i = 1, 2 using ${P}_{i}^{e}$ and a simple filter called local mean suppression filter (LMSF) that we have developed. The foreground pixels (x, y) where ${M}_{i}^{g}(x,y)=1$ are designed to be a superset of the pixels belonging to the true nucleus (i = 1) and cell membrane (i = 2) compartments of cells in I_i(x, y), i = 1, 2. ${M}_{i}^{g}$, therefore, ensures that no pixels belonging to the cells are missed.

LMSF is designed to identify the valleys (or space) that exist between nuclei (or cell membranes) of closely located cells that nevertheless have some spillover marker expression, and are therefore, difficult to identify as background. We define LMSF as,

$${\hat{I}}_{i}(x,y)= \left\{\begin{array}{ll}0,\quad &{{{\bf{if}}}}\,\,\frac{{I}_{i}(x,y)}{{\bar{I}}_{i}(x,y)} < \, {t}_{0}\\ {I}_{i}(x,y),\quad &{{{\bf{otherwise}}}}\end{array}\right.,\,\, \\ {{{\rm{where}}}}\,\,{\bar{I}}_{i}(x,y)= \frac{1}{{\left(2{n}_{0}+1\right)}^{2}}{\sum}_{\xi =x-{n}_{0}}^{x+{n}_{0}}{\sum}_{\eta =y-{n}_{0}}^{y+{n}_{0}}{I}_{i}(\xi ,\eta ).$$

(1)

The above definition states that for a given pixel (x, y) ∈ Ω, LMSF replaces the original intensity value with 0 only if the ratio of the pixel intensity to the average intensity, computed locally around the pixel neighborhood, is below the threshold parameter t₀. The size of the kernel defining the neighborhood over which the local mean intensity is computed is parameterized by n₀. We set t₀ = 0.5. Consequently, all pixels with intensity value less than half the mean intensity in their respective neighborhoods are replaced with zeros, allowing us to identify valleys between cells. By varying n₀ we can identify valleys and gaps of different widths. UNSEG performs LMSF filtering for n₀ = 5, 10, 20, 40. If ${\hat{I}}_{i}(x,y)=0$ for any value of n₀, then the final pixel value is set to 0 and assigned to be background in the global mask, ${M}_{i}^{g}(x,y)$. Thus, LMSF allows us to capture valleys of different widths. The values of n₀ are user-defined and can be optimized according to complexity of individual images.
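Equation (1) and the multi-scale rule can be sketched as follows. The sketch assumes reflective padding at the image border, which the text does not specify, and rewrites the ratio test as `I < t0 * mean` to avoid dividing by zero. The function name `lmsf_background` is hypothetical.

```python
import numpy as np
from scipy.ndimage import uniform_filter


def lmsf_background(channel, t0=0.5, kernel_sizes=(5, 10, 20, 40)):
    """Return True where LMSF suppresses a pixel for at least one n0."""
    channel = np.asarray(channel, dtype=float)
    suppressed = np.zeros(channel.shape, dtype=bool)
    for n0 in kernel_sizes:
        # Mean over the (2*n0 + 1) x (2*n0 + 1) neighborhood of Eq. (1).
        local_mean = uniform_filter(channel, size=2 * n0 + 1, mode="reflect")
        suppressed |= channel < t0 * local_mean
    return suppressed
```

Pixels flagged here would be assigned to the background of the a priori global mask.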

We refine the global mask ${M}_{i}^{g}(x,y)$ by reassigning those pixels currently in the foreground that have a priori probability ${P}_{i}^{e}(x,y) \, < \, {p}_{i}$, i = 1, 2 to the background. This refinement is particularly useful for images with highly heterogeneous tissue with varying marker expression. The threshold value p_i should be small and by default is set to 0.01.

An example of a priori global mask is presented in Supplementary Fig. 15.

A priori local mask: Complementing ${M}_{i}^{g}(x,y)$, we next compute ${M}_{i}^{l}(x,y)$, the a priori local mask corresponding to image I_i(x, y). ${M}_{i}^{l}(x,y)$ captures the local peculiarities of the compartments (nuclei or cell membranes) associated with their local structure and morphology.

First, I_i(x, y) is filtered by applying a single iteration of gradient adaptive smoothing (GAS)^{45,66},

$${\tilde{I}}_{i}(x,y) = \frac{1}{{N}_{i}(x,y)}{\sum}_{\xi =-1}^{1}{\sum }_{\eta =-1}^{1}{I}_{i}(x+\xi ,y+\eta ){w}_{i}(x+\xi ,y+\eta ),\,\, \\ {{{\rm{where}}}}\quad {N}_{i}(x,y) = {\sum}_{\xi =-1}^{1}{\sum}_{\eta =-1}^{1}{w}_{i}(x+\xi ,y+\eta ),\\ {w}_{i}(x,y) = \exp \left[-\frac{{d}_{i}^{2}(x,y)}{2{k}_{0}^{2}}\right],\quad {d}_{i}(x,y)=\sqrt{{\left[\frac{\partial {I}_{i}(x,y)}{\partial x}\right]}^{2}+{\left[\frac{\partial {I}_{i}(x,y)}{\partial y}\right]}^{2}}.$$

(2)

The GAS-filtered image, ${\tilde{I}}_{i}(x,y)$, is a smoothed version of the original image, I_i(x, y), that preserves the local variations within and around cell nuclei and membranes. The local neighborhood is defined via a 3 × 3 kernel, w_i, that also performs variation-preserving smoothing. Here, variation is quantified via the local gradient, and the degree of smoothing is controlled by k₀, an algorithmic parameter whose default setting is 1.
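A single GAS iteration from Eq. (2) might be sketched as below. The gradient discretization is an assumption (central differences via `np.gradient`), as is the replicate padding at the border; the function name `gradient_adaptive_smoothing` is hypothetical.

```python
import numpy as np
from scipy.ndimage import uniform_filter


def gradient_adaptive_smoothing(channel, k0=1.0):
    """One iteration of gradient adaptive smoothing, following Eq. (2)."""
    channel = np.asarray(channel, dtype=float)
    gy, gx = np.gradient(channel)
    # Weights decay with the squared gradient magnitude d^2.
    w = np.exp(-(gx ** 2 + gy ** 2) / (2.0 * k0 ** 2))
    # 3 x 3 box means; the common factor 1/9 cancels in the ratio.
    numerator = uniform_filter(channel * w, size=3, mode="nearest")
    normalizer = uniform_filter(w, size=3, mode="nearest")
    return numerator / normalizer
```

Pixels on strong edges receive small weights, so they contribute little to the average of their neighbors and the edge is preserved.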

To obtain ${M}_{i}^{l}(x,y)$, a two-level local Otsu threshold is applied to ${\tilde{I}}_{i}(x,y)$ using a disk-shaped kernel whose radius r₀ is an algorithmic parameter with a default setting of 5 pixels. The Otsu output faithfully captures the local structure but is also noisy, particularly in image regions where no tissue is present and the gradients are computed on background noise. As ${M}_{i}^{g}(x,y)$ can accurately identify such background, the output of the local Otsu is restricted to where ${M}_{i}^{g}(x,y)=1$, resulting in the local foreground mask ${M}_{i}^{l}(x,y)$.
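The local thresholding step could look like the following sketch, which assumes scikit-image's rank-based local Otsu filter and an 8-bit quantization of the smoothed image (the rank filters operate on integer images). The function name `a_priori_local_mask` is hypothetical.

```python
import numpy as np
from skimage.filters.rank import otsu
from skimage.morphology import disk
from skimage.util import img_as_ubyte


def a_priori_local_mask(smoothed, global_mask, r0=5):
    """Local Otsu foreground restricted to the a priori global mask."""
    # Quantize to 8 bits for the rank filter (assumption of this sketch).
    img8 = img_as_ubyte(np.clip(smoothed, 0.0, 1.0))
    local_threshold = otsu(img8, disk(r0))
    return (img8 > local_threshold) & np.asarray(global_mask, dtype=bool)
```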

An example of a priori local mask is presented in Supplementary Fig. 15.

Processing stage 2: computing a posteriori nucleus and cell membrane masks

The a priori global and local binary masks are computed independently for both channels. As a result, a non-negligible probability exists for a pixel to be classified as being in both the nucleus and the cell membrane. This is particularly true in tissue regions with crowded cells, or when the nature of the tissue section is such that the cell membrane lies over the nucleus. This processing stage reconciles these overlaps and generates a posteriori global and local nucleus and cell membrane masks.

Contrast-based likelihood function: Human visual perception of cell membranes and nuclei is based on the inherent contrast between the two channels. Usually this contrast is visualized by imbuing the individual intensity-based channels with colors. Here, we adapt this notion to compute a visual contrast function based on nucleus and cell membrane marker-specific expression to quantify the likelihood of a pixel belonging to either the nucleus or the cell membrane. The first step computes the contrast function for each pixel in the a priori local mask as follows,

$${L}_{0}(x,y)=\left\{\begin{array}{ll}\frac{{I}_{2}(x,y)-{I}_{1}(x,y)}{{I}_{2}(x,y)+{I}_{1}(x,y)},\quad &{{{\bf{if}}}}\,\,{I}_{1}(x,y) \, > \, {i}_{1}\,\,{{{\bf{or}}}}\,\,{I}_{2}(x,y) \, > \, {i}_{2}\\ 0, \hfill \quad &{{{\bf{otherwise}}}}\end{array}\right.,$$

where ${i}_{i}={\min }_{(x,y)\in {{{\Omega }}}_{i}}{I}_{i}(x,y),\,{{{\Omega }}}_{i}=\left\{(x,y)\in {{\Omega }}\,| \,{M}_{i}^{l}(x,y)=1\right\},\,i=1,2$. The second step ensures that this function is consistent with the a priori global mask for each channel, resulting in the contrast-based likelihood function,

$${{{\bf{L}}}}(x,y)=\left\{\begin{array}{ll}{L}_{0}(x,y),\quad &{{{\bf{if}}}}\,\,{L}_{0}(x,y) \, < \, 0\,{{{\bf{and}}}}\,{M}_{1}^{g}(x,y)=1\,\,{{{\bf{or}}}}\,\,{L}_{0}(x,y) \, > \, 0\,{{{\bf{and}}}}\,{M}_{2}^{g}(x,y)=1\\ 0, \hfill \quad &{{{\bf{otherwise}}}} \hfill \end{array}\right..$$

(3)

L(x, y) is bounded in [−1, 1], with a contrast of −1 indicating a strong likelihood that the pixel (x, y) belongs to the nucleus, and a contrast of 1 indicating that the pixel most likely belongs to the cell membrane. Two examples of the likelihood function are presented in Fig. 1 and Supplementary Fig. 15.
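The two-step construction of the likelihood (the contrast L₀ followed by Eq. (3)) can be sketched as below. The sketch assumes boolean masks with non-empty local masks, and sets the contrast to 0 where both intensities vanish. The function and argument names are hypothetical.

```python
import numpy as np


def contrast_likelihood(I1, I2, Ml1, Ml2, Mg1, Mg2):
    """Contrast-based likelihood L in [-1, 1] from two normalized channels."""
    # Thresholds i_1 and i_2: minimum intensity inside each local mask.
    i1 = I1[Ml1].min()
    i2 = I2[Ml2].min()
    active = (I1 > i1) | (I2 > i2)
    total = I1 + I2
    L0 = np.zeros(total.shape, dtype=float)
    np.divide(I2 - I1, total, out=L0, where=active & (total > 0))
    # Eq. (3): keep the contrast only where the matching global mask agrees.
    keep = ((L0 < 0) & Mg1) | ((L0 > 0) & Mg2)
    return np.where(keep, L0, 0.0)
```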

A posteriori global mask: We combine the a priori probability with the contrast-based likelihood function to compute the a posteriori global mask M^g(x, y), such that M^g : Ω → {0, 1, 2}, where the labels 0, 1, and 2 correspond to the background, nuclei, and cell membranes, respectively. However, before performing this combination, we enhance ${P}_{i}^{e}(x,y)$ as follows,

$$\begin{array}{r}{P}_{i}^{s}(x,y)=\left\{\begin{array}{ll}1,\quad &{{{\bf{if}}}}\,\,{M}_{i}^{l}(x,y)=1\\ {P}_{i}^{e}(x,y),\quad &{{{\bf{otherwise}}}}\end{array}\right.,\end{array}$$

(4)

where i = 1, 2. This enhancement saturates the a priori probability, that is, sets ${P}_{i}^{s}(x,y)=1$, wherever the a priori local mask is 1. It ensures graceful performance of our algorithm in the global context when computing the a posteriori global mask M^g(x, y). We then compute the a posteriori global probability ${P}_{i}^{g}(x,y)$ via a ${P}_{i}^{s}(x,y)$-weighted convex combination of the likelihood and a priori belief,

$$\begin{array}{r}{P}_{1}^{g}(x,y)=\left\{\begin{array}{ll}{P}_{1}^{s}(x,y)+\left(1-{P}_{1}^{s}(x,y)\right)\,| {{{\bf{L}}}}(x,y)| ,\quad &{{{\bf{if}}}}\,\,{{{\bf{L}}}}(x,y) \, < \, 0\\ 0, \hfill \quad &{{{\bf{otherwise}}}}\end{array}\right.,\\ {P}_{2}^{g}(x,y)=\left\{\begin{array}{ll}{P}_{2}^{s}(x,y)+\left(1-{P}_{2}^{s}(x,y)\right)\,| {{{\bf{L}}}}(x,y)| ,\quad &{{{\bf{if}}}}\,\,{{{\bf{L}}}}(x,y) \, > \, 0\\ 0, \hfill \quad &{{{\bf{otherwise}}}}\end{array}\right..\end{array}$$

(5)

The final a posteriori global mask M^g(x, y) is obtained by applying either k-means clustering, with k = 3, or the argmax operation⁴⁵ to ${P}_{i}^{g}(x,y)$, i = 1, 2 (Eq. (5)). The default setting is argmax. We note that k-means (or argmax) is performed under the constraint that pixel (x, y) ∈ Ω is assigned to the common background if both global probabilities are zero, i.e., ${P}_{i}^{g}(x,y)=0$, i = 1, 2. Examples of the a posteriori global mask are presented in Fig. 1 and Supplementary Fig. 15.
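Equations (4) and (5), followed by the default argmax rule, can be sketched as follows. The k-means alternative is not shown, and the function name `posterior_global_mask` is hypothetical.

```python
import numpy as np


def posterior_global_mask(Pe1, Pe2, Ml1, Ml2, L):
    """Labels 0 (background), 1 (nucleus), 2 (membrane) via argmax."""
    # Eq. (4): saturate the a priori probability inside the local masks.
    Ps1 = np.where(Ml1, 1.0, Pe1)
    Ps2 = np.where(Ml2, 1.0, Pe2)
    # Eq. (5): combine with the likelihood magnitude.
    absL = np.abs(L)
    Pg1 = np.where(L < 0, Ps1 + (1.0 - Ps1) * absL, 0.0)
    Pg2 = np.where(L > 0, Ps2 + (1.0 - Ps2) * absL, 0.0)
    labels = np.where(Pg1 >= Pg2, 1, 2)
    # Constraint: both probabilities zero means common background.
    labels[(Pg1 == 0) & (Pg2 == 0)] = 0
    return labels, Pg1, Pg2
```

Since the two cases of Eq. (5) require opposite signs of L, at most one of the two probabilities is non-zero at any pixel, so in this sketch the argmax reduces to the sign of L wherever the winning probability is positive.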

A posteriori local mask: We define the a posteriori local mask, M^l: Ω → {0, 1, 2}, simply by restricting the a priori probability ${P}_{i}^{e}(x,y)$ to the local mask ${M}_{i}^{l}(x,y)$,

$${P}_{i}^{l}(x,y)=\left\{\begin{array}{ll}{P}_{i}^{e}(x,y),\quad &{{{\bf{if}}}}\,\,{M}_{i}^{l}(x,y)=1\\ 0,\quad &{{{\bf{otherwise}}}}\end{array}\right.,$$

(6)

where i = 1, 2. This restriction allows us to optimally capture the local a posteriori structure of the nuclei and cell membranes in a self-consistent manner.

Similar to computing the a posteriori global mask, we either apply k-means clustering or argmax (default setting) operation on ${P}_{i}^{l}(x,y)$, i = 1, 2 (Eq. (6)) to obtain the a posteriori local mask M^l(x, y). As mentioned above for the a posteriori global mask, the same constraint for the common background is also applied here. Examples are presented in Fig. 1 and Supplementary Fig. 15.

Processing stage 3: nucleus segmentation

The a posteriori global and local masks provide a semantic segmentation of image pixels comprising the tissue into nuclei and cell membranes. This and the following processing stage are designed to obtain every instance of an individual nucleus and its cell from the semantic segmentation of the tissue. Specifically, in this stage, we first segment all nuclei, and use them as a basis to identify their cells in the next stage. These steps ensure that the nucleus and cell segmentations are internally consistent, with the latter always bounding the former.

To segment nuclei we process the a posteriori global mask for the nuclei, ${{{{\bf{M}}}}}_{nuc}^{g}(x,y):= {{{{\bf{M}}}}}^{g}(x,y){| }_{{{{\rm{label}}}} = 1}$ with help from the a posteriori local mask for the cell membrane, ${{{{\bf{M}}}}}_{cell}^{l}(x,y):= {{{{\bf{M}}}}}^{l}(x,y){| }_{{{{\rm{label}}}} = 2}$. Particular examples of these two masks are presented in Supplementary Fig. 15.

Convexity analysis: Nucleus segmentation begins with convexity analysis of every connected component of ${{{{\bf{M}}}}}_{nuc}^{g}(x,y)$. As part of this analysis, we compute the area and the steepest concave point (SCP)³⁷ of every component. The SCP is the boundary point of the component with the largest deviation from its convex hull. The area allows us to filter out exceedingly small objects that are not nuclei, while the SCP helps us determine whether the component is a nucleus cluster (NC). A component is kept for further analysis only if its area exceeds a₀; otherwise it is removed. Each component that passes the area threshold is classified as an NC or a non-NC depending on whether the deviation of its SCP from the convex hull is above or below the threshold d₀. Both a₀ (default 20 pixels) and d₀ (default 4 pixels) are primary algorithm parameters (Supplementary Table 2). The non-NC components are statistically analyzed to obtain the initial segmentation for all individual nuclei, along with a small component (SC) list comprising small convex objects that we are less confident about being nuclei.

Convexity analysis of ${{{{\bf{M}}}}}_{nuc}^{g}(x,y)$, is illustrated in Supplementary Fig. 15.
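The area and SCP tests can be illustrated with the sketch below. It uses a proxy for the SCP deviation that is an assumption of this sketch and not necessarily the measure used in UNSEG: the largest distance from any component boundary pixel to the outside of the component's convex hull. The function name `classify_components` is hypothetical.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt, label
from skimage.morphology import convex_hull_image


def classify_components(nucleus_mask, a0=20, d0=4):
    """Tag each connected component as 'removed', 'NC' or 'non-NC'."""
    labels, n = label(nucleus_mask)
    kinds = {}
    for k in range(1, n + 1):
        comp = labels == k
        if comp.sum() <= a0:
            kinds[k] = "removed"
            continue
        hull = convex_hull_image(comp)
        # Depth inside the hull; padding guarantees an outside pixel exists.
        depth = distance_transform_edt(np.pad(hull, 1))[1:-1, 1:-1]
        boundary = comp & ~binary_erosion(comp)
        # Proxy for the SCP: the boundary pixel deepest inside the hull.
        kinds[k] = "NC" if depth[boundary].max() > d0 else "non-NC"
    return labels, kinds
```

For a convex component every boundary pixel lies close to the hull edge, so the proxy stays small; a concavity between two merged nuclei places boundary pixels deep inside the hull.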

Perturbed watershed and virtual cuts: We process the NC components using perturbed watershed (PW) and virtual cut (VC) algorithms that we have developed. Their goal is to partition the NC into individual nuclei.

PW steps are illustrated in Fig. 2. Briefly, the NC component mask (Fig. 2d) is first modified by ${{{{\bf{M}}}}}_{cell}^{l}$ (Fig. 2e). Specifically, cuts are introduced in the NC component mask wherever ${{{{\bf{M}}}}}_{cell}^{l}$ indicates local cell membrane within the region spatially corresponding to the NC component (Fig. 2f). We next apply the distance transform (DT) to the modified NC component and use the resulting DT image (Fig. 2g) to compute d_avr, the average of all non-zero DT values in the DT image. d_avr is used to threshold the distance transform to identify n sub-regions with large DT values, which indicate the interiors of the putative nuclei making up the NC (Fig. 2h). Within every sub-region we identify the pixel with the maximal distance-transform value as the watershed seed point (marker) for that sub-region. We perform watershed segmentation of the NC based on these n seed points to obtain our initial estimate of the nuclei comprising the NC (Fig. 2i). If these estimates are correct, then perturbing the markers does not affect segmentation of the NC. However, if the estimates are incorrect, then the sub-region estimates are not stable under perturbation. We exploit this perturbation-based stability to identify the correct segmentation of the NC. Specifically, we perturb the marker locations and recompute the watershed-based segmentation. The perturbations are implemented by shifting each watershed marker location sequentially in the horizontal and vertical directions by ± ⌊d_avr⌋, resulting in four perturbations: (x_j ± ⌊d_avr⌋, y_j) and (x_j, y_j ± ⌊d_avr⌋) with j = 1, …, n (Fig. 2j–m). Here, ⌊ ⋅ ⌋ stands for the floor function. If, in any of the four scenarios, any of the n putative nuclei collapses to a point-like object with an area of only a few pixels (Fig. 2j, l, m), we deem it unstable, remove its seed point from the list of n seed points, and recompute the watershed-based segmentation with the remaining seed points (Fig. 2n). If the segmentation results remain stable for all four shifts, then the estimate is considered correct. To ensure that each of the segmented sub-regions is indeed a nucleus and not a smaller NC, we recursively perform convexity analysis and PW on each sub-region. An example of this recursion is illustrated in Supplementary Fig. 16.
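A single, non-recursive pass of the perturbed watershed can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: `min_area` stands in for "a few pixels", shifted markers are clipped to the image bounds, and if every seed is flagged as unstable the sketch falls back to the original seeds. The function names are hypothetical.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, label
from skimage.segmentation import watershed


def _watershed_from_seeds(dt, mask, seeds):
    """Marker-based watershed on the inverted distance transform."""
    markers = np.zeros(mask.shape, dtype=np.int32)
    for j, (y, x) in enumerate(seeds, start=1):
        markers[y, x] = j
    return watershed(-dt, markers, mask=mask)


def perturbed_watershed(nc_mask, membrane_mask=None, min_area=5):
    """Split one nucleus cluster, dropping seeds unstable under shifts."""
    mask = np.asarray(nc_mask, dtype=bool)
    if membrane_mask is not None:
        # Cut the cluster wherever local membrane signal is present.
        mask = mask & ~np.asarray(membrane_mask, dtype=bool)
    dt = distance_transform_edt(mask)
    d_avr = dt[dt > 0].mean()

    # One seed per thresholded sub-region: its maximal-DT pixel.
    cores, n = label(dt > d_avr)
    seeds = []
    for k in range(1, n + 1):
        core_dt = np.where(cores == k, dt, -1.0)
        seeds.append(np.unravel_index(np.argmax(core_dt), dt.shape))

    shift = int(np.floor(d_avr))
    h, w = mask.shape
    unstable = set()
    for dy, dx in ((0, shift), (0, -shift), (shift, 0), (-shift, 0)):
        moved = [(min(max(y + dy, 0), h - 1), min(max(x + dx, 0), w - 1))
                 for y, x in seeds]
        seg = _watershed_from_seeds(dt, mask, moved)
        areas = np.bincount(seg.ravel(), minlength=n + 1)
        unstable.update(j for j in range(1, n + 1) if areas[j] <= min_area)

    stable = [s for j, s in enumerate(seeds, start=1) if j not in unstable]
    return _watershed_from_seeds(dt, mask, stable or seeds)
```

The recursion described in the text would re-run convexity analysis and this function on each returned sub-region.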

The above recursive segmentation of an NC can sometimes result in a specific pathological situation, where the convexity analysis identifies a sub-region as an NC, but PW does not segment it into sub-regions. For this specific scenario, we have developed the virtual cuts (VC) approach, where a virtual cut is defined through the SCP of the NC component mask to identify virtual sub-regions. We use “virtual” to emphasize that this cut and the resulting sub-regions are only used to identify their respective watershed seed points, based on which we perform the actual segmentation. The hypothesis driving the VC method builds on the idea behind the PW method: although the locations of the respective watershed markers identified using virtual cuts might not exactly coincide with their true locations, they do represent a perturbed version of the true locations. Thus, they yield stable and accurate segmentation into the two sub-regions. These sub-regions follow the same recursive logic of the PW method detailed above. The VC method is illustrated in Supplementary Fig. 15.

Finally, we process the small components in the SC list in a context-dependent manner, with small isolated SCs included in the final nucleus segmentation result. Multiple examples of nucleus segmentation are presented in Figs. 1, 4, and 6 as well as in Supplementary Figs. 7–10, 12, 13, and 15, where the contours of nuclei are outlined in white.

Processing stage 4: cell segmentation

We segment cells via the joint use of a posteriori global mask for the cell membranes ${{{{\bf{M}}}}}_{cell}^{g}(x,y):= {{{{\bf{M}}}}}^{g}(x,y){| }_{{{{\rm{label}}}} = 2}$ and the segmented nuclei.

We begin by initializing the segmented cell mask as the segmented nucleus mask. The cell mask is then expanded until its boundary coincides with that of the closest cell membrane around it. It is possible that the cell membrane marker used for cell segmentation is not expressed by all cells. Therefore, for cells without any cell membrane marker expression, the nucleus mask is morphologically dilated by a small amount u₀ (1–10 pixels) to obtain an estimate of the cell membrane. The dilation radius u₀, with a default value of 9 pixels, is another algorithm parameter (Supplementary Table 2). In the opposite scenario, where due to the nature of the tissue section a cell is present with a membrane but without a nucleus, we utilize ${{{{\bf{M}}}}}_{cell}^{g}$. Specifically, the skeleton of ${{{{\bf{M}}}}}_{cell}^{g}$ is computed and subtracted from ${{{{\bf{M}}}}}_{cell}^{g}$ itself. This operation naturally reveals the cell membrane contour within ${{{{\bf{M}}}}}_{cell}^{g}$, which we identify by computing the Euler number of each connected component. When the Euler number is zero and the area of the connected component exceeds half of the average area of nuclei, the connected component is identified as the segmented cell. Examples of cell segmentation are presented in Figs. 1, 4, and 6 as well as in Supplementary Figs. 4–10, 12, 13, and 15, where the contours of the segmented cells are outlined in green.
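The fallback for cells without membrane marker expression can be sketched as a label-preserving dilation. The sketch assumes a Euclidean dilation in which each background pixel within u₀ of a nucleus takes the label of its nearest nucleus, so neighboring cells cannot overlap; the membrane-guided expansion is not shown. The function name `dilate_nuclei` is hypothetical.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def dilate_nuclei(nucleus_labels, u0=9):
    """Grow each labeled nucleus by at most u0 pixels without overlaps."""
    nucleus_labels = np.asarray(nucleus_labels)
    # Distance to, and index of, the nearest nucleus pixel for every pixel.
    dist, (iy, ix) = distance_transform_edt(nucleus_labels == 0,
                                            return_indices=True)
    cells = nucleus_labels[iy, ix]
    cells[dist > u0] = 0
    return cells
```

Every nucleus pixel keeps its own label, so each estimated cell always contains its nucleus.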

Performance evaluation

To evaluate UNSEG performance and compare it with Cellpose²⁵ and Mesmer²⁶ results, we used the F₁ score (or Dice coefficient) as the accuracy metric⁴⁶. To compute the F₁ score, we first estimated the true positive (TP), false positive (FP) and false negative (FN) values by comparing the predicted segmentation with the expert annotated ground truth and using intersection over union (IoU) as the threshold value⁴⁶. The IoU threshold, ranging from 0 to 1, indicates how much of an overlap between the predicted segmentation and ground truth is considered a match, which is then used to estimate the number of TP, FP, and FN segmented objects. The F₁ score is then given by

$${F}_{1}=\frac{2\,TP}{2\,TP+FP+FN}.$$

(7)

Varying the IoU threshold from 0 to 1 gives the corresponding F₁ curve as a function of the IoU threshold.
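The object-level F₁ score of Eq. (7) at a single IoU threshold can be sketched as below. Two choices here are assumptions of the sketch: matching is greedy in label order (it is one-to-one by construction only for thresholds above 0.5), and a match requires IoU greater than or equal to the threshold. The function name `f1_at_iou` is hypothetical.

```python
import numpy as np


def f1_at_iou(gt, pred, threshold=0.5):
    """F1 between two label images (0 = background) at one IoU threshold."""
    gt_ids = np.unique(gt[gt > 0])
    pred_ids = np.unique(pred[pred > 0])
    matched = set()
    tp = 0
    for g in gt_ids:
        g_mask = gt == g
        # Only predicted objects overlapping this ground-truth object.
        for p in np.unique(pred[g_mask]):
            if p == 0 or p in matched:
                continue
            p_mask = pred == p
            iou = (g_mask & p_mask).sum() / (g_mask | p_mask).sum()
            if iou >= threshold:
                matched.add(p)
                tp += 1
                break
    fp = len(pred_ids) - tp
    fn = len(gt_ids) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0
```

Evaluating this function over a grid of thresholds, for example `[f1_at_iou(gt, pred, t) for t in np.linspace(0, 1, 21)]`, traces out an F₁ curve of the kind described above.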

Statistics and reproducibility

The statistical robustness and reproducibility of UNSEG have been exhaustively tested on the GIT dataset and three publicly available datasets^{47,48}.

Data availability

The gastrointestinal tissue (GIT) dataset is available at https://doi.org/10.7303/syn61804540.

Code availability

The Python implementation⁶⁷ of UNSEG is available at https://github.com/uttamLab/UNSEG.git. UNSEG is also available at https://doi.org/10.5281/zenodo.13117814. Both Linux and Windows versions of Cellthon are publicly available at https://github.com/uttamLab/cellthon.git.

References

Wählby, C., Erlandsson, F., Bengtsson, E. & Zetterberg, A. Sequential immunofluorescence staining and image analysis for detection of large numbers of antigens in individual cell nuclei. Cytometry: J. Int. Soc. Anal. Cytol. 47, 32–41 (2002).
Schubert, W. et al. Analyzing proteome topology and function by automated multidimensional fluorescence microscopy. Nat. Biotechnol. 24, 1270–1278 (2006).
Zrazhevskiy, P. & Gao, X. Quantum dot imaging platform for single-cell molecular profiling. Nat. Commun. 4, 1619 (2013).
Gerdes, M. J. et al. Highly multiplexed single-cell analysis of formalin-fixed, paraffin-embedded cancer tissue. Proc. Natl. Acad. Sci. USA 110, 11982–11987 (2013).
Giesen, C. et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Methods 11, 417–422 (2014).
Angelo, M. et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 20, 436–442 (2014).
Lin, J.-R., Fallahi-Sichani, M. & Sorger, P. K. Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method. Nat. Commun. 6, 8390 (2015).
Keren, L. et al. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell 174, 1373–1387 (2018).
Gut, G., Herrmann, M. D. & Pelkmans, L. Multiplexed protein maps link subcellular organization to cellular states. Science 361, eaar7042 (2018).
Lin, J.-R. et al. Highly multiplexed immunofluorescence imaging of human tissues and tumors using t-cycif and conventional optical microscopes. elife 7, e31657 (2018).
Goltsev, Y. et al. Deep profiling of mouse splenic architecture with codex multiplexed imaging. Cell 174, 968–981 (2018).
Radtke, A. J. et al. Ibex: a versatile multiplex optical imaging approach for deep phenotyping and spatial analysis of cells in complex tissues. Proc. Natl. Acad. Sci. USA 117, 33455–33465 (2020).
Schürch, C. M. et al. Coordinated cellular neighborhoods orchestrate antitumoral immunity at the colorectal cancer invasive front. Cell 182, 1341–1359 (2020).
Phillips, D. et al. Highly multiplexed phenotyping of immunoregulatory proteins in the tumor microenvironment by codex tissue imaging. Front. Immunol. 12, 687673 (2021).
Kinkhabwala, A. et al. Macsima imaging cyclic staining (mics) technology reveals combinatorial target pairs for car t cell treatment of solid tumors. Sci. Rep. 12, 1911 (2022).
Hickey, J. W. et al. Spatial mapping of protein composition and tissue organization: a primer for multiplexed antibody-based imaging. Nat. Methods 19, 284–295 (2022).
Article PubMed Google Scholar
Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1–9 (2012).
Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18, 234–241 (Springer, 2015).
Van Valen, D. A. et al. Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments. PLoS Comput. Biol. 12, e1005177 (2016).
Al-Kofahi, Y., Zaltsman, A. B., Graves, R., Marshall, W. A. & Rusu, M. A deep learning-based algorithm for 2-D cell segmentation in microscopy images. BMC Bioinforma. 19, 1–11 (2018).
Schmidt, U., Weigert, M., Broaddus, C. & Myers, E. W. Cell detection with star-convex polygons. In International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer International Publishing, 2018).
Yang, L. et al. NuSeT: a deep learning tool for reliably separating and analyzing crowded cells. PLoS Comput. Biol. 16, e1008193 (2019).
Hollandi, R. et al. nucleAIzer: a parameter-free deep learning framework for nucleus segmentation using image style transfer. Cell Syst. 10, 453–458 (2020).
Stringer, C., Wang, T., Michaelos, M. & Pachitariu, M. Cellpose: a generalist algorithm for cellular segmentation. Nat. Methods 18, 100–106 (2020).
Greenwald, N. F. et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat. Biotechnol. 40, 555–565 (2021).
Yapp, C. et al. UnMICST: deep learning with real augmentation for robust segmentation of highly multiplexed images of human tissues. Commun. Biol. 5, 1263 (2022).
Lee, M. Y. et al. CellSeg: a robust, pre-trained nucleus segmentation and pixel quantification software for highly multiplexed fluorescence images. BMC Bioinforma. 23, 46 (2022).
Blazek, P. J. & Lin, M. M. Explainable neural networks that simulate reasoning. Nat. Comput. Sci. 1, 607–618 (2021).
Stringer, C. & Pachitariu, M. Cellpose 2.0: how to train your own model. Nat. Methods 19, 1634–1641 (2022).
Lin, G. et al. A hybrid 3D watershed algorithm incorporating gradient cues and object models for automatic segmentation of nuclei in confocal image stacks. Cytom. Part A 56, 23–36 (2003).
Lin, G. et al. Hierarchical, model based merging of multiple fragments for improved three dimensional segmentation of nuclei. Cytom. Part A 63A, 20–33 (2005).
Li, G. et al. 3D cell nuclei segmentation based on gradient flow tracking. BMC Cell Biol. 8, 40, 1–10 (2007).
Li, G. et al. Segmentation of touching cell nuclei using gradient flow tracking. J. Microsc. 231, 47–58 (2008).
Coelho, L. P., Shariff, A. & Murphy, R. F. Nuclear segmentation in microscope cell images: A hand-segmented dataset and comparison of algorithms. In 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro 518–521 (IEEE, 2009).
Lou, X., Koethe, U., Wittbrodt, J. & Hamprecht, F. A. Learning to segment dense cell nuclei with shape prior. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, 1012–1018 (IEEE, 2012).
Qi, J. et al. Drosophila eye nuclei segmentation based on graph cut and convex shape prior. In 2013 IEEE International Conference on Image Processing, 670–674 (IEEE, 2013).
Stoeger, T., Battich, N., Herrmann, M. D., Yakimovich, Y. & Pelkmans, L. Computer vision for image-based transcriptomics. Methods 85, 44–53 (2015).
Isack, H., Gorelick, L., Ng, K., Veksler, O. & Boykov, Y. in Lecture Notes in Computer Science (eds Ferrari, V., Hebert, M., Sminchisescu, C. & Weiss, Y.) 38–54 (Springer, 2018).
Kostrykin, L., Schnörr, C. & Rohr, K. Globally optimal segmentation of cell nuclei in fluorescence microscopy images using shape and intensity information. Med. Image Anal. 58, 101536 (2019).
Winter, M. R. et al. Separating touching cells using pixel replicated elliptical shape models. IEEE Trans. Med. Imaging 38, 883–893 (2019).
Xie, X. et al. Instance-aware self-supervised learning for nuclei segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part V 23, 341–350 (Springer, 2020).
Wolf, S., Lalit, M., McDole, K. & Funke, J. Unsupervised learning of object-centric embeddings for cell instance segmentation in microscopy images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 21263–21272 (IEEE, 2023).
Gonzalez, R. C. & Woods, R. E. Digital Image Processing (Pearson, 2018).
Toennies, K. D. Guide to Medical Image Analysis (Springer, 2017).
Caicedo, J. C. et al. Evaluation of deep learning strategies for nucleus segmentation in fluorescence images. Cytom. Part A 95, 952–965 (2019).
Aleynick, N. et al. Cross-platform dataset of multiplex fluorescent cellular object image annotations. Sci. Data https://api.semanticscholar.org/CorpusID:257986696 (2023).
Aleynick, N. et al. Cross-platform dataset of multiplex fluorescent cellular object image annotations [dataset]. Synapse https://doi.org/10.7303/SYN27624812 (2023).
Human biomolecular atlas program HBM439.HFGX.695. https://portal.hubmapconsortium.org/browse/dataset/54eec389e909636837ccb11958035552 (2023).
Maki, C. G. in p53. 117–126 (Springer, 2010).
Huang, J. et al. The noncanonical role of EZH2 in cancer. Cancer Sci. 112, 1376–1382 (2021).
O’Connor, C. Fluorescence in situ hybridization. Nat. Methods 2, 237–238 (2005).
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
Moffitt, J. R. et al. High-performance multiplexed fluorescence in situ hybridization in culture and tissue with matrix imprinting and clearing. Proc. Natl. Acad. Sci. USA 113, 14456–14461 (2016).
Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH+. Nature 568, 235–239 (2019).
Mateo, L. J. et al. Visualizing DNA folding and RNA in embryos at single-cell resolution. Nature 568, 49–54 (2019).
Xia, C., Fan, J., Emanuel, G., Hao, J. & Zhuang, X. Spatial transcriptome profiling by MERFISH reveals subcellular RNA compartmentalization and cell cycle-dependent gene expression. Proc. Natl. Acad. Sci. USA 116, 19490–19499 (2019).
Caron, M., Bojanowski, P., Joulin, A. & Douze, M. in Lecture Notes in Computer Science. (eds Ferrari, V., Hebert, M., Sminchisescu, C. & Weiss, Y.) 139–156 (Springer, 2018).
Lucas, T., Weinzaepfel, P. & Rogez, G. Barely-supervised learning: semi-supervised learning with very few labeled images. Proc. AAAI Conf. Artif. Intell. 36, 1881–1889 (2022).
Arazo, E., Ortego, D., Albert, P., O’Connor, N. E. & McGuinness, K. Pseudo-labeling and confirmation bias in deep semi-supervised learning. In 2020 International Joint Conference on Neural Networks (IJCNN) 1–8 (IEEE, 2020).
Zheltonozhskii, E., Baskin, C., Mendelson, A., Bronstein, A. M. & Litany, O. Contrast to divide: self-supervised pre-training for learning with noisy labels. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 1657–1667 (IEEE, 2022).
Van Rossum, G. The Python Library Reference, release 3.8.2 (Python Software Foundation, 2020).
Chityala, R. & Pudipeddi, S. Image Processing and Acquisition Using Python (Chapman and Hall/CRC, 2020).
Otsu, N. A threshold selection method from gray level histograms. IEEE Trans. Syst. Man, Cybern. 9, 62–66 (1979).
Liao, P.-S., Chen, T.-S. & Chung, P. C. A fast algorithm for multilevel thresholding. J. Inf. Sci. Eng. 17, 713–727 (2001).
Saint-Marc, P., Chen, J.-S. & Medioni, G. Adaptive smoothing: a general tool for early vision. IEEE Trans. Pattern Anal. Mach. Intell. 13, 514–529 (1991).
Kochetov, B. & Uttam, S. UNSEG: unsupervised segmentation of cells and their nuclei in complex tissue samples. zenodo. https://doi.org/10.5281/zenodo.13117814 (2024).

Acknowledgements

This project was supported in part by the National Institutes of Health through Grant Number UL1TR001857.

Author information

Authors and Affiliations

Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
Bogdan Kochetov, Rebecca Raphael & Shikhar Uttam
UPMC Hillman Cancer Center, Pittsburgh, PA, USA
Bogdan Kochetov, Rebecca Raphael, Brian J. Leibowitz, Randall E. Brand, Brenda Diergaarde & Shikhar Uttam
Department of Pathology, University of Pittsburgh, Pittsburgh, PA, USA
Phoenix D. Bell, Paulo S. Garcia, Karen Schoedel & Aatur Singhi
Pathology and Laboratory Medicine Institute, Cleveland Clinic, Cleveland, OH, USA
Phoenix D. Bell
University Hospitals Cleveland Medical Center, Case Western Reserve University, Cleveland, OH, USA
Akram S. Shalaby
Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
Benjamin Raymond
Department of Radiation Oncology, University of Pittsburgh, Pittsburgh, PA, USA
Brian J. Leibowitz
Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
Rhonda M. Brand, Randall E. Brand & Robert E. Schoen
Magee Womens Research Institute, Pittsburgh, PA, USA
Rhonda M. Brand
Department of Medicine, University of Southern California, Los Angeles, CA, USA
Jian Yu & Lin Zhang
Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA
Brenda Diergaarde

Contributions

S.U. conceived the idea. B.K. and S.U. designed the overall UNSEG algorithm and planned the key steps. B.K. wrote the UNSEG code and performed the analysis. R.R. performed the immunofluorescence labeling and acquired the imaging data. B.R., B.K., R.R., and S.U. identified the 75 images for the GIT dataset. P.D.B., P.S.G., A.S.S., and A.S. provided expert annotation of the GIT dataset images. B.J.L., L.S., R.M.B., R.E.B., J.Y., L.Z., B.D., R.E.S., and A.S. helped in tissue sample acquisition, assay optimization, and data generation. B.K. and S.U. wrote the manuscript. All authors reviewed and edited the manuscript before submission.

Corresponding author

Correspondence to Shikhar Uttam.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: Aylin Bircan and Laura Rodriguez Perez. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kochetov, B., Bell, P.D., Garcia, P.S. et al. UNSEG: unsupervised segmentation of cells and their nuclei in complex tissue samples. Commun Biol 7, 1062 (2024). https://doi.org/10.1038/s42003-024-06714-4

Received: 23 April 2024
Accepted: 09 August 2024
Published: 30 August 2024
DOI: https://doi.org/10.1038/s42003-024-06714-4
