Fig. 3: MARQO classification with validation.

a, Following upstream analysis and the generation of a metadata CSV for each sample, MARQO classifies cells using either the default k-means algorithm or an alternative method, such as GMM clustering or third-party tools. MARQO assigns cells to a predefined number of clusters, which users can inspect interactively in the MARQO application. Users may further subdivide a selected cluster into subclusters by rapidly reapplying k-means clustering to it, enabling detailed user-guided inspection of each cluster or subcluster. b, Classification was validated with four distinct markers: FOXP3, a nuclear marker; CD3, a T cell membrane marker with a circular shape; CD68, a macrophage membrane marker with an asymmetrical shape; and PanCK, a cytoplasmic tumour marker. Scale bar, 50 µm. c, Precision-recall curves for CD3, FOXP3, CD68 and PanCK compare MARQO classification (predictive model) with manual positivity annotations made by a pathologist. A random model is included as a baseline, representing the theoretical performance of random label assignment drawn from a uniform distribution. d, Scatter plots with regression lines show the correlation between the total number of cells classified as positive manually by a pathologist and by MARQO across 34 distinct ROIs for the four selected markers. Each point represents an ROI from either biopsy or resection tissue, categorized by the pathologist as having good- or poor-quality staining. ROIs were chosen to represent varied staining quality and regions with tissue damage. Non-parametric Spearman’s correlation coefficients (r values) and linear regression analyses were used for markers CD3 (P < 0.0001), FOXP3 (P = 0.0002), CD68 (P = 0.2333) and PanCK (P = 0.4167).
e, Stacked bar graphs depict a representative ROI for each selected marker, illustrating the user-based cluster classification per sample and comparing it directly with manual annotations provided by the pathologist for the same ROI. Colours within bars indicate the proportions of cells classified as positive or negative. Dashed lines separate clusters selected as positive (left) or negative (right) by MARQO. Total cell counts per cluster are annotated above each bar. CD68 exemplifies user-driven reclassification into subclusters. Corresponding ROIs are available in Supplementary Fig. 5.
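
The workflow in panels a, c and e can be illustrated with a minimal sketch. It is not MARQO's implementation: it assumes a single intensity value per cell (MARQO's metadata CSV presumably carries more features), uses a plain one-dimensional k-means written with the Python standard library, and all function names, variable names and data values (`kmeans_1d`, `subcluster`, `precision_recall`, `intensities`, `manual`) are hypothetical. It shows an initial clustering, a user-style subdivision of one cluster, and precision and recall measured against manual annotations before and after a subcluster is reclassified as positive.

```python
from statistics import mean


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalar values; returns (labels, centres)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the k centres across the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(mean(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def subcluster(values, labels, target, k):
    """Re-run k-means only on cells in cluster `target`; returns {cell index: sub-label}."""
    idx = [i for i, lab in enumerate(labels) if lab == target]
    sub_labels, _ = kmeans_1d([values[i] for i in idx], k)
    return dict(zip(idx, sub_labels))


def precision_recall(predicted, truth):
    """Precision and recall of boolean predictions against boolean ground truth."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical per-cell marker intensities and manual positivity annotations.
intensities = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95, 0.5, 0.55]
manual = [False, False, False, True, True, True, True, False]

labels, centres = kmeans_1d(intensities, k=2)

# Treat the brightest cluster as positive (in MARQO this choice is made by the user).
positive_cluster = centres.index(max(centres))
predicted = [lab == positive_cluster for lab in labels]
before = precision_recall(predicted, manual)

# Subdivide the dimmer cluster and reclassify its brighter subcluster as positive.
negative_cluster = centres.index(min(centres))
sub = subcluster(intensities, labels, target=negative_cluster, k=2)
refined = [p or sub.get(i) == 1 for i, p in enumerate(predicted)]
after = precision_recall(refined, manual)

print(before)  # (1.0, 0.75)
print(after)   # (0.8, 1.0)
```

In this toy data, reclassifying the intermediate-intensity subcluster trades some precision for recall. Sweeping such a decision across clusters or thresholds yields the kind of precision-recall curve shown in panel c.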