Fig. 4: Differential expression and tumor specificity profiles of individual protein markers.

a Pooled differential expression of all proteins with at least 20% coverage across the indicated subset of samples (from left to right: 6499, 6347 and 6343 proteins); q-values are Benjamini-Hochberg-corrected Mann-Whitney U p-values; the dashed horizontal line is q = 0.05, the vertical lines are absolute log2 fold change (FC) = 0.5; the direction of differential expression is indicated by each subplot title; the top 10 proteins by FC and q-value are annotated; selected proteins referenced in the text are in bold font. b Exemplary plot of an individualized null distribution (blue) and the corresponding tumor profile (red); PSM: peptide spectrum match; the null distribution was obtained through pairwise comparisons of the corresponding healthy mucosa with the other healthy mucosae of the same TMT set; the tumor profile was then compared to a PSM-filtered subset of this distribution; the dashed line marks the quantiles used as significance thresholds (p < 0.05) for the tumor profile (see Methods for details); these profiles were calculated for all 192 paired samples and were used to count the expression rates in subplot c. c The top 75 proteins with the highest tumor-specific overexpression across all samples and their expression rates across stages and subtypes (see text for details; asterisk: referenced in the text); red and blue are the rates of significant over- and underexpression, respectively. d Variances of the group-wise overexpression rates for each protein from c, with the samples grouped by histopathological stages (pTNM) or proteomic (PAULA) subtypes; n = 75 proteins; violin plot and boxplot, respectively; whiskers show the 95% interval, box is the interquartile range, horizontal line is the median, horizontal bar is the mean (also in subplots e, f); p-value from a two-sided Mann-Whitney U test (also in subplot e).
e Mean fold change (FC), analogous to subplot d; n = 75 proteins. f Exemplary expression profiles of the high-ranking proteins MAL2 and TMSB10 as well as the antibody-drug conjugate targets EpCAM and Nectin-4; n = 434/434/419/389 biologically independent samples, respectively. Source data are provided as a Source Data file.
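
The statistics underlying the volcano plots in subplot a can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of a mean difference of log2 intensities as log2 FC, the encoding of missing values as NaN, and the requirement of at least one quantified value per group are assumptions made for illustration only. The coverage cutoff (20%), the two-sided Mann-Whitney U test, the Benjamini-Hochberg correction and the thresholds (q = 0.05, absolute log2 FC = 0.5) follow the caption.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q


def differential_expression(group_a, group_b, min_coverage=0.2,
                            q_cutoff=0.05, fc_cutoff=0.5):
    """Compare two (samples x proteins) log2 intensity matrices.

    Missing values are assumed to be NaN (hypothetical input format).
    Returns the retained protein indices, log2 FC (a minus b),
    q-values, and a boolean significance mask.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    # Coverage: fraction of pooled samples with a quantified value.
    coverage = np.mean(~np.isnan(np.vstack([a, b])), axis=0)

    kept, log2_fc, pvals = [], [], []
    for j in np.flatnonzero(coverage >= min_coverage):
        x = a[~np.isnan(a[:, j]), j]
        y = b[~np.isnan(b[:, j]), j]
        if x.size == 0 or y.size == 0:
            continue  # assumption: skip proteins absent from one group
        kept.append(j)
        log2_fc.append(x.mean() - y.mean())  # assumption: mean difference
        pvals.append(mannwhitneyu(x, y, alternative="two-sided").pvalue)

    log2_fc = np.array(log2_fc)
    q = benjamini_hochberg(pvals)
    significant = (q < q_cutoff) & (np.abs(log2_fc) > fc_cutoff)
    return np.array(kept), log2_fc, q, significant
```

Proteins passing both thresholds correspond to the points outside the dashed lines of a volcano plot; ranking them by FC and q-value would give candidates for annotation.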