Fig. 4: Integration of multi-dimensional metrics for immunogenicity prediction.

a Comparison of allele aggregation methods. One-sided Mann-Whitney U tests assess the association between TCR recognition metrics and T cell activities. The number of peptides was n = 1448 for MHC-I and n = 1373 for MHC-II. “Best_bind” indicates the best-binding allele approach, while “masked_max” refers to the masked maximum approach. b Association tests for individual metrics in the NCI cohort (n = 6952 peptides). One-sided Mann-Whitney U tests evaluate the relationship between each metric and T cell activities. “NP” denotes “NeoPrecis”, “I” represents MHC-I, “II” represents MHC-II, “A” denotes abundance metrics, “P” denotes presentation metrics, and “R” denotes recognition metrics. “RNA_EXP_QRT” denotes the quartile of the RNA expression level (TPM). “Agretopicity” denotes the agretopicity ratio. c AUROC values from 4-fold cross-validation repeated 100 times on the NCI cohort using different combinations of mutation-centric features (n = 6952 peptides; CD8+ positive = 56 for MHC-I; CD4+ positive = 66 for MHC-II). In the boxplots, the center line represents the median of the 100 repeats, boxes indicate the interquartile range (IQR; 25th–75th percentile), and whiskers extend to the most extreme data points within 1.5× the IQR. Outliers are shown as individual circles. d Feature importance in the logistic regression model integrating multi-dimensional metrics. Features were standardized prior to modeling, and importance values are derived from the logistic regression coefficients.
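
The one-sided Mann-Whitney U test in panels a and b and the AUROC in panel c are closely related: for a single metric, the AUROC equals U divided by the product of the two group sizes. The minimal Python sketch below illustrates this relationship. It is not the authors' analysis code; the function name `mann_whitney_u_greater` and the toy scores are hypothetical, and the p-value uses a normal approximation with tie correction and no continuity correction, which is an assumption rather than a detail stated in the caption.

```python
import math


def mann_whitney_u_greater(pos, neg):
    """One-sided test that scores in `pos` tend to exceed scores in `neg`.

    Returns (U, AUROC, p_value). The p-value is a normal approximation
    with tie correction (an illustrative choice, not the paper's method).
    """
    n1, n2 = len(pos), len(neg)
    n = n1 + n2
    # Pool both groups, remembering which scores belong to `pos` (label 1).
    combined = sorted([(v, 1) for v in pos] + [(v, 0) for v in neg])

    rank_sum_pos = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        rank_sum_pos += avg_rank * sum(label for _, label in combined[i:j])
        i = j

    u = rank_sum_pos - n1 * (n1 + 1) / 2.0
    auroc = u / (n1 * n2)

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        p_value = 1.0  # every score is tied, so there is no evidence either way
    else:
        z = (u - mean_u) / math.sqrt(var_u)
        p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability
    return u, auroc, p_value


# Made-up metric values for immunogenic and non-immunogenic peptides.
immunogenic = [0.9, 0.8, 0.4]
non_immunogenic = [0.7, 0.3, 0.2, 0.1]
u, auroc, p = mann_whitney_u_greater(immunogenic, non_immunogenic)
print(u, auroc, p)  # U = 11 and AUROC = 11/12 for this toy input
```

With strongly imbalanced classes such as those in panel c (56 or 66 positives among 6952 peptides), this rank-based view is one reason AUROC is convenient: it depends only on the ordering of positives relative to negatives, not on the class ratio.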