Extended Data Fig. 2: Performance evaluation of originally trained CDR3β-only models on seen- and unseen-epitope predictions based on CDR3β-only data.
From: Assessment of computational methods in predicting TCR–epitope binding recognition

a-b, Amino acid distribution of CDR3β sequences that start with C and end with F (a) and of CDR3β sequences that do not start with C and end with F (b). c-d, Performance of original CDR3β-only models in the seen-epitope (c) and unseen-epitope (d) tests using AS negatives, based on CDR3β-only data, in terms of multiple metrics: AUPRC, Precision, Specificity, Recall and F1. e-f, Performance of CDR3β-only models on three seen epitopes using PS negatives (e) and HS negatives (f). g, AUPRC comparison of originally trained CDR3β-only models (n = 31) using AS/PS/HS negatives in the seen-epitope test. h-i, AUPRC correlation between the seen-epitope test results of original CDR3β-only models (n = 31) obtained using AS and PS negatives (h) and using AS and HS negatives (i). j-k, Performance of CDR3β-only models on unseen epitopes using PS negatives (j) and HS negatives (k). l, AUPRC comparison of originally trained CDR3β-only models (n = 28) using AS/PS/HS negatives in the unseen-epitope test. m-n, AUPRC correlation between the unseen-epitope test results of original CDR3β-only models (n = 28) obtained using AS and PS negatives (m) and using AS and HS negatives (n). Heatmaps (e, f, j, k) show epitope-level AUPRC, with adjacent bar charts showing overall AUPRC. Colored dots (g, l) represent individual model AUPRC values, black dots indicate the mean, and error bars represent the mean ± SD. P-values for the Pearson correlations (h, i, m, n) were obtained from two-sided t-tests.