Fig. 3: mvTCR’s multimodal representation efficiently captures antigen specificity information.

a Predictions of antigen specificity on the 10x Genomics dataset for all donors (10x Full), for donors 1–4 individually (D1–D4), and on the Minervina dataset. Each score represents the average over five random splits (n = 5). b Correlation between the average TCR-Contribution and the ratio of F1-scores between the TCR and RNA models for each of the five splits of the six datasets (n = 30). r indicates the Pearson correlation coefficient. The line marks the linear regression fit, with the 95% confidence interval as error band. c Comparison between mvTCR trained only on gene expression and the CDR3β sequence, and tessa (ref. 11), on the tasks defined in a (p-values: \(p_{\mathrm{F1}} = 4.67 \times 10^{-6}\), \(p_{\mathrm{NMI}} = 4.62 \times 10^{-6}\)). d Avidity prediction measured by mean squared logarithmic error (MSLE; p-value: \(p_{\mathrm{mvTCR-RNA}} = 0.0452\)) and Pearson correlation (p-values: \(p_{\mathrm{RNA-TCR}} = 6.26 \times 10^{-4}\), \(p_{\mathrm{mvTCR-RNA}} = 5.95 \times 10^{-3}\), \(p_{\mathrm{mvTCR-TCR}} = 3.24 \times 10^{-10}\)) on each of the five splits and eight specificities of the five versions of the 10x Genomics dataset (n = 200). All box plots indicate the data quartiles, with the median shown as a horizontal line and the whiskers extending to the full distribution, excluding outliers beyond 1.5 times the interquartile range. e Influence of the training set size on mvTCR's prediction performance (\(n_{100\text{-}2{,}500} = 30\), \(n_{5{,}000} = 25\), \(n_{10{,}000} = 15\), \(n_{15{,}000} = 10\)). Statistical significance (p-values: *<0.05, **<0.01, ***<0.001; baseline indicated left) relative to the corresponding unimodal representation or the tessa algorithm was calculated via a one-sided, paired t-test. The bars and lines represent the average metric score, while the error bars and error bands indicate the 95% confidence interval.
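For readers unfamiliar with the two metrics reported in d, the following is a minimal sketch of how MSLE and the Pearson correlation coefficient can be computed. It assumes the common log(1 + x) convention for MSLE, which the caption does not specify; the function names and the example values are hypothetical and are not taken from the mvTCR code base or the datasets.

```python
import math


def msle(y_true, y_pred):
    """Mean squared logarithmic error, assuming the log(1 + x) convention."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_diffs = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return sum(squared_log_diffs) / len(squared_log_diffs)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical observed and predicted avidity values (illustration only).
observed = [0.0, 3.0, 10.0, 25.0]
predicted = [1.0, 2.0, 12.0, 20.0]

print("MSLE:", msle(observed, predicted))
print("Pearson r:", pearson_r(observed, predicted))
```

Lower MSLE and higher Pearson correlation both indicate better agreement. MSLE penalizes relative rather than absolute deviations, which makes it a natural choice for count-like targets that span several orders of magnitude.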