Fig. 3: Supervised TCR sequence regression. | Nature Communications

From: DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires

a To test the ability of a supervised deep learning method to learn and regress continuous-valued outputs, we collected published single-cell data from 10x Genomics in which 57,229 unique α/β pairs were measured with a count-based readout (as a proxy for binding affinity) against 44 specific peptide-MHC (pMHC) multimers and 6 negative controls. Fivefold cross-validation was performed for every antigen to obtain independently predicted regression values for every α/β pair against that antigen; predicted vs actual counts are shown for three selected antigens. b For the epitopes shown, experimentally derived antigen-specific CDR3 β TCR sequences were collected from the McPAS-TCR database, and models trained on the 10x Genomics dataset were applied to this independent set of TCRs; classification performance was assessed from the ROC curves and their corresponding AUCs. c For the Flu-MP and BMLF1 epitopes, for which 10x Genomics data were available to train our models, crystal structures and their corresponding TCR CDR3 sequences were also collected from the Protein Data Bank, and permutation analysis was conducted to measure how sensitive the binding affinity predicted by our deep learning model is to each residue. The results of this analysis are shown for the corresponding α- and β-chains for both antigens. d To create a compact representation of the information in our residue sensitivity analysis, we propose a visualization termed a Residue Sensitivity Logo (RSL). e Crystal structures highlighting the relevant α (blue) and β (red) CDR3 regions. The predictive performance of the residue sensitivity analysis in identifying known contact residues is shown in Supplementary Fig. 13.
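The permutation analysis in panel c can be illustrated with a minimal sketch. It assumes that "permutation" means substituting each CDR3 position with every other amino acid and recording the mean absolute change in the predicted value; the caption does not spell out the exact procedure, so this is one plausible reading and not the DeepTCR implementation. The names `toy_predict` and `residue_sensitivity` are hypothetical, and `toy_predict` is a stand-in scoring rule, not a trained model.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_predict(seq: str) -> float:
    """Hypothetical stand-in for a trained regression model.

    The score depends only on position 2 (and weakly on position 4),
    so the sensitivity analysis should single out those positions.
    """
    score = 0.0
    if len(seq) > 2 and seq[2] == "S":
        score += 2.0
    if len(seq) > 4 and seq[4] == "G":
        score += 1.0
    return score


def residue_sensitivity(seq: str, predict) -> np.ndarray:
    """Mean absolute change in prediction under single-residue substitutions.

    For each position, the residue is replaced by each of the 19 other
    amino acids and the absolute deviation from the reference prediction
    is averaged (an assumed definition of sensitivity).
    """
    reference = predict(seq)
    sensitivity = np.zeros(len(seq))
    for i, original in enumerate(seq):
        deltas = []
        for aa in AMINO_ACIDS:
            if aa == original:
                continue  # skip the unmodified sequence
            mutant = seq[:i] + aa + seq[i + 1:]
            deltas.append(abs(predict(mutant) - reference))
        sensitivity[i] = np.mean(deltas)
    return sensitivity


if __name__ == "__main__":
    cdr3 = "CASSLGQAYEQYF"  # illustrative CDR3-like string, not taken from the figure
    for residue, value in zip(cdr3, residue_sensitivity(cdr3, toy_predict)):
        print(f"{residue}\t{value:.3f}")
```

Per-position values of this kind are what a logo-style plot such as the RSL in panel d would summarize, with positions whose substitution changes the prediction most receiving the most visual weight.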
