Fig. 3: Self-supervised learning approach (3D-SSL) overview and performances. | Communications Biology

From: Geometric deep learning improves generalizability of MHC-bound peptide predictions

A The inference stage of 3D-SSL. 3D-SSL is an EGNN; it takes the microenvironment of a masked peptide residue as input, predicts the probability of each amino acid identity for the masked residue, and converts that probability into a statistical potential via the Boltzmann distribution, yielding the energy contribution of this microenvironment. For the training stage of 3D-SSL, see Supplementary Fig. 2. B Comparison of 3D-SSL (i.e., unsupervised EGNN) with the same EGNN trained in a supervised manner and with SeqB approaches, on the allele-clustered dataset. Replicas are shown as gray dots. There is no allele overlap between the 3D-SSL training set and the allele-clustered test set. C Data efficiency of 3D-SSL compared with supervised EGNN, our top StrB method. 3D-SSL and supervised EGNN share the same architecture (see Supplementary Note 4). We evaluate the effect of training-data size (1 K, 10 K, ~90 K) on supervised-EGNN performance; the allele-clustered BA dataset is randomly subsampled for training. AUCs are averaged over five runs to account for potentially unrepresentative subsets; error bars show the standard deviation across the five replicas. D 3D-SSL performance in terms of AUC when trained with and without peptide-TCR structures, as schematized on the left. We repeated the experiment five times to verify that the difference between training with and without TCRs is not due to randomness (error bars for standard deviations shown in black; replicas shown as gray dots).
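The probability-to-potential conversion described in panel A can be sketched with Boltzmann inversion, where a predicted probability p is mapped to an energy E = -kT ln(p). This is a minimal illustration under stated assumptions: the function name, the kT value, and the example probabilities are hypothetical and not taken from the paper.

```python
import math

def boltzmann_potential(probs, kT=1.0):
    """Convert predicted amino-acid probabilities for a masked residue
    into statistical potentials via Boltzmann inversion: E = -kT * ln(p).
    A lower energy corresponds to a more favorable residue identity."""
    return {aa: -kT * math.log(p) for aa, p in probs.items()}

# Hypothetical network output for one masked residue's microenvironment
probs = {"ALA": 0.5, "GLY": 0.3, "SER": 0.2}
energies = boltzmann_potential(probs)
# The most probable identity (ALA) receives the lowest energy
```

Summing such per-microenvironment energies over the peptide would then give an overall energy-like score for the peptide-MHC complex.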
