Fig. 2: Our supervised StrB methods overview and performances. | Communications Biology

From: Geometric deep learning improves generalizability of MHC-bound peptide predictions

Fig. 2

A Overview of the pipeline employed for processing pMHC complexes and running the supervised structure-based networks. The process involves: 1) identifying interface atoms and residues from the pMHC 3D models, 2) converting each pMHC interface into volumetric grids and graphs enriched with geometric and physico-chemical information (Tables 1–3), and 3) running the networks on these representations: volumetric grids for the CNN and graphs for the GNN and EGNN. The GNN and EGNN use a similar graph topology, differing in their features and message-passing framework (see details in Methods). B, C The performance of StrB and SeqB methods on the shuffled test dataset and on the allele-clustered test dataset, respectively. For all the networks except MHCflurry, five replicas were performed by randomly re-sampling each validation set, and the error bars show the standard deviation between the five replicas. Each replica is marked as a gray dot. MHCflurry handles the separate validation sets internally and collects the networks’ outputs, so no standard deviation could be retrieved. D AUC per allele on the allele-clustered test set. Allele name and number of test cases are reported on the x-axis, and the alleles are sorted by sequence distance from the training set. The black dashed line marks the random-predictor AUC value of 0.5. Error bars show the standard deviation between the five replicas. Each replica is marked as a black dot.
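The per-allele summary shown in panel D (one AUC per replica, then a mean and standard deviation across replicas) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy allele labels "A" and "B", and all scores are hypothetical, only two replicas are used for brevity, and the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_per_allele(alleles, labels, replica_scores):
    """Return {allele: (mean AUC, std AUC, number of test cases)}.

    replica_scores holds one list of predicted scores per replica,
    aligned with alleles and labels.
    """
    rows_by_allele = defaultdict(list)
    for i, allele in enumerate(alleles):
        rows_by_allele[allele].append(i)

    summary = {}
    for allele, rows in rows_by_allele.items():
        y = [labels[i] for i in rows]
        aucs = [roc_auc(y, [scores[i] for i in rows]) for scores in replica_scores]
        # Sample standard deviation (assumption); undefined for a single replica.
        spread = stdev(aucs) if len(aucs) > 1 else 0.0
        summary[allele] = (mean(aucs), spread, len(rows))
    return summary


# Hypothetical toy data: two alleles, four test cases each, two replicas.
alleles = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 0, 1, 0, 1, 1, 0, 0]
replica_scores = [
    [0.9, 0.2, 0.7, 0.4, 0.8, 0.3, 0.5, 0.1],
    [0.6, 0.7, 0.8, 0.1, 0.9, 0.6, 0.2, 0.4],
]

summary = auc_per_allele(alleles, labels, replica_scores)
for allele, (avg, spread, n) in summary.items():
    print(f"{allele} (n={n}): AUC = {avg:.3f} +/- {spread:.3f}")
```

A per-allele mean near 0.5 corresponds to the random-predictor line in the panel. Sorting the alleles by sequence distance from the training set would need that distance as an extra input, which is not sketched here.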
