Fig. 1: The PLGDL framework for vaccine antigen prediction.

The first step in the framework involves the establishment and evaluation of a comprehensive protective antigen dataset. Next, a pre-trained protein language model was first used for protein sequence feature extraction. Considering the richness of information in protein structures, an improved neighbor-enhanced graph convolutional network (NEGCN) model was used to represent protein structures, and the extracted features were combined with the extracted amino acid sequence features. The resulting machine learning classifier was finally used in protective vaccine antigen prediction.