Fig. 3: Three peptide groups distinguished by sequence physicochemical properties.

a All 951 putative SARS-CoV-2 peptides for HLA-A*02:01 are encoded by the physicochemical properties of the amino acids at each position for unsupervised clustering. Created in BioRender (ref. 59). b UMAP embedding of all peptides (Pep-UMAP) based on their physicochemical properties, colored by peptide cluster (legend on the right). Each dot represents a unique SARS-CoV-2 peptide. c Pep-UMAP colored by selected physicochemical properties and outcomes: average (Avg) hydrophobicity (HPhobic), average polarity, whether the SCT was expressed, and whether the SCT captured CD8 T cells. Average values were calculated over all residues, including anchor and exposed residues. d Left: Clustermap of the 951 peptide antigens by their normalized physicochemical properties, revealing three major peptide groups (Pep-Groups; PG1-3). The key signatures that distinguish the individual Pep-Groups are highlighted in red boxes. Right: Clustermap of the Pep-Groups including only those peptides that captured T cells. e Pep-UMAP with densities of PG1-3 depicted (legend at the bottom). f Left: Violin plot of peptide hydrophobicity for the Pep-Groups, sorted by mean value. Middle: SCT protein expression efficiency for the Pep-Groups. Right: SCT cell capture efficiency for the Pep-Groups. Violin plots show mean values ± SEM. Statistical significance was determined using the two-sided Mann-Whitney U test; p-values are annotated on all relevant plots, with exact values provided unless p < 0.0001.
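
The per-position encoding described in panel a and the average hydrophobicity shown in panel c can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale as a single stand-in property; the caption does not state which scales or how many properties were used, and the function names and example 9-mers are hypothetical, not taken from the figure.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the caption does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_peptide(peptide):
    """Return one hydropathy value per residue position (hypothetical helper)."""
    return [KYTE_DOOLITTLE[aa] for aa in peptide]


def average_hydrophobicity(peptide):
    """Average over all residues, anchor and exposed positions alike."""
    values = encode_peptide(peptide)
    return sum(values) / len(values)


# Illustrative 9-mer sequences only; not a subset of the 951 peptides in the figure.
peptides = ["YLQPRTFLL", "KLWAQCVQL"]
features = [encode_peptide(p) for p in peptides]
averages = [average_hydrophobicity(p) for p in peptides]
```

A feature matrix built this way, with one row per peptide and one column per position and property, is the kind of input that could then be normalized and passed to a clustering or UMAP routine.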