Fig. 2: Performance of deepAMP.

a Performance of Temporin-Ali optimisation measured by matrix score, comparing deepAMP-TOM (purple), Baseline-T (orange), Random mutants (blue), PepCVAE (green) and HydrAMP (red) in the first iteration (n = 91 candidate sequences), b the second iteration (n = 91 candidate sequences) and c the third iteration (n = 91 candidate sequences). The dashed line represents the matrix score of Temporin-Ali (−0.17). Boxplots show the median (center line) and the 1st and 3rd quartiles (Q1 and Q3, respectively). The whiskers (error bars) indicate the range of the data, defined as the range between Q1 − 1.5 × IQR and Q3 + 1.5 × IQR, where IQR is the interquartile range (Q3 − Q1). d Performance of Temporin-Ali optimisation measured by deepAMP-predict score (n = 91 candidate sequences). The dashed line represents the predict score of Temporin-Ali (0.93). Violin plots show the median (white point) and the 1st and 3rd quartiles (Q1 and Q3, respectively). The upper and lower bounds of the violin represent the maximum and minimum values of the data, respectively. e Performance of Pg-AMP1 optimisation measured by fitness score, comparing deepAMP-GOM (purple), Random mutants (blue), PepCVAE (green) and HydrAMP (red) for the 100 candidates with the highest fitness scores in the iteration (n = 100 candidate sequences). The dashed lines represent the fitness scores of four fragments of Pg-AMP1 (0.075, 0.049, 0.046 and 0.012, respectively). Boxplots show the median (center line) and the 1st and 3rd quartiles (Q1 and Q3, respectively). The whiskers (error bars) indicate the range of the data, defined as the range between Q1 − 1.5 × IQR and Q3 + 1.5 × IQR. f Performance of Pg-AMP1 optimisation measured by deepAMP-predict score (n = 100 candidate sequences). The dashed lines represent the predict scores of four fragments of Pg-AMP1 (0.57, 0.40, 0.07 and 0.04, respectively). Violin plots show the median (white point) and the 1st and 3rd quartiles (Q1 and Q3, respectively). The upper and lower bounds of the violin represent the maximum and minimum values of the data, respectively.
g Visualization of the sequences in two-dimensional UMAP space. Source data are provided as a Source Data file.
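
The boxplot statistics described in this caption (median, Q1, Q3 and whiskers bounded by 1.5 times the interquartile range) can be illustrated with a minimal sketch. The function name `boxplot_stats`, the choice of inclusive (linearly interpolated) quartiles, and the convention that whiskers end at the most extreme data points inside the fences are assumptions made for illustration; the caption does not state which quartile method or plotting library was used, and the example values are made up rather than taken from the figure.

```python
import statistics


def boxplot_stats(scores):
    """Return median, quartiles, 1.5*IQR fences and whisker ends for a list of scores."""
    # Inclusive quartiles use linear interpolation between data points
    # (an assumption; other quartile conventions give slightly different values).
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations that lie within the fences.
    inside = [s for s in scores if lower_fence <= s <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Made-up scores, not data from the figure; 100 lies outside the upper fence.
example = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(example)
```

In this toy input the value 100 falls beyond Q3 + 1.5 × IQR, so the upper whisker ends at 9 rather than at the maximum of the data.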