Fig. 1: APEX exploration of the archaeome. | Nature Microbiology


From: Deep learning reveals antibiotics in the archaeal proteome


a, Archaeal proteomes were systematically scanned to identify EPs with potential antimicrobial activity. Circular bars denote the log10-transformed average number of active (red) and inactive (blue) EPs discovered by APEX. A peptide was classified as active if its predicted mean MIC against tested bacterial strains was ≤100 μmol l−1. The values were normalized by the number of proteins scanned per organism. Archaea with peptides that were synthesized are indicated by a light red square, and those experimentally validated as active are highlighted with a dark red square. b, Sequence space exploration using a similarity matrix. The graph shows a two-dimensional visualization of the sequence space covered by peptide sequences found in DBAASP and by antimicrobial EPs discovered by APEX in archaea. Sequence alignment was used to generate a similarity matrix for all peptide sequences in DBAASP and the 12,623 antimicrobial EPs predicted by APEX (Supplementary Data 1 and 2). Each row in the matrix represents a feature representation of a peptide based on its amino acid composition. UMAP was applied to reduce the feature representation to two dimensions for visualization (Extended Data Fig. 2a). c, Comparison of amino acid frequency in archaeal EPs with known AMPs from the DBAASP, APD3 and DRAMP 3.0 databases (Extended Data Fig. 2b–e). d,e, Distribution of two physico-chemical properties for peptides with predicted antimicrobial activity, compared with AMPs from DBAASP, APD3 and DRAMP 3.0: net charge (d) and normalized hydrophobicity (e). Net charge influences the initial electrostatic interactions between the peptide and negatively charged bacterial membranes, whereas hydrophobicity affects interactions with lipids in the membrane bilayers (Extended Data Fig. 2).
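The activity rule and normalization described for panel a can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the example MIC values and the protein count are hypothetical, and the order of operations (count, divide by proteins scanned, then take log10) is an assumption based on the caption wording.

```python
import math
from statistics import mean

MIC_THRESHOLD_UM = 100.0  # activity cutoff stated in the caption (μmol l−1)


def is_active(predicted_mics):
    """Return True if the mean predicted MIC across strains is <= the cutoff."""
    return mean(predicted_mics) <= MIC_THRESHOLD_UM


def normalized_log_counts(peptide_mics, n_proteins):
    """Count active and inactive peptides for one organism, normalize each
    count by the number of proteins scanned, then apply log10.

    Assumption: normalization happens before the log transform.
    """
    active = sum(1 for mics in peptide_mics if is_active(mics))
    inactive = len(peptide_mics) - active

    def log_norm(count):
        # log10 is undefined for zero counts, so report None in that case.
        return math.log10(count / n_proteins) if count > 0 else None

    return {"active": log_norm(active), "inactive": log_norm(inactive)}


# Hypothetical organism: three peptides, each with predicted MICs
# against three strains, found in 20 scanned proteins.
peptides = [
    [32.0, 64.0, 128.0],    # mean below 100, counted as active
    [256.0, 512.0, 128.0],  # mean above 100, counted as inactive
    [8.0, 16.0, 100.0],     # mean below 100, counted as active
]
result = normalized_log_counts(peptides, n_proteins=20)
print(result)
```

Note that a peptide can be classified as active even when one strain-level MIC exceeds the cutoff, because the rule is applied to the mean across strains.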
The chi-squared test of independence for a contingency table was used to compare the amino acid composition in c; P values were reported as 0, that is, below machine precision, indicating that the differences in composition are statistically significant. Statistical significance in d and e was determined using two-tailed t-tests followed by a Mann–Whitney test; P < 0.0001. The solid line inside each box represents the mean value for each group.
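For readers unfamiliar with the test used in c, the following Python sketch computes the chi-squared statistic of independence for a small contingency table. The counts are invented for illustration and are not taken from the paper; rows are assumed to be peptide sets and columns amino acids.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts of three amino acids (columns) in two peptide sets (rows).
counts = [
    [30, 10, 20],  # e.g. archaeal EPs
    [10, 30, 20],  # e.g. known AMPs
]
stat, dof = chi_squared_independence(counts)
print(stat, dof)
```

The P value is the upper-tail probability of the chi-squared distribution with `dof` degrees of freedom at `stat`; in practice it can be obtained with `scipy.stats.chi2_contingency`. With counts as large as those in whole peptide databases, the statistic can become so large that this probability underflows to 0 in double precision, which is consistent with the reported P values of 0.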

