Table 2 Input features to the GNN architecture

From: Geometric deep learning improves generalizability of MHC-bound peptide predictions

Feature

Shape and type

Description

NODE FEATURES

res_type

[21, 1], bool

One-hot representation of the node’s residue (20 amino acids + unknown).

res_size

[1], int

The number of non-hydrogen atoms in the side chain.

res_mass

[1], float

The average residue mass in Da.

res_charge

[1], int

The charge of the residue in its fully protonated state, in Coulomb.

res_pI

[1], float

The isoelectric point, i.e., the pH at which the molecule has no net electric charge.

polarity

[4, 1], bool

One-hot representation of the polarity of the amino acid: NONPOLAR, POLAR, NEGATIVE, POSITIVE.

hb_donors

[1], int

The number of hydrogen bond donor atoms in the residue.

hb_acceptors

[1], int

The number of hydrogen bond acceptor atoms in the residue.

res_depth

[1], float

The average distance in Å of the residue to the closest molecule of bulk water, computed via BioPython.

hse

[3, 1], float

Half sphere exposure, which indicates how buried an amino acid residue is in the biomolecule, computed via BioPython.

sasa

[1], float

Solvent-accessible surface area, in Å², computed via FreeSASA.

bsa

[1], float

Buried surface area, which represents the area of the complex interface, in Å², computed via FreeSASA.

irc_total

[1], int

The number of residues on the other chain that are within a cutoff distance of 5.5 Å.

irc_negative_negative, irc_negative_positive, irc_nonpolar_negative, irc_nonpolar_nonpolar, irc_nonpolar_polar, irc_nonpolar_positive, irc_polar_negative, irc_polar_polar, irc_polar_positive, irc_positive_positive

[1], int

As above, but for specific residue polarity pairings.

EDGE FEATURES

same_chain

[1], bool

Boolean indicating whether the edge connects nodes belonging to the same chain (1) or separate chains (0).

covalent

[1], bool

Boolean indicating whether nodes are covalently bound (1) or not (0). Covalency is not directly assessed; instead, any edge spanning a distance of at most 2.1 Å is considered covalent.

electrostatic

[1], float

Electrostatic potential (also known as Coulomb potential) between two nodes, calculated using interatomic distances and charges of each atom.

vanderwaals

[1], float

Van der Waals potential between two nodes, calculated using interatomic distances and a list of atoms with Van der Waals parameters.

distance

[1], float

Interatomic distance in Å, computed from the xyz atomic coordinates taken from the PDB file.

  1. Full feature details can be found at https://deeprank2.readthedocs.io/en/latest/features.html.
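
To make the shapes in the table concrete, the following is a minimal sketch of how a few of the listed features could be encoded. It is not the DeepRank2 implementation. The helper names (`one_hot`, `min_distance`, `edge_features`), the ordering of the residue alphabet, the use of the minimum interatomic distance for a residue pair, and the inclusive `<=` comparison against the 2.1 Å cutoff are all assumptions made for illustration.

```python
import math

# Assumed ordering of the 20 standard amino acids; a 21st slot is reserved
# for "unknown", matching the [21, 1] shape of res_type in the table.
AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]

# Polarity classes in the order given in the table ([4, 1] one-hot).
POLARITIES = ["NONPOLAR", "POLAR", "NEGATIVE", "POSITIVE"]

# Distance threshold (in Å) below which an edge is flagged as covalent.
COVALENT_CUTOFF = 2.1


def one_hot(value, categories, allow_unknown=False):
    """Return a boolean one-hot list; the last slot means 'unknown' if allowed."""
    size = len(categories) + (1 if allow_unknown else 0)
    vec = [False] * size
    if value in categories:
        vec[categories.index(value)] = True
    elif allow_unknown:
        vec[-1] = True
    else:
        raise ValueError(f"{value!r} is not one of {categories}")
    return vec


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance (Å) between two sets of xyz coordinates."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def edge_features(atoms_a, atoms_b, chain_a, chain_b):
    """Hypothetical helper: same_chain, covalent and distance for one edge."""
    d = min_distance(atoms_a, atoms_b)
    return {
        "same_chain": chain_a == chain_b,
        # Assumption: the cutoff is inclusive.
        "covalent": d <= COVALENT_CUTOFF,
        "distance": d,
    }
```

For example, `one_hot("GLY", AMINO_ACIDS, allow_unknown=True)` yields a 21-element vector with a single true entry, and `edge_features([(0.0, 0.0, 0.0)], [(1.5, 0.0, 0.0)], "A", "A")` flags the edge as both same-chain and covalent because 1.5 Å is below the cutoff.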