Table 1 Description of the four valid vocabularies.

From: Labeled dataset of X-ray protein ligand images in 3D point cloud and validated deep learning models

| Vocabulary | Labeling Approach | dmax | Classes | Number of Classes |
| --- | --- | --- | --- | --- |
| Ligand Region | SP | 1 | Background, Atom | 2 |
| Generic Atoms and Cycles | SP | 2.1 | Background, Atom, C (Cycle – generic cyclic structure) | 3 |
| Generic Atoms and Cycles C347CA56 | SP | 1,535.2 | Background, Atom, C5 (Cycle of size 5), CA5 (Aromatic Cycle of size 5), C6, CA6, C3, C4, C7 | 9 |
| Atom Symbols with Groups | AtomSymbol | 41.4 | Background, C, O, N, PSe, Halo | 6 |

  1. All valid vocabularies are presented with their maximum imbalance ratio (dmax) in the valid ligands list, their class names, and their number of classes.
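
As an illustration of how the vocabularies in Table 1 might be used as label sets, the sketch below stores each vocabulary's class names and derives an integer label per class. The class names are copied from the table; the dictionary layout, the index order, and the helper names `class_to_index` and `num_classes` are assumptions made for this example and are not taken from the dataset's own code.

```python
# Class names copied from Table 1. The dictionary layout and the
# index order (position in each list) are illustrative assumptions.
VOCABULARIES = {
    "Ligand Region": ["Background", "Atom"],
    "Generic Atoms and Cycles": ["Background", "Atom", "C"],
    "Generic Atoms and Cycles C347CA56": [
        "Background", "Atom", "C5", "CA5", "C6", "CA6", "C3", "C4", "C7",
    ],
    "Atom Symbols with Groups": ["Background", "C", "O", "N", "PSe", "Halo"],
}


def class_to_index(vocabulary_name):
    """Hypothetical helper: map each class name to an integer label."""
    return {name: i for i, name in enumerate(VOCABULARIES[vocabulary_name])}


def num_classes(vocabulary_name):
    """Hypothetical helper: number of classes in a vocabulary."""
    return len(VOCABULARIES[vocabulary_name])
```

For example, `num_classes("Generic Atoms and Cycles C347CA56")` matches the "Number of Classes" column for that row, and `class_to_index("Ligand Region")` yields a two-entry mapping for a binary background/atom labeling.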