Table 2. Hyperparameter settings for each dataset.

Hyperparameter	ZINC-12K	CIFAR10	PATTERN	CLUSTER	MNIST	ogbg-molhiv	ogbg-molpcba	Peptides-func	Peptides-struct
Transformer Layers	10	3	10	16	3	10	5	4	4
Hidden dim	64	52	64	48	52	64	384	96	96
Heads	8	4	8	4	4	4	4	4	8
Dropout	0	0	0	0.01	0	0.05	0.3	0	0.05
Attention dropout	0.2	0.5	0.2	0.5	0.5	0.5	0.5	0.5	0.5
Graph pooling	Sum	Mean	–	Mean	Mean	Mean	Mean	Mean	Mean
Epochs	2000	200	100	100	200	100	100	200	200
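
For readers who want to reuse these settings programmatically, the table can be transcribed into a small lookup structure. The following is a minimal sketch only: the `HParams` class, its field names, and the `TABLE_2` dictionary are hypothetical names introduced here for illustration, not part of the original work. The values are copied from Table 2, with the dash in the PATTERN pooling column represented as `None`. The `head_dim` property assumes the common multi-head attention convention that the hidden dimension is split evenly across heads, which the table does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HParams:
    """One column of Table 2 (field names are illustrative assumptions)."""
    layers: int
    hidden_dim: int
    heads: int
    dropout: float
    attn_dropout: float
    pooling: Optional[str]  # None where the table shows a dash
    epochs: int

    @property
    def head_dim(self) -> int:
        # Assumes the hidden dimension is divided evenly among heads.
        return self.hidden_dim // self.heads


# Values transcribed from Table 2, one entry per dataset.
TABLE_2 = {
    "ZINC-12K":        HParams(10, 64, 8, 0.0, 0.2, "sum", 2000),
    "CIFAR10":         HParams(3, 52, 4, 0.0, 0.5, "mean", 200),
    "PATTERN":         HParams(10, 64, 8, 0.0, 0.2, None, 100),
    "CLUSTER":         HParams(16, 48, 4, 0.01, 0.5, "mean", 100),
    "MNIST":           HParams(3, 52, 4, 0.0, 0.5, "mean", 200),
    "ogbg-molhiv":     HParams(10, 64, 4, 0.05, 0.5, "mean", 100),
    "ogbg-molpcba":    HParams(5, 384, 4, 0.3, 0.5, "mean", 100),
    "Peptides-func":   HParams(4, 96, 4, 0.0, 0.5, "mean", 200),
    "Peptides-struct": HParams(4, 96, 8, 0.05, 0.5, "mean", 200),
}

# Sanity check: every hidden dimension divides evenly by its head count.
for name, hp in TABLE_2.items():
    assert hp.hidden_dim % hp.heads == 0, name
```

A lookup such as `TABLE_2["ogbg-molpcba"].head_dim` then gives the per-head width implied by that column, under the even-split assumption above.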
