Table 26. Model performance on the CoauthorCS dataset.

From: Distilling knowledge from graph neural networks trained on cell graphs to non-neural student models

| Model | Acc_train ± std | Acc_val ± std | Acc_test ± std | F1_train ± std | F1_val ± std | F1_test ± std |
| --- | --- | --- | --- | --- | --- | --- |
| Teacher model | 0.9982 ± 0.0005 | 0.9392 ± 0.0006 | 0.9406 ± 0.0039 | 0.9982 ± 0.0005 | 0.9393 ± 0.0007 | 0.9406 ± 0.0040 |
| XGBoost trained on hard labels | 0.9591 ± 0.0000 | 0.8857 ± 0.0000 | 0.8664 ± 0.0000 | 0.9593 ± 0.0000 | 0.8865 ± 0.0000 | 0.8684 ± 0.0000 |
| XGBoost trained on logits | 0.8039 ± 0.0000 | 0.7657 ± 0.0000 | 0.7650 ± 0.0000 | 0.8028 ± 0.0000 | 0.7627 ± 0.0000 | 0.7633 ± 0.0000 |
| XGBoost trained on calibrated probs using IR | 0.9524 ± 0.0000 | 0.8837 ± 0.0000 | 0.8729 ± 0.0000 | 0.9521 ± 0.0000 | 0.8800 ± 0.0000 | 0.8710 ± 0.0000 |
| XGBoost trained on calibrated probs using temp scaling-BS | 0.9523 ± 0.0000 | 0.8843 ± 0.0000 | 0.8746 ± 0.0000 | 0.9520 ± 0.0000 | 0.8813 ± 0.0000 | 0.8728 ± 0.0000 |
| ExtraTrees trained on hard labels | 0.7617 ± 0.0063 | 0.7504 ± 0.0030 | 0.7470 ± 0.0100 | 0.7649 ± 0.0064 | 0.7530 ± 0.0027 | 0.7511 ± 0.0102 |
| ExtraTrees trained on logits | 0.8102 ± 0.0029 | 0.7918 ± 0.0054 | 0.7819 ± 0.0066 | 0.8063 ± 0.0018 | 0.7849 ± 0.0041 | 0.7811 ± 0.0058 |
| ExtraTrees trained on calibrated probs using IR | 0.9439 ± 0.0008 | 0.8892 ± 0.0011 | 0.8805 ± 0.0033 | 0.9435 ± 0.0009 | 0.8860 ± 0.0013 | 0.8788 ± 0.0035 |
| ExtraTrees trained on calibrated probs using temp scaling-BS | 0.9448 ± 0.0003 | 0.8904 ± 0.0019 | 0.8813 ± 0.0053 | 0.9444 ± 0.0003 | 0.8872 ± 0.0017 | 0.8793 ± 0.0050 |
| HistGrad trained on hard labels | 0.9758 ± 0.0005 | 0.8835 ± 0.0023 | 0.8771 ± 0.0011 | 0.9759 ± 0.0006 | 0.8844 ± 0.0026 | 0.8781 ± 0.0011 |
| HistGrad trained on logits | 0.8455 ± 0.0028 | 0.8144 ± 0.0059 | 0.7995 ± 0.0032 | 0.8448 ± 0.0027 | 0.8111 ± 0.0051 | 0.7997 ± 0.0037 |
| HistGrad trained on calibrated probs using IR | 0.9406 ± 0.0009 | 0.8961 ± 0.0015 | 0.8916 ± 0.0015 | 0.9399 ± 0.0010 | 0.8935 ± 0.0015 | 0.8905 ± 0.0016 |
| HistGrad trained on calibrated probs using temp scaling-BS | 0.9403 ± 0.0012 | 0.8969 ± 0.0037 | 0.8900 ± 0.0019 | 0.9396 ± 0.0012 | 0.8946 ± 0.0038 | 0.8887 ± 0.0016 |
| Random Forest trained on hard labels | 0.6541 ± 0.0049 | 0.6326 ± 0.0033 | 0.6337 ± 0.0089 | 0.6647 ± 0.0059 | 0.6422 ± 0.0047 | 0.6460 ± 0.0105 |
| Random Forest trained on logits | 0.7938 ± 0.0043 | 0.7724 ± 0.0027 | 0.7581 ± 0.0048 | 0.7939 ± 0.0044 | 0.7716 ± 0.0028 | 0.7607 ± 0.0000 |
| Random Forest trained on calibrated probs using IR | 0.9369 ± 0.0009 | 0.8759 ± 0.0020 | 0.8680 ± 0.0020 | 0.9366 ± 0.0009 | 0.8734 ± 0.0021 | 0.8659 ± 0.0020 |
| Random Forest trained on calibrated probs using temp scaling-BS | 0.9374 ± 0.0006 | 0.8777 ± 0.0025 | 0.8675 ± 0.0008 | 0.9372 ± 0.0007 | 0.8752 ± 0.0028 | 0.8657 ± 0.0006 |
| LightGBM trained on hard labels | 0.9861 ± 0.0000 | 0.8920 ± 0.0000 | 0.8768 ± 0.0000 | 0.9861 ± 0.0000 | 0.8922 ± 0.0000 | 0.8772 ± 0.0000 |
| LightGBM trained on logits | 0.8578 ± 0.0000 | 0.8271 ± 0.0000 | 0.8086 ± 0.0000 | 0.8564 ± 0.0000 | 0.8240 ± 0.0000 | 0.8076 ± 0.0000 |
| LightGBM trained on calibrated probs using IR | 0.9497 ± 0.0000 | 0.9007 ± 0.0000 | 0.8909 ± 0.0000 | 0.9493 ± 0.0000 | 0.8987 ± 0.0000 | 0.8896 ± 0.0000 |
| LightGBM trained on calibrated probs using temp scaling-BS | 0.9473 ± 0.0000 | 0.9045 ± 0.0000 | 0.8980 ± 0.0000 | 0.9467 ± 0.0000 | 0.9027 ± 0.0000 | 0.8970 ± 0.0000 |

Note: The logits are the raw outputs of the teacher model. IR denotes isotonic regression, BS denotes Brier score reduction, and LL denotes log loss reduction. Values in bold denote the performance of student models that learned well from the teacher model and outperformed their counterparts trained on hard labels. Std denotes the standard deviation. Accuracy and weighted F1-scores are reported to four decimal places; values may appear identical (especially for the teacher model) due to rounding but can differ at higher precision (beyond six decimal places).
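For readers who want to reproduce the two calibration routes named in the note, the sketch below shows one plausible implementation: temperature scaling with the temperature chosen to minimize the validation Brier score (temp scaling-BS), and per-class isotonic regression (IR). All names (`train_logits`, `val_logits`, `y_val`, etc.) and the one-vs-rest isotonic setup are illustrative assumptions, not the paper's released code.

```python
# Hedged sketch of the two teacher-output calibration routes (temp scaling-BS
# and IR). Inputs are assumed: logits of shape (n_samples, n_classes) from the
# teacher GNN and integer class labels.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import softmax
from sklearn.isotonic import IsotonicRegression


def brier_score(probs, y, n_classes):
    """Multiclass Brier score: mean squared error against one-hot labels."""
    onehot = np.eye(n_classes)[y]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))


def temperature_scale_bs(val_logits, y_val):
    """Pick the temperature T that minimizes the Brier score on validation data."""
    n_classes = val_logits.shape[1]

    def objective(T):
        return brier_score(softmax(val_logits / T, axis=1), y_val, n_classes)

    res = minimize_scalar(objective, bounds=(0.05, 10.0), method="bounded")
    return res.x


def isotonic_calibrate(val_probs, y_val, target_probs):
    """Fit one isotonic regressor per class (one-vs-rest), then renormalize."""
    n_classes = val_probs.shape[1]
    calibrated = np.empty_like(target_probs)
    for k in range(n_classes):
        ir = IsotonicRegression(out_of_bounds="clip")
        ir.fit(val_probs[:, k], (y_val == k).astype(float))
        calibrated[:, k] = ir.predict(target_probs[:, k])
    calibrated = np.clip(calibrated, 1e-12, None)          # avoid all-zero rows
    return calibrated / calibrated.sum(axis=1, keepdims=True)


# --- example wiring (all inputs assumed) --------------------------------
# T = temperature_scale_bs(val_logits, y_val)
# soft_ts = softmax(train_logits / T, axis=1)              # "temp scaling-BS" targets
# soft_ir = isotonic_calibrate(softmax(val_logits, axis=1), y_val,
#                              softmax(train_logits, axis=1))  # "IR" targets
```

A student such as LightGBM or XGBoost would then be fit with these calibrated probabilities as its soft targets; exactly how they enter the student's loss (per-class regression, argmax labels, or sample weights) is a design choice the table itself does not pin down.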