Table 6. Performance evaluation results.

From: K-EmoPhone: A Mobile and Wearable Dataset with In-Situ Emotion, Stress, and Attention Labels

 

| Label | Model | Avg. F1 (SD) | F1_LOW (SD) | F1_HIGH (SD) | Accuracy (SD) |
|---|---|---|---|---|---|
| Valence | Baseline | 0.358 (0.114) | 0.000 (0.000) | 0.715 (0.229) | 0.597 (0.233) |
| Valence | Random Forest (w/o oversampling) | 0.523 (0.098) | 0.358 (0.238) | 0.687 (0.229) | 0.662 (0.115) |
| Valence | Random Forest (w/ oversampling) | 0.539 (0.093) | 0.419 (0.236) | 0.659 (0.238) | 0.661 (0.115) |
| Valence | XGBoost (w/o oversampling) | 0.543 (0.104) | 0.408 (0.239) | 0.677 (0.216) | 0.659 (0.114) |
| Valence | XGBoost (w/ oversampling) | 0.534 (0.097) | 0.428 (0.233) | 0.639 (0.216) | 0.635 (0.109) |
| Arousal | Baseline | 0.364 (0.090) | 0.729 (0.180) | 0.000 (0.000) | 0.600 (0.200) |
| Arousal | Random Forest (w/o oversampling) | 0.499 (0.087) | 0.703 (0.173) | 0.295 (0.181) | 0.626 (0.132) |
| Arousal | Random Forest (w/ oversampling) | 0.534 (0.096) | 0.670 (0.183) | 0.399 (0.181) | 0.623 (0.139) |
| Arousal | XGBoost (w/o oversampling) | 0.532 (0.084) | 0.679 (0.177) | 0.385 (0.209) | 0.634 (0.115) |
| Arousal | XGBoost (w/ oversampling) | 0.529 (0.085) | 0.626 (0.181) | 0.433 (0.187) | 0.600 (0.111) |
| Stress | Baseline | 0.390 (0.064) | 0.779 (0.129) | 0.000 (0.000) | 0.655 (0.168) |
| Stress | Random Forest (w/o oversampling) | 0.469 (0.076) | 0.767 (0.131) | 0.171 (0.172) | 0.666 (0.141) |
| Stress | Random Forest (w/ oversampling) | 0.508 (0.062) | 0.730 (0.142) | 0.285 (0.155) | 0.644 (0.131) |
| Stress | XGBoost (w/o oversampling) | 0.516 (0.058) | 0.734 (0.135) | 0.299 (0.187) | 0.656 (0.111) |
| Stress | XGBoost (w/ oversampling) | 0.517 (0.073) | 0.685 (0.160) | 0.350 (0.173) | 0.620 (0.120) |
| Task disturbance | Baseline | 0.346 (0.136) | 0.692 (0.271) | 0.000 (0.000) | 0.588 (0.294) |
| Task disturbance | Random Forest (w/o oversampling) | 0.517 (0.094) | 0.661 (0.283) | 0.372 (0.327) | 0.722 (0.159) |
| Task disturbance | Random Forest (w/ oversampling) | 0.520 (0.081) | 0.633 (0.316) | 0.407 (0.317) | 0.727 (0.153) |
| Task disturbance | XGBoost (w/o oversampling) | 0.523 (0.076) | 0.626 (0.292) | 0.420 (0.307) | 0.708 (0.151) |
| Task disturbance | XGBoost (w/ oversampling) | 0.525 (0.073) | 0.608 (0.280) | 0.442 (0.300) | 0.695 (0.155) |

Note: F1_LOW and F1_HIGH are the F1-scores obtained when the labels LOW and HIGH, respectively, are treated as the positive class. Avg. F1 is the mean of F1_LOW and F1_HIGH (i.e., the macro-averaged F1-score). The best performance is highlighted in bold.
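
The metric definitions in the note can be illustrated with a minimal Python sketch. The helper names (f1_for_class, macro_f1) and the toy labels are hypothetical and are not taken from the dataset or the authors' code. The sketch also assumes that the Baseline rows correspond to a predictor that always outputs the majority class; the 0.000 entries suggest this, but the table itself does not state it.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1-score when `positive` is regarded as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0 when the class is never predicted or present.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=("LOW", "HIGH")):
    """Unweighted mean of the per-class F1-scores (macro-averaged F1)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels, made up for illustration only.
y_true = ["HIGH", "HIGH", "HIGH", "LOW", "LOW"]
# Assumed baseline: always predict the majority class (HIGH in this toy case).
y_majority = ["HIGH"] * len(y_true)

f1_low = f1_for_class(y_true, y_majority, "LOW")    # minority class is never predicted
f1_high = f1_for_class(y_true, y_majority, "HIGH")
avg_f1 = macro_f1(y_true, y_majority)
accuracy = sum(t == p for t, p in zip(y_true, y_majority)) / len(y_true)

print(f1_low, f1_high, avg_f1, accuracy)
```

Under this assumption, a majority-class predictor can reach reasonable accuracy while scoring an F1 of zero on the minority class, which is the pattern in the Baseline rows and the reason the macro-averaged F1 is the more informative column when classes are imbalanced.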