Table 2. Results using the original data with short sentences removed.

From: Multimodal deep learning for dementia classification using text and audio

| Modality (shorts-removed) | Accuracy | Precision | Recall | F1-score | AUROC |
|---|---|---|---|---|---|
| Audio | 0.5958 ± 0.014 | 0.587 ± 0.046 | 0.2945 ± 0.108 | 0.3811 ± 0.084 | 0.624 ± 0.022 |
| Text | **0.6861 ± 0.027** | **0.623 ± 0.042** | 0.6951 ± 0.058 | **0.6545 ± 0.022** | **0.7593 ± 0.024** |
| Audio \(+\) Time | 0.5897 ± 0.009 | 0.5631 ± 0.034 | 0.3554 ± 0.07 | 0.4306 ± 0.054 | 0.6163 ± 0.01 |
| Text \(+\) Time | 0.6683 ± 0.029 | 0.6197 ± 0.043 | **0.7011 ± 0.137** | 0.6494 ± 0.032 | 0.7353 ± 0.047 |
| Audio \(+\) Text | 0.6052 ± 0.014 | 0.5669 ± 0.024 | 0.5255 ± 0.152 | 0.534 ± 0.082 | 0.6482 ± 0.014 |
| Audio \(+\) Text \(+\) Time | 0.6257 ± 0.065 | 0.5724 ± 0.075 | 0.5239 ± 0.151 | 0.5415 ± 0.107 | 0.6769 ± 0.081 |
  1. Results of the best-performing modality for each metric are in bold.
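
The selection described in the footnote can be reproduced from the reported values. The following is a minimal sketch, assuming that "best-performing" means the highest mean for each metric and that the spread after each ± sign is ignored; the names `MEANS`, `METRICS`, and `best_per_metric` are illustrative and do not come from the paper.

```python
# Mean values transcribed from Table 2 (the ± spreads are omitted).
# Column order follows the table header.
METRICS = ["Accuracy", "Precision", "Recall", "F1-score", "AUROC"]

MEANS = {
    "Audio": [0.5958, 0.587, 0.2945, 0.3811, 0.624],
    "Text": [0.6861, 0.623, 0.6951, 0.6545, 0.7593],
    "Audio + Time": [0.5897, 0.5631, 0.3554, 0.4306, 0.6163],
    "Text + Time": [0.6683, 0.6197, 0.7011, 0.6494, 0.7353],
    "Audio + Text": [0.6052, 0.5669, 0.5255, 0.534, 0.6482],
    "Audio + Text + Time": [0.6257, 0.5724, 0.5239, 0.5415, 0.6769],
}


def best_per_metric(means, metrics):
    """Return {metric: (modality, mean)} for the highest mean in each column.

    Assumption: higher is better for every metric, and ties (none occur
    in this table) resolve to the first modality listed.
    """
    best = {}
    for col, metric in enumerate(metrics):
        modality = max(means, key=lambda name: means[name][col])
        best[metric] = (modality, means[modality][col])
    return best


if __name__ == "__main__":
    for metric, (modality, value) in best_per_metric(MEANS, METRICS).items():
        print(f"{metric}: {modality} ({value})")
```

Under these assumptions, the text-only model has the highest mean on every metric except recall, where Text \(+\) Time is marginally higher (0.7011 versus 0.6951) but has a much larger spread (± 0.137 versus ± 0.058).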