Table 4 Results using the augmented versions of the data with short sentences removed.

From: Multimodal deep learning for dementia classification using text and audio

| Shorts-augmented | Accuracy | Precision | Recall | F1-score | AUROC |
| --- | --- | --- | --- | --- | --- |
| Audio | 0.5954 ± 0.001 | 0.5989 ± 0.025 | 0.1939 ± 0.029 | 0.2914 ± 0.03 | 0.6258 ± 0.008 |
| Text | 0.841 ± 0.01 | 0.8212 ± 0.014 | **0.8089 ± 0.024** | 0.8148 ± 0.014 | 0.9276 ± 0.009 |
| Audio + Time | 0.6254 ± 0.005 | 0.5894 ± 0.023 | 0.4298 ± 0.05 | 0.4951 ± 0.032 | 0.6692 ± 0.008 |
| Text + Time | **0.8478 ± 0.003** | **0.8375 ± 0.02** | 0.8039 ± 0.207 | **0.8199 ± 0.007** | **0.9345 ± 0.005** |
| Audio + Text | 0.835 ± 0.02 | 0.8154 ± 0.036 | 0.7982 ± 0.039 | 0.80591 ± 0.023 | 0.9216 ± 0.02 |
| Audio + Text + Time | 0.8451 ± 0.003 | 0.83646 ± 0.027 | 0.7992 ± 0.028 | 0.8166 ± 0.04 | 0.931 ± 0.005 |

  1. Results of the best-performing modality for each metric are in bold.