Table 3. Results using the augmented versions of the original data.

From: Multimodal deep learning for dementia classification using text and audio

| Original-augmented | Accuracy | Precision | Recall | F1-score | AUROC |
| --- | --- | --- | --- | --- | --- |
| Audio | 0.6038 ± 0.036 | 0.5846 ± 0.021 | 0.1831 ± 0.04 | 0.2764 ± 0.044 | 0.6336 ± 0.003 |
| Text | 0.8294 ± 0.005 | **0.8339 ± 0.022** | 0.751 ± 0.042 | 0.7892 ± 0.013 | 0.9208 ± 0.002 |
| Audio + Time | 0.6306 ± 0.068 | 0.5994 ± 0.012 | 0.3673 ± 0.044 | 0.4542 ± 0.033 | 0.6759 ± 0.008 |
| Text + Time | **0.8344 ± 0.006** | 0.8267 ± 0.035 | 0.7721 ± 0.043 | **0.797 ± 0.009** | **0.9236 ± 0.005** |
| Audio + Text | 0.825 ± 0.013 | 0.7978 ± 0.039 | **0.7859 ± 0.056** | 0.7899 ± 0.019 | 0.9124 ± 0.015 |
| Audio + Text + Time | 0.8315 ± 0.014 | 0.8212 ± 0.034 | 0.767 ± 0.012 | 0.7927 ± 0.014 | 0.9177 ± 0.014 |

  1. Results of the best-performing modality for each metric are in bold.
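
The "best-performing modality for each metric" can be recovered directly from the reported means. The following is a minimal sketch, not code from the paper: the names `METRICS`, `MEANS`, and `best_per_metric` are illustrative assumptions, and the values are the means transcribed from Table 3 with the standard deviations left out.

```python
# Mean scores transcribed from Table 3 (standard deviations omitted).
# All identifiers here are illustrative; none come from the paper.
METRICS = ["Accuracy", "Precision", "Recall", "F1-score", "AUROC"]

MEANS = {
    "Audio": [0.6038, 0.5846, 0.1831, 0.2764, 0.6336],
    "Text": [0.8294, 0.8339, 0.751, 0.7892, 0.9208],
    "Audio + Time": [0.6306, 0.5994, 0.3673, 0.4542, 0.6759],
    "Text + Time": [0.8344, 0.8267, 0.7721, 0.797, 0.9236],
    "Audio + Text": [0.825, 0.7978, 0.7859, 0.7899, 0.9124],
    "Audio + Text + Time": [0.8315, 0.8212, 0.767, 0.7927, 0.9177],
}


def best_per_metric(means, metrics):
    """Return {metric: (modality, mean)} for the highest mean in each column."""
    best = {}
    for i, metric in enumerate(metrics):
        # Pick the modality whose i-th score is the largest.
        modality = max(means, key=lambda name: means[name][i])
        best[metric] = (modality, means[modality][i])
    return best


BEST = best_per_metric(MEANS, METRICS)
for metric, (modality, score) in BEST.items():
    print(f"{metric}: {modality} ({score})")
```

This comparison ranks by mean only. Several of the gaps between the text-based configurations are smaller than the reported standard deviations, so the ranking alone should not be read as a statistically significant difference.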