Table 3 Model performance comparison

From: Smartphone video-based early diagnosis of blepharospasm using dual cross-attention modeling enhanced by facial pose estimation

| Method | ViViT | VideoMAE | SA-RNViT | CA-RNViT | DCA-RNViT | SA-ViViT | CA-ViViT | DCA-ViViT |
|---|---|---|---|---|---|---|---|---|
| Diagnosis | | | | | | | | |
| Accuracy, Mean (SD) | 0.488 (0.011) | 0.485 (0.005) | 0.591 (0.025) | 0.613 (0.015) | 0.62 (0.008) | 0.666 (0.039) | 0.655 (0.048) | 0.674 (0.039) |
| Precision, Mean (SD) | 0.29 (0.003) | 0.29 (0.001) | 0.563 (0.05) | 0.591 (0.016) | 0.614 (0.06) | 0.686 (0.078) | 0.66 (0.066) | 0.696 (0.053) |
| Recall, Mean (SD) | 0.377 (0.009) | 0.374 (0.004) | 0.624 (0.087) | 0.626 (0.069) | 0.597 (0.032) | 0.636 (0.042) | 0.62 (0.047) | 0.641 (0.041) |
| F1 score, Mean (SD) | 0.328 (0.005) | 0.326 (0.002) | 0.575 (0.053) | 0.583 (0.014) | 0.598 (0.039) | 0.628 (0.048) | 0.614 (0.046) | 0.636 (0.023) |
| AUC, Mean (SD) | 0.499 (0.008) | 0.468 (0.01) | 0.759 (0.039) | 0.788 (0.026) | 0.758 (0.026) | 0.78 (0.042) | 0.786 (0.03) | 0.779 (0.035) |
| Severity | | | | | | | | |
| Accuracy, Mean (SD) | 0.733 (0.031) | 0.753 (0.016) | 0.767 (0.022) | 0.779 (0.023) | 0.8 (0.021) | 0.818 (0.019) | 0.806 (0.023) | 0.828 (0.008) |
| Precision, Mean (SD) | 0.746 (0.041) | 0.8 (0.043) | 0.838 (0.027) | 0.849 (0.035) | 0.825 (0.015) | 0.833 (0.029) | 0.825 (0.021) | 0.838 (0.019) |
| Recall, Mean (SD) | 0.932 (0.005) | 0.86 (0.015) | 0.858 (0.028) | 0.867 (0.073) | 0.928 (0.036) | 0.948 (0.021) | 0.939 (0.02) | 0.956 (0.027) |
| F1 score, Mean (SD) | 0.829 (0.026) | 0.828 (0.017) | 0.848 (0.018) | 0.855 (0.021) | 0.873 (0.017) | 0.886 (0.012) | 0.878 (0.017) | 0.892 (0.008) |
| AUC, Mean (SD) | 0.57 (0.042) | 0.528 (0.016) | 0.739 (0.028) | 0.792 (0.043) | 0.739 (0.036) | 0.785 (0.025) | 0.789 (0.024) | 0.785 (0.015) |
| Frequency | | | | | | | | |
| Accuracy, Mean (SD) | 0.692 (0.035) | 0.68 (0.022) | 0.765 (0.024) | 0.771 (0.019) | 0.808 (0.034) | 0.769 (0.026) | 0.801 (0.035) | 0.82 (0.058) |
| Precision, Mean (SD) | 0.697 (0.049) | 0.73 (0.041) | 0.818 (0.029) | 0.851 (0.019) | 0.823 (0.048) | 0.781 (0.025) | 0.824 (0.028) | 0.827 (0.052) |
| Recall, Mean (SD) | 0.891 (0.025) | 0.788 (0.012) | 0.858 (0.018) | 0.816 (0.046) | 0.923 (0.027) | 0.922 (0.035) | 0.904 (0.036) | 0.934 (0.028) |
| F1 score, Mean (SD) | 0.791 (0.029) | 0.758 (0.026) | 0.837 (0.01) | 0.832 (0.019) | 0.868 (0.022) | 0.845 (0.022) | 0.862 (0.027) | 0.877 (0.041) |
| AUC, Mean (SD) | 0.645 (0.032) | 0.51 (0.013) | 0.775 (0.04) | 0.789 (0.039) | 0.77 (0.057) | 0.738 (0.04) | 0.786 (0.045) | 0.782 (0.055) |

AUC: area under the curve; ViViT: video vision transformer; VideoMAE: video masked autoencoder; SA: self-attention; CA: cross-attention; DCA: dual cross-attention; RNViT: ResNet video transformer.
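
For readers less familiar with the metrics reported here, the following is a minimal sketch of how accuracy, precision, recall, and F1 score could be computed for one run and then summarized as "Mean (SD)" across runs. It assumes a binary task and uses the sample standard deviation; the table does not state how many runs or folds were used, how multi-class results were averaged, or which SD estimator was applied, so those choices are assumptions. The function names `binary_metrics` and `mean_sd` and the confusion counts are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mean_sd(values, digits=3):
    """Format a list of per-run values as 'mean (SD)' using the sample SD."""
    return f"{round(mean(values), digits)} ({round(stdev(values), digits)})"


# Hypothetical confusion counts (tp, fp, fn, tn) for three runs; NOT from the paper.
runs = [(40, 10, 5, 45), (38, 12, 7, 43), (42, 8, 6, 44)]
per_run = [binary_metrics(*counts) for counts in runs]

for name in ("accuracy", "precision", "recall", "f1"):
    print(name, mean_sd([m[name] for m in per_run]))
```

If F1 is computed per run and then averaged, as in this sketch, the mean F1 need not equal the harmonic mean of the mean precision and mean recall, so the F1 rows cannot be derived exactly from the precision and recall rows.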