Table 3 Performance comparison of unimodal models and LLMMs in new-onset T2DM prediction using different combinations of modalities A, B, and C.

From: Large language multimodal models for new-onset type 2 diabetes prediction using five-year cohort electronic health records

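| Architecture | Modality | Model | Accuracy | Recall | Precision | F1-score |
|---|---|---|---|---|---|---|
| Machine Learning Classifier | C | Logistic Regression | 0.79 | 0.79 | 0.79 | 0.73 |
| Machine Learning Classifier | C | KNN | 0.83 | 0.83 | 0.81 | 0.81 |
| Machine Learning Classifier | C | K-Means | 0.83 | 0.83 | 0.82 | 0.81 |
| Machine Learning Classifier | C | SVM | 0.78 | 0.78 | 0.80 | 0.71 |
| Machine Learning Classifier | C | Random Forest | 0.86 | 0.86 | 0.85 | 0.85 |
| Machine Learning Classifier | C | XGBoost | 0.86 | 0.86 | 0.85 | 0.85 |
| Machine Learning Classifier | C | CatBoost | 0.86 | 0.86 | 0.85 | 0.85 |
| Machine Learning Classifier | C | DNN (Three-layer) | 0.85 | 0.53 | 0.74 | 0.61 |
| Unimodal | A | BiomedBERT | 0.65 | 0.65 | 0.66 | 0.65 |
| Unimodal | A | ClinicalBERT | 0.61 | 0.61 | 0.61 | 0.61 |
| Unimodal | A | SciFive | 0.66 | 0.66 | 0.66 | 0.66 |
| Unimodal | A | RoBERTa | 0.65 | 0.65 | 0.65 | 0.65 |
| Unimodal | A | Flan-T5-base-220M | 0.62 | 0.62 | 0.67 | 0.62 |
| Unimodal | A | Flan-T5-large-770M | 0.79 | 0.79 | 0.78 | 0.79 |
| Unimodal | A | BERT | 0.66 | 0.66 | 0.66 | 0.66 |
| Unimodal | A | GPT-2 | 0.78 | 0.78 | 0.77 | 0.77 |
| Unimodal | A+B | BiomedBERT | 0.82 | 0.82 | 0.82 | 0.82 |
| Unimodal | A+B | ClinicalBERT | 0.78 | 0.78 | 0.79 | 0.77 |
| Unimodal | A+B | SciFive | 0.76 | 0.76 | 0.76 | 0.76 |
| Unimodal | A+B | RoBERTa | 0.83 | 0.83 | 0.83 | 0.83 |
| Unimodal | A+B | Flan-T5-base-220M | 0.81 | 0.81 | 0.82 | 0.81 |
| Unimodal | A+B | Flan-T5-large-770M | 0.93 | 0.93 | 0.93 | 0.93 |
| Unimodal | A+B | BERT | 0.79 | 0.79 | 0.79 | 0.79 |
| Unimodal | A+B | GPT-2 | 0.93 | 0.93 | 0.93 | 0.93 |
| LLMMs | A+C | BiomedBERT | 0.81 | 0.81 | 0.80 | 0.80 |
| LLMMs | A+C | ClinicalBERT | 0.81 | 0.81 | 0.80 | 0.80 |
| LLMMs | A+C | SciFive | 0.81 | 0.81 | 0.80 | 0.80 |
| LLMMs | A+C | RoBERTa | 0.80 | 0.80 | 0.79 | 0.80 |
| LLMMs | A+C | Flan-T5-base-220M | 0.80 | 0.80 | 0.80 | 0.80 |
| LLMMs | A+C | Flan-T5-large-770M | 0.81 | 0.81 | 0.81 | 0.81 |
| LLMMs | A+C | BERT | 0.82 | 0.82 | 0.80 | 0.81 |
| LLMMs | A+C | GPT-2 | 0.81 | 0.81 | 0.79 | 0.81 |
| LLMMs | (A+B) + C | BiomedBERT | 0.93 | 0.93 | 0.93 | 0.93 |
| LLMMs | (A+B) + C | ClinicalBERT | 0.90 | 0.90 | 0.90 | 0.91 |
| LLMMs | (A+B) + C | SciFive | 0.93 | 0.93 | 0.93 | 0.93 |
| LLMMs | (A+B) + C | RoBERTa | 0.92 | 0.92 | 0.92 | 0.92 |
| LLMMs | (A+B) + C | Flan-T5-base-220M | 0.93 | 0.93 | 0.93 | 0.93 |
| LLMMs | (A+B) + C | Flan-T5-large-770M | 0.92 | 0.92 | 0.93 | 0.92 |
| LLMMs | (A+B) + C | BERT | 0.93 | 0.93 | 0.93 | 0.93 |
| LLMMs | (A+B) + C | GPT-2 | 0.92 | 0.92 | 0.92 | 0.92 |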
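
As a reading aid for the four score columns, the sketch below computes accuracy together with support-weighted recall, precision, and F1-score from a list of true and predicted labels. The table does not state which averaging scheme was used, so the weighted averaging here is an assumption. It is suggested by the fact that Recall equals Accuracy in every row except DNN (Three-layer), and support-weighted recall is always equal to accuracy. The function name `weighted_metrics` and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted recall, precision, and F1.

    Assumption: per-class scores are averaged with weights proportional to
    the number of true samples in each class. The source table does not
    confirm this choice.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    recall = precision = f1 = 0.0
    for label, count in support.items():
        # True positives and total predictions for this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)

        class_recall = tp / count
        class_precision = tp / predicted if predicted else 0.0
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0

        # Weight each per-class score by the class's share of true samples.
        weight = count / n
        recall += weight * class_recall
        precision += weight * class_precision
        f1 += weight * class_f1

    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical labels (1 = new-onset T2DM, 0 = no T2DM); not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
scores = weighted_metrics(y_true, y_pred)
print({name: round(value, 2) for name, value in scores.items()})
```

Under this weighting, recall and accuracy coincide for any input, so a row where the two differ (such as the three-layer DNN) was presumably scored with a different scheme, for example recall on the positive class only.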