Table 5. Metrics for binary prediction of co-citation between two input abstracts via cosine similarity on the skin cancer evaluation set, sorted by \(F1_{\max}\). Threshold is the optimal decision cutoff on the cosine similarities of that dataset. Models trained in this work are highlighted in bold.

From: Contrastive learning and mixture of experts enables precise vector embeddings in biological databases

| Model | F1 | Precision | Recall | Threshold | Ratio | ROC-AUC |
|---|---|---|---|---|---|---|
| SE\(_{\mathbf{cancer}}\) | 0.8509 | 0.8308 | 0.8720 | 0.6680 | 1.4538 | 0.9203 |
| MoE\(_{\mathbf{all}}\) | 0.7687 | 0.7130 | 0.8339 | 0.8505 | 1.1447 | 0.8226 |
| SE\(_{\mathbf{all}}\) | 0.7301 | 0.6070 | 0.9158 | 0.7867 | 1.1066 | 0.7724 |
| Llama-3.2-1B | 0.6856 | 0.5891 | 0.8201 | 0.8249 | 1.0442 | 0.6863 |
| SciBERT | 0.6793 | 0.5289 | 0.9493 | 0.8387 | 1.0176 | 0.6392 |
| E5\(_{large}\) | 0.6769 | 0.5455 | 0.8916 | 0.7891 | 1.0272 | 0.6890 |
| E5\(_{base}\) | 0.6759 | 0.5404 | 0.9020 | 0.7970 | 1.0270 | 0.6838 |
| SE\(_{\mathbf{copd}}\) | 0.6756 | 0.5291 | 0.9343 | 0.6925 | 1.0549 | 0.6214 |
| TF-IDF | 0.6747 | 0.5284 | 0.9331 | 0.0437 | 1.5295 | 0.6752 |
| RoBERTa\(_{large}\) | 0.6747 | 0.5165 | 0.9723 | 0.9938 | 1.0006 | 0.6518 |
| ModernBERT\(_{large}\) | 0.6735 | 0.5130 | 0.9804 | 0.9086 | 1.0076 | 0.6162 |
| BioBERT | 0.6733 | 0.5245 | 0.9400 | 0.9268 | 1.0077 | 0.6399 |
| BERT\(_{large}\) | 0.6723 | 0.5291 | 0.9216 | 0.8729 | 1.0155 | 0.6300 |
| PubmedBERT | 0.6716 | 0.5189 | 0.9516 | 0.9799 | 1.0026 | 0.6289 |
| MPNet | 0.6711 | 0.5086 | 0.9862 | 0.2200 | 1.1740 | 0.6545 |
| SE\(_{\mathbf{cvd}}\) | 0.6701 | 0.5129 | 0.9666 | 0.6468 | 1.0530 | 0.6317 |
| BERT\(_{base}\) | 0.6693 | 0.5122 | 0.9654 | 0.8139 | 1.0190 | 0.6223 |
| Mini | 0.6680 | 0.5147 | 0.9516 | 0.2560 | 1.1917 | 0.6682 |
| RoBERTa\(_{base}\) | 0.6680 | 0.5032 | 0.9931 | 0.9759 | 1.0012 | 0.5989 |
| SE\(_{\mathbf{cancer}}\) | 0.6677 | 0.5014 | 0.9988 | 0.4490 | 1.0502 | 0.6236 |
| ModernBERT\(_{base}\) | 0.6677 | 0.5143 | 0.9516 | 0.9273 | 1.0062 | 0.6150 |
| SE\(_{\mathbf{autoimmune}}\) | 0.6669 | 0.5003 | 1.0000 | 0.4453 | 1.0659 | 0.6545 |
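
The caption describes scoring each pair of abstracts by cosine similarity and reporting the metrics at the cutoff that maximizes F1. A minimal sketch of how such a cutoff might be found is given here, assuming embeddings are available as NumPy arrays and that every observed similarity is tried as a candidate threshold. The function names `cosine_similarity` and `best_f1_threshold` are hypothetical and this is not the authors' evaluation code.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (n, d) arrays of embeddings."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dot = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return dot / norms


def best_f1_threshold(similarities, labels):
    """Try each observed similarity as a cutoff (assumed search strategy).

    A pair is predicted co-cited when its similarity is >= the cutoff.
    Returns (f1, precision, recall, threshold) for the cutoff with the
    highest F1; the threshold is None if no cutoff yields a true positive.
    """
    sims = np.asarray(similarities, dtype=float)
    y = np.asarray(labels, dtype=bool)
    best = (0.0, 0.0, 0.0, None)
    for t in np.unique(sims):
        pred = sims >= t
        tp = np.sum(pred & y)
        fp = np.sum(pred & ~y)
        fn = np.sum(~pred & y)
        if tp == 0:
            continue  # precision and F1 are undefined or zero here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best[0]:
            best = (float(f1), float(precision), float(recall), float(t))
    return best


if __name__ == "__main__":
    # Toy data: two co-cited pairs (label 1) and two unrelated pairs (label 0).
    sims = [0.9, 0.8, 0.4, 0.3]
    labels = [1, 1, 0, 0]
    print(best_f1_threshold(sims, labels))
```

On the toy data the two positive pairs are cleanly separated, so the sweep settles on the lowest similarity among the positives as the cutoff. With real embeddings the classes overlap, and the chosen cutoff trades precision against recall, as the Precision and Recall columns show.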