Table 2 Summary of the models used in the study.

From: Contrastive learning and mixture of experts enables precise vector embeddings in biological databases

| Model | Parameter count (millions) | Huggingface path |
|---|---|---|
| Llama-3.2 | 1236 | meta-llama/Llama-3.2-1B |
| ModernBERT\(_{large}\) | 395 | answerdotai/ModernBERT-large |
| BERT\(_{large}\) | 336 | google-bert/bert-large-uncased |
| E5\(_{large}\) | 335 | intfloat/e5-large-v2 |
| RoBERTa\(_{large}\) | 335 | FacebookAI/roberta-large |
| MoE\(_{all}\) (ours) | 150 active, 384 total | GleghornLab/MoE-all-sentence |
| ModernBERT\(_{base}\) | 149 | answerdotai/ModernBERT-base |
| RoBERTa\(_{base}\) | 125 | FacebookAI/roberta-base |
| SciBERT | 110 | allenai/scibert_scivocab_uncased |
| BERT\(_{base}\) | 110 | google-bert/bert-base-uncased |
| E5\(_{base}\) | 109 | intfloat/e5-base-v2 |
| PubMedBERT | 109 | microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext |
| MPNet | 109 | sentence-transformers/all-mpnet-base-v2 |
| BioBERT | 108 | dmis-lab/biobert-v1.1 |
| Mini | 23 | sentence-transformers/all-MiniLM-L6-v2 |
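Any checkpoint in the table can be loaded from its Huggingface path to produce sentence embeddings. The sketch below is a minimal illustration, not the paper's exact pipeline: it assumes masked mean pooling over the last hidden state (a common default for sentence embeddings), and uses the smallest model from the table (`sentence-transformers/all-MiniLM-L6-v2`) only as an example default.

```python
"""Minimal sketch: sentence embeddings from a Table 2 checkpoint.

Assumptions: masked mean pooling over the encoder's last hidden
state; the default model name is illustrative, any Huggingface path
from the table can be substituted.
"""
import torch
from transformers import AutoModel, AutoTokenizer


def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over the sequence, ignoring padding.

    last_hidden:    (batch, seq_len, hidden)
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    returns:        (batch, hidden)
    """
    mask = attention_mask.unsqueeze(-1).float()   # (batch, seq_len, 1)
    summed = (last_hidden * mask).sum(dim=1)      # sum of unmasked token vectors
    counts = mask.sum(dim=1).clamp(min=1e-9)      # number of real tokens per example
    return summed / counts


def embed(texts, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Tokenize, encode, and mean-pool a batch of sentences."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return mean_pool(out.last_hidden_state, batch["attention_mask"])
```

For example, `embed(["ATP synthase couples proton flow to ATP synthesis."])` returns a `(1, hidden)` tensor whose width depends on the chosen checkpoint (384 for MiniLM-L6, 768 for the base-size BERT-family models, 1024 for the large ones).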