Table 1 Comparison of GatorTron with existing biomedical and clinical transformer models for clinical concept extraction and medical relation extraction.

From: A large language model for electronic health records

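| Transformer | Clinical concept extraction, 2010 i2b2 [39]: Precision | Clinical concept extraction, 2010 i2b2 [39]: Recall | Clinical concept extraction, 2010 i2b2 [39]: F1 score | Clinical concept extraction, 2012 i2b2 [40]: Precision | Clinical concept extraction, 2012 i2b2 [40]: Recall | Clinical concept extraction, 2012 i2b2 [40]: F1 score | Clinical concept extraction, 2018 n2c2 [41]: Precision | Clinical concept extraction, 2018 n2c2 [41]: Recall | Clinical concept extraction, 2018 n2c2 [41]: F1 score | Medical relation extraction, 2018 n2c2 [41]: Precision | Medical relation extraction, 2018 n2c2 [41]: Recall | Medical relation extraction, 2018 n2c2 [41]: F1 score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BioBERT | 0.8693 | 0.8653 | 0.8673 | 0.7478 | 0.8037 | 0.7747 | 0.8634 | 0.8921 | 0.8775 | 0.9663 | 0.9451 | 0.9555 |
| ClinicalBERT | NA | NA | 0.8780 | NA | NA | 0.7890 | 0.8592 | 0.8832 | 0.8710 | 0.9678 | 0.9414 | 0.9544 |
| BioMegatron | 0.8614 | 0.8761 | 0.8687 | 0.7591 | 0.8031 | 0.7805 | 0.8707 | 0.8915 | 0.8810 | 0.9711 | 0.9434 | 0.9571 |
| GatorTron-base (1/4 data) | 0.8682 | 0.9046 | 0.8860 | 0.7514 | 0.8013 | 0.7755 | 0.8772 | 0.8992 | 0.8881 | 0.9724 | 0.9457 | 0.9589 |
| GatorTron-base | 0.8748 | 0.9043 | 0.8893 | 0.7644 | 0.8221 | 0.7922 | 0.8759 | 0.9038 | 0.8896 | 0.9719 | 0.9482 | 0.9599 |
| GatorTron-medium | 0.8869 | 0.9122 | 0.8994 | 0.7812 | 0.8245 | 0.8022 | 0.8954 | 0.9035 | 0.8994 | 0.9721 | 0.9503 | 0.9611 |
| GatorTron-large | 0.8880 | 0.9116 | **0.8996** | 0.7862 | 0.8333 | **0.8091** | 0.8979 | 0.9021 | **0.9000** | 0.9776 | 0.9482 | **0.9627** |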

  1. Clinical concepts in the 2010 i2b2 and 2012 i2b2 challenges: problems, treatments, and lab tests. Clinical concepts in the 2018 n2c2 challenge: drugs, adverse events, and drug-related attributes (e.g., dose). Medical relations in the 2018 n2c2 challenge: drug-induced adverse events. Best F1 scores are presented in bold. NA: scores not reported.
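
The table reports precision, recall, and F1 score without stating how F1 is derived. The sketch below assumes the standard definition, F1 as the harmonic mean of precision and recall, and checks it against three rows copied from Table 1. The function name `f1_score` is a hypothetical helper introduced here for illustration; the challenges' official evaluation scripts (matching criteria, averaging scheme) are not described in this table and may differ in detail.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) values copied from Table 1.
rows = {
    "BioBERT, 2010 i2b2 concepts": (0.8693, 0.8653, 0.8673),
    "GatorTron-large, 2010 i2b2 concepts": (0.8880, 0.9116, 0.8996),
    "GatorTron-large, 2018 n2c2 relations": (0.9776, 0.9482, 0.9627),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # Compare the recomputed value with the one reported in the table.
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

Under this assumption, the recomputed values agree with the reported F1 scores to within rounding for these rows. The same check cannot be run for the ClinicalBERT entries marked NA, since only their F1 scores are reported.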