Table 3 Comparison of GatorTronS with existing transformer-based LLMs for clinical concept extraction and medical relation extraction.
Clinical concept extraction is evaluated on 2010 i2b2^20, 2012 i2b2^21, and 2018 n2c2^22 (concept columns); medical relation extraction is evaluated on 2018 n2c2^22 (relation columns).

Transformer | Precision (2010 i2b2) | Recall (2010 i2b2) | F1 score (2010 i2b2) | Precision (2012 i2b2) | Recall (2012 i2b2) | F1 score (2012 i2b2) | Precision (2018 n2c2, concept) | Recall (2018 n2c2, concept) | F1 score (2018 n2c2, concept) | Precision (2018 n2c2, relation) | Recall (2018 n2c2, relation) | F1 score (2018 n2c2, relation)
---|---|---|---|---|---|---|---|---|---|---|---|---
ClinicalBERT | NA | NA | 0.878 | NA | NA | 0.789 | 0.859 | 0.883 | 0.871 | 0.968 | 0.941 | 0.954 |
GatorTron, 90B | 0.875 | 0.904 | 0.889 | 0.764 | 0.822 | 0.792 | 0.876 | 0.904 | 0.890 | 0.972 | 0.948 | 0.960 |
GatorTronS, 1B | 0.874 | 0.907 | 0.890 | 0.753 | 0.812 | 0.781 | 0.871 | 0.892 | 0.882 | 0.971 | 0.945 | 0.958 |
GatorTronS, 5B | 0.879 | 0.909 | 0.894 | 0.777 | 0.823 | 0.799 | 0.899 | 0.903 | 0.901 | 0.974 | 0.949 | 0.962 |
GatorTronS, 10B | 0.882 | 0.911 | 0.896 | 0.765 | 0.823 | 0.793 | 0.887 | 0.904 | 0.895 | 0.974 | 0.950 | 0.962 |
GatorTronS, 20B | 0.889 | 0.911 | 0.899 | 0.784 | 0.836 | 0.809 | 0.892 | 0.907 | 0.900 | 0.975 | 0.947 | 0.961 |
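The F1 score reported in each block is the harmonic mean of the corresponding precision and recall. As a minimal sketch (not part of the original study; the function name `f1_score` is illustrative), the snippet below recomputes F1 from the rounded precision and recall of the GatorTron (90B) row on 2010 i2b2 and recovers the tabulated value.

```python
# Minimal sketch: recompute F1 as the harmonic mean of precision and recall,
# using the GatorTron (90B) row on 2010 i2b2 from Table 3 as an example.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values taken from Table 3 (GatorTron, 90B; 2010 i2b2).
p, r = 0.875, 0.904
print(round(f1_score(p, r), 3))  # 0.889, matching the reported F1 score
```

Small discrepancies in the last digit for other rows are expected, since the published F1 scores were computed from unrounded precision and recall.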