Table 2 Hyperparameters for BERT-based assertion detection model training

From: Extracting post-acute sequelae of SARS-CoV-2 infection symptoms from clinical notes via hybrid natural language processing

Model: BioBERT

The following hyperparameters were used during training, as reported on the model's HuggingFace page (a configuration sketch follows the list):

 • learning_rate: 2e-05

 • train_batch_size: 8

 • eval_batch_size: 8

 • seed: 42

 • optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08

 • lr_scheduler_type: linear

 • num_epochs: 10
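For concreteness, the following is a minimal sketch (not the authors' released code) showing how the BioBERT settings above map onto HuggingFace's TrainingArguments; the output directory is a placeholder, and the task head and dataset are not shown:

```python
from transformers import TrainingArguments

# Hypothetical configuration reproducing the BioBERT hyperparameters above.
args = TrainingArguments(
    output_dir="biobert-assertion",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # epsilon=1e-08
)
```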

Model: ClinicalBERT

The following hyperparameters were used during training, as reported on the model's HuggingFace page (a configuration sketch follows the list):

 • Batch size: 32

 • Maximum sequence length: 256

 • Learning rate: 5e-5
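A minimal sketch under the same assumptions; the checkpoint name and the `text` field of the dataset are illustrative choices, not details taken from the paper:

```python
from transformers import AutoTokenizer, TrainingArguments

# Illustrative checkpoint; the paper does not specify the exact model ID.
tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")

def tokenize(batch):
    # Enforce the maximum sequence length of 256 by truncating/padding each note.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

args = TrainingArguments(
    output_dir="clinicalbert-assertion",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
)
```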

Model: BiomedBERT

The following hyperparameters were used during training, as reported in the paper (a sketch of the schedule follows the list):

 • Optimizer: Adam

 • Learning rate schedule: slanted triangular, with warm-up over the first 10% of steps and cool-down over the remaining 90%

 • Peak learning rate: 6 × 10⁻⁴

 • Training steps: 62,500

 • Batch size: 8,192

 • Whole-word masking (WWM) rate: 15%
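The slanted triangular schedule (linear warm-up to the peak, then linear decay to zero) corresponds to HuggingFace's linear schedule with warm-up. Below is a minimal sketch under that assumption, using a stand-in parameter list instead of a real model and an illustrative checkpoint name; in practice the 8,192 batch size would be reached across many devices with gradient accumulation:

```python
import torch
from transformers import (
    AutoTokenizer,
    DataCollatorForWholeWordMask,
    get_linear_schedule_with_warmup,
)

total_steps = 62_500
warmup_steps = int(0.10 * total_steps)            # warm-up in 10% of steps

params = [torch.nn.Parameter(torch.zeros(1))]     # stand-in for model parameters
optimizer = torch.optim.Adam(params, lr=6e-4)     # peak learning rate 6e-4
# Linear warm-up to the peak, then linear cool-down over the remaining 90% of steps.
scheduler = get_linear_schedule_with_warmup(optimizer, warmup_steps, total_steps)

# Whole-word masking at a 15% rate; the checkpoint name is illustrative.
tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract"
)
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)
```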