Table 3 Suggested hyperparameters of the pre-trained models.

From: Information extraction from green channel textual records on expressways using hybrid deep learning

| Parameter | Value |
| --- | --- |
| Vector dimension of BERT | 768 |
| Maximum sentence length | 172 |
| Hidden layer dimension of GRU or LSTM | 768 |
| Optimizer | Adam |
| Learning rate | 5 × 10⁻⁵ |
| Batch size | 16 |
| Maximum iteration rounds | 3 |
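
As a minimal sketch, the values in Table 3 could be collected in a single configuration object so that a training script reads them from one place. The class name `Table3Config`, its field names, and the helper `total_optimizer_steps` are illustrative assumptions and do not come from the article. The sketch also assumes that one "iteration round" is a full pass over the training data (an epoch), and the dataset size used below is made up purely to exercise the helper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Table3Config:
    """Values copied from Table 3; the field names are hypothetical."""
    bert_vector_dim: int = 768
    max_sentence_length: int = 172
    rnn_hidden_dim: int = 768       # hidden layer dimension of GRU or LSTM
    optimizer: str = "Adam"
    learning_rate: float = 5e-5
    batch_size: int = 16
    max_epochs: int = 3             # "maximum iteration rounds", read as epochs (assumption)


def total_optimizer_steps(num_examples: int, cfg: Table3Config) -> int:
    """Count parameter updates, assuming each round is one full pass over the data."""
    steps_per_epoch = math.ceil(num_examples / cfg.batch_size)
    return steps_per_epoch * cfg.max_epochs


cfg = Table3Config()

# 1000 is a hypothetical dataset size, not a figure from the article.
print(total_optimizer_steps(1000, cfg))
```

Freezing the dataclass keeps the settings from being changed by accident during a run; a different experiment would create a new instance with the fields it wants to override.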