Table 2 Fine-tuning parameters.

| LoRA | | Fine-tuning process | |
|---|---|---|---|
| Name | Value | Name | Value |
| LoRA rank | 8 | Optimizer | AdamW |
| LoRA alpha | 16 | Training epochs | 1 |
| LoRA dropout | 0.1 | Learning rate | 0.0002 |
| LoRA attention dimension | 64 | Loss function | Cross-entropy |
| Task type | CAUSAL_LM | Maximum gradient norm | 0.35 |
| - | - | Warmup ratio | 0.03 |
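As a minimal sketch, the parameters in Table 2 can be expressed as the keyword arguments typically passed to Hugging Face PEFT's `LoraConfig` and the Trainer's training arguments. The parameter names below (`r`, `lora_alpha`, `optim`, etc.) are assumptions based on common PEFT/transformers usage, not confirmed by the source.

```python
# Hypothetical mapping of Table 2 onto PEFT-style configuration dicts.
# These would be unpacked into LoraConfig(**lora_kwargs) and
# TrainingArguments(**training_kwargs) in a typical fine-tuning script.

lora_kwargs = {
    "r": 8,                    # LoRA rank
    "lora_alpha": 16,          # LoRA alpha (scaling numerator)
    "lora_dropout": 0.1,       # LoRA dropout
    "task_type": "CAUSAL_LM",  # task type
}

training_kwargs = {
    "optim": "adamw_torch",    # AdamW optimizer
    "num_train_epochs": 1,     # training epochs
    "learning_rate": 2e-4,     # learning rate (0.0002)
    "max_grad_norm": 0.35,     # maximum gradient norm
    "warmup_ratio": 0.03,      # warmup ratio
}

# LoRA updates are scaled by alpha / r; with these values the factor is 2.0.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
```

Note that the cross-entropy loss is the default for causal language modeling and usually does not need to be set explicitly.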