Parameter | Value |
Alpha | 0.7 |
Epochs | 60 |
Batch_Size | 128 |
Dropout | 0.5 |
Learning_Rate | 1e-3 |
Hidden_Size | 768 |
Max_Length | 128 |
Optimizer | AdamW |
LossFunction | Cross Entropy |
N_Gram | 4 |
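
For reproducibility, the settings in the table can be gathered into a single configuration object. The following is a minimal Python sketch under stated assumptions: the class name `TrainingConfig`, the constant `CONFIG`, and the lowercase field names are illustrative and not taken from the authors' code. Only the values come from the table. The exact roles of `alpha` and `n_gram` are not defined in this table, so they are stored as plain values without interpretation.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed in the table."""

    alpha: float = 0.7            # Alpha (role not specified in the table)
    epochs: int = 60              # Epochs
    batch_size: int = 128         # Batch_Size
    dropout: float = 0.5          # Dropout
    learning_rate: float = 1e-3   # Learning_Rate
    hidden_size: int = 768        # Hidden_Size
    max_length: int = 128         # Max_Length
    optimizer: str = "AdamW"      # Optimizer (stored as a name only)
    loss_function: str = "Cross Entropy"  # LossFunction (stored as a name only)
    n_gram: int = 4               # N_Gram (role not specified in the table)


# Default instance carrying the values from the table.
CONFIG = TrainingConfig()

if __name__ == "__main__":
    # Print each setting on its own line, e.g. for a training log.
    for name, value in asdict(CONFIG).items():
        print(f"{name} = {value}")
```

Freezing the dataclass prevents the values from being changed accidentally during a run, and `asdict(CONFIG)` yields a plain dictionary that can be written to a log or a results file.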