| Hyperparameters          | Details                                              |
|--------------------------|------------------------------------------------------|
| Loss Function            | Binary Cross Entropy Loss                            |
| Optimizer                | Adam                                                 |
| Initial Learning Rate    | 1e-3                                                 |
| Learning Rate Scheduler  | Geometric Decay, step size = 10, multiplier = 0.85   |
| Minibatch Size           | 128                                                  |
| Epochs                   | 100                                                  |
| Parameter Regularization | L2 Weight Decay, λ = 0.002                           |
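
A minimal sketch of how these settings could be wired together in a training loop, assuming a PyTorch setup (the framework is not stated in the table); the model and dataset below are placeholders, and "Geometric Decay" is interpreted as a step scheduler that multiplies the learning rate by 0.85 every 10 epochs.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder binary classifier and synthetic data; swap in the real model/dataset.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
features = torch.randn(1024, 20)
labels = torch.randint(0, 2, (1024, 1)).float()
loader = DataLoader(TensorDataset(features, labels), batch_size=128, shuffle=True)

# Binary cross entropy loss on sigmoid outputs.
criterion = nn.BCELoss()

# Adam with the initial learning rate and L2 weight decay from the table.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=2e-3)

# Geometric decay: multiply the learning rate by 0.85 every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.85)

for epoch in range(100):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

Note that PyTorch's `weight_decay` argument for Adam applies an L2 penalty coupled to the gradient update, which matches the "L2 Weight Decay" entry; a decoupled variant would instead use `AdamW`.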