Once learning is done, the weights should be copied back into normal tensors.
The weights
Some optimizers swap out weights with special purpose tensors for e.g. efficient scoring while learning.
The weights
Whether the optimizer has converged yet.
Reset the optimizer's internal state (such as the Hessian approximation, etc.)
Updates the weights according to the gradient.
The weights
The gradient
The value
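
Read together, the entries above describe a small gradient-optimizer interface. Below is a minimal sketch of what such an interface might look like in Java; all names here (GradientOptimizer, Tensor, and the method names) are assumptions for illustration, since the actual signatures are not shown.

    // Placeholder for whatever tensor type the library actually uses.
    interface Tensor {}

    // Hypothetical reconstruction of the optimizer interface described
    // by the docstrings above; names are illustrative, not the real API.
    public interface GradientOptimizer {

      // Some optimizers swap out weights with special purpose tensors
      // for e.g. efficient scoring while learning.
      void startLearning(Tensor weights);

      // Once learning is done, the weights should be copied back into
      // normal tensors.
      void finishLearning(Tensor weights);

      // Whether the optimizer has converged yet.
      boolean converged();

      // Reset the optimizer's internal state (such as the Hessian
      // approximation, etc.)
      void reset();

      // Updates the weights according to the gradient and the value.
      void update(Tensor weights, Tensor gradient, double value);
    }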
This implements an efficient version of the Pegasos SGD algorithm for l2-regularized hinge loss. It won't necessarily work with other losses because of the aggressive projection steps. Note that adding a learning rate here is nontrivial, since the update relies on baseRate / step < 1.0 to avoid zeroing the weights; but if I don't add a rate < 1 here, this optimizer does terribly in my tests. -luke
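
For reference, the core of a Pegasos-style update for l2-regularized hinge loss looks roughly like the sketch below. The class and field names (PegasosUpdateSketch, baseRate, lambda, step) and the plain double[] weights are illustrative assumptions, not the actual implementation this comment describes; the sketch just shows why the shrink factor depends on baseRate / step staying below 1.0, and where the projection step comes in.

    // Illustrative Pegasos-style update for l2-regularized hinge loss.
    public final class PegasosUpdateSketch {

      private final double lambda;   // l2-regularization strength
      private final double baseRate; // baseRate / step must stay < 1.0
      private int step = 1;

      public PegasosUpdateSketch(double lambda, double baseRate) {
        this.lambda = lambda;
        this.baseRate = baseRate;
      }

      // One update on a margin-violating example (x, y), with y in {-1, +1}.
      public void update(double[] w, double[] x, double y) {
        double rate = baseRate / step; // if this reached 1.0, the shrink
        double shrink = 1.0 - rate;    // below would zero out the weights
        double norm2 = 0.0;
        for (int i = 0; i < w.length; i++) {
          // Shrink toward zero (the l2-regularization part of the update),
          // then take a subgradient step on the hinge loss.
          w[i] = shrink * w[i] + (rate / lambda) * y * x[i];
          norm2 += w[i] * w[i];
        }
        // The aggressive projection step: clip the weights back onto the
        // ball of radius 1 / sqrt(lambda). This is what ties the update
        // to the hinge loss in particular.
        double radius = 1.0 / Math.sqrt(lambda);
        if (norm2 > radius * radius) {
          double scale = radius / Math.sqrt(norm2);
          for (int i = 0; i < w.length; i++) {
            w[i] *= scale;
          }
        }
        step++;
      }
    }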