AdaGradRDA works by keeping a (reweighted) sum of the gradients seen so far and applying regularization at prediction time rather than at update time.

Tuning the rate and delta parameters is often unnecessary.

The regularizers, however, are applied per example, which means that their values should be set to a very small number, on the order of 0.01 / numExamples for the training set, and these values should be tuned.
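
To make "applying regularization at prediction time" concrete, here is a minimal sketch of the l1-regularized dual-averaging closed form with AdaGrad scaling. This is the standard derivation of the technique, not necessarily this class's exact internals, and all names are illustrative:

```scala
// Sketch: materializing one weight coordinate at prediction time from the
// running gradient statistics kept by regularized dual averaging (RDA).
// gradSum   = sum of gradients seen so far for this coordinate
// sqGradSum = sum of squared gradients (the AdaGrad accumulator)
// t         = number of updates so far
def rdaWeight(gradSum: Double, sqGradSum: Double, t: Int,
              rate: Double, delta: Double, l1: Double): Double = {
  val avgGrad = gradSum / t                           // the dual average
  val shrunk = math.max(0.0, math.abs(avgGrad) - l1)  // soft-threshold by l1
  if (shrunk == 0.0) 0.0                              // l1 zeroes small coordinates
  else -math.signum(avgGrad) * shrunk * t * rate / (delta + math.sqrt(sqGradSum))
}
```

Because the soft threshold is applied to the averaged gradient, the effective strength of l1 grows with the number of examples, which is why the per-example scaling described above matters.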

Linear Supertypes
AnyRef, Any

### Instance Constructors

1. #### new AdaGradRDA(delta: Double = 0.1, rate: Double = 0.1, l1: Double = 0.0, l2: Double = 0.0, numExamples: Int = 1)

delta

A large value of delta slows the initial decay of the learning rates

rate

The initial learning rate

l1

The strength of l1 regularization

l2

The strength of l2 regularization

numExamples

The number of examples for online training, used to scale regularizers
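
As a construction sketch following the guidance above (numTrain is an illustrative training-set size):

```scala
// Per-example regularization: scale l1 by the training-set size and tell
// the optimizer how many examples it will see.
val numTrain = 50000
val optimizer = new AdaGradRDA(
  delta = 0.1,              // default; rarely needs tuning
  rate = 0.1,               // default; rarely needs tuning
  l1 = 0.01 / numTrain,     // per-example l1, tune around this order of magnitude
  l2 = 0.0,
  numExamples = numTrain)
```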

### Value Members

1. #### final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
2. #### final def !=(arg0: Any): Boolean

Definition Classes
Any
3. #### final def ##(): Int

Definition Classes
AnyRef → Any
4. #### final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
5. #### final def ==(arg0: Any): Boolean

Definition Classes
Any
6. #### final def asInstanceOf[T0]: T0

Definition Classes
Any
7. #### def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
8. #### val delta: Double

A large value of delta slows the initial decay of the learning rates

9. #### final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
10. #### def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
11. #### def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
12. #### def finalizeWeights(weights: WeightsSet): Unit

Once learning is done, the weights should be copied back into normal tensors.

weights

The weights

13. #### final def getClass(): Class[_]

Definition Classes
AnyRef → Any
14. #### def hashCode(): Int

Definition Classes
AnyRef → Any
15. #### def initializeWeights(weights: WeightsSet): Unit

Some optimizers swap out weights with special-purpose tensors, e.g. for efficient scoring while learning.

weights

The weights

17. #### def isConverged: Boolean

Whether the optimizer has converged yet.

18. #### final def isInstanceOf[T0]: Boolean

Definition Classes
Any
19. #### val l1: Double

The strength of l1 regularization

20. #### val l2: Double

The strength of l2 regularization

21. #### final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
22. #### final def notify(): Unit

Definition Classes
AnyRef
23. #### final def notifyAll(): Unit

Definition Classes
AnyRef
24. #### val numExamples: Int

The number of examples for online training, used to scale regularizers

25. #### val rate: Double

The initial learning rate

26. #### def reset(): Unit

Reset the optimizer's internal state (such as the Hessian approximation, etc.).

27. #### def step(weights: WeightsSet, gradient: WeightsMap, value: Double): Unit

Take one step, updating the weights given the gradient and objective value (a full usage sketch follows the member list below).

weights

The weights

gradient

The gradient

value

The value

28. #### final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
29. #### def toString(): String

Definition Classes
AnyRef → Any
30. #### final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
31. #### final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
32. #### final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
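
### Usage Sketch

To tie the optimizer-facing members together, here is a hedged end-to-end sketch. Only the AdaGradRDA members documented on this page are real; `model.parameters` and `computeGradientAndValue` are illustrative stand-ins for however a model exposes its WeightsSet and for whatever routine accumulates a gradient (a WeightsMap) and an objective value.

```scala
val numTrain = 50000
val maxEpochs = 10
val optimizer = new AdaGradRDA(l1 = 0.01 / numTrain, numExamples = numTrain)

val weights: WeightsSet = model.parameters    // assumed accessor
optimizer.initializeWeights(weights)          // may install special-purpose tensors

for (epoch <- 1 to maxEpochs if !optimizer.isConverged) {
  val (gradient, value) = computeGradientAndValue(weights)  // assumed helper
  optimizer.step(weights, gradient, value)    // one online update
}

optimizer.finalizeWeights(weights)            // copy back into normal tensors
// optimizer.reset() would clear accumulated state before a fresh run.
```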