So that's RMSprop. Like momentum, it has the effect of damping out the oscillations in mini-batch gradient descent, allowing you to use a larger learning rate alpha and thereby speeding up the learning of your algorithm.
from Andrew Ng's lecture
https://www.youtube.com/watch?v=_e-LFe_igno
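The update the lecture describes can be sketched as follows. This is a minimal illustration, not the lecture's code; the function name, hyperparameter values, and the toy quadratic objective are my own choices. RMSprop keeps an exponentially weighted average `s` of the squared gradients and divides each gradient component by `sqrt(s)`, so directions that oscillate with large gradients get scaled down:

```python
import numpy as np

def rmsprop_update(w, dw, s, alpha=0.01, beta=0.999, eps=1e-8):
    """One RMSprop step (illustrative sketch; hyperparameters are assumptions)."""
    # Exponentially weighted average of the element-wise squared gradients.
    s = beta * s + (1 - beta) * dw ** 2
    # Divide the gradient by sqrt(s): components with large oscillating
    # gradients are damped, which is what permits a larger alpha.
    w = w - alpha * dw / (np.sqrt(s) + eps)
    return w, s

# Usage on a toy quadratic f(w) = ||w||^2, whose gradient is 2w.
w = np.array([5.0, 5.0])
s = np.zeros_like(w)
for _ in range(2000):
    dw = 2 * w
    w, s = rmsprop_update(w, dw, s)
```

After the loop, `w` has moved close to the minimum at the origin, with each coordinate taking near-uniform steps of roughly size `alpha` regardless of the raw gradient magnitude.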