Sunday, August 19, 2018

Teacher forcing

Teacher forcing is a strategy for training recurrent neural networks that feeds the ground-truth output from the prior time step as the input, rather than the model's own prediction from that step.

... the decoder learns to generate targets[t+1...] given targets[...t], conditioned on the input sequence.
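
As a rough illustration of the quote above, the sketch below builds the decoder's inputs and targets for one training sequence under teacher forcing. The function name `teacher_forcing_pairs` and the `<s>` start token are assumptions made for this example; they do not come from any particular library.

```python
def teacher_forcing_pairs(target_tokens, start_token="<s>"):
    """Build (decoder_inputs, decoder_targets) for one training sequence.

    Hypothetical helper for illustration: the decoder input at step t is the
    ground-truth token from step t-1 (the start token at step 0), and the
    decoder target at step t is the ground-truth token at step t.
    """
    # Shift the ground truth right by one position and prepend the start token.
    decoder_inputs = [start_token] + list(target_tokens[:-1])
    # The targets are the unshifted ground-truth tokens.
    decoder_targets = list(target_tokens)
    return decoder_inputs, decoder_targets


inputs, targets = teacher_forcing_pairs(["I", "am", "here"])
for step, (seen, expected) in enumerate(zip(inputs, targets)):
    print(f"step {step}: decoder sees {seen!r}, should predict {expected!r}")
```

At inference time no ground truth is available, so the decoder's own previous prediction is fed back in place of the shifted target.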


Wednesday, August 15, 2018


Understanding exponentially weighted averages

from Understanding Exponentially Weighted Averages (C2W2L04)


So that's RMSprop, and similar to momentum, has the effects of damping out the oscillations in gradient descent, in mini-batch gradient descent. And allowing you to maybe use a larger learning rate alpha. And certainly speeding up the learning speed of your algorithm.

from Andrew Ng's lecture
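
For reference, here is a minimal sketch of the two updates that the heading and the quote refer to, assuming the standard formulations: an exponentially weighted average v_t = beta * v_(t-1) + (1 - beta) * theta_t, and an RMSprop step that keeps such an average of the squared gradient. The function names and the default values (beta = 0.9, alpha = 0.01, eps = 1e-8) are illustrative choices for this example, and bias correction is left out for brevity.

```python
import math


def exp_weighted_average(values, beta=0.9):
    """Return the running exponentially weighted average of `values`.

    v starts at 0 and no bias correction is applied, so the first few
    averages are biased toward zero.
    """
    v = 0.0
    averages = []
    for theta in values:
        v = beta * v + (1.0 - beta) * theta
        averages.append(v)
    return averages


def rmsprop_step(w, grad, s, alpha=0.01, beta=0.9, eps=1e-8):
    """Apply one RMSprop update to a scalar parameter `w`.

    `s` is the exponentially weighted average of squared gradients so far.
    Returns the updated (w, s).
    """
    # Update the running average of the squared gradient.
    s = beta * s + (1.0 - beta) * grad ** 2
    # Scale the step by the root of that average; eps avoids division by zero.
    w = w - alpha * grad / (math.sqrt(s) + eps)
    return w, s


print(exp_weighted_average([1.0, 1.0, 1.0]))
print(rmsprop_step(w=1.0, grad=2.0, s=0.0))
```

Dividing by the root of the running average shrinks the step along directions where gradients have been large, which is the damping of oscillations the quote describes.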

Friday, August 10, 2018