In recurrent neural networks (RNNs), for example in the paper Sequence to Sequence Learning with Neural Networks, the Introduction (paragraph 3, line 7) says that the RNN language model is conditioned on the input sequence.
So, what is the concept of conditioning in RNNs?
"Conditioning" in the context of sequence to sequence learning in RNNs is the process of computing the probability of obtaining the output sequence conditioned on the input sequence, or p(y|x)
. The network is used to model this conditional probability mapping.
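To make this concrete, here is a minimal sketch of an encoder-decoder RNN modeling p(y|x), written in PyTorch. The class name, sizes, and tensors are illustrative, not the architecture from the paper: the encoder compresses the input sequence x into a hidden state, and the decoder produces a distribution over each output token conditioned on that state and on the previous outputs it is fed.

```python
import torch
import torch.nn as nn

# Illustrative encoder-decoder: the decoder's distribution over each
# output token is conditioned on the input sequence x through the
# encoder's final hidden state (the "context" vector).
class Seq2Seq(nn.Module):
    def __init__(self, vocab=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x, y_prev):
        _, h = self.encoder(self.embed(x))         # h summarizes all of x
        dec, _ = self.decoder(self.embed(y_prev), h)
        return self.out(dec)                       # logits for p(y(t) | y(<t), x)

model = Seq2Seq()
x = torch.randint(0, 32, (2, 10))       # batch of input sequences
y_prev = torch.randint(0, 32, (2, 8))   # previous output tokens y(t-1)
logits = model(x, y_prev)               # shape (2, 8, 32): per-step conditionals
```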
A technique that expedites training in sequence-to-sequence learning is teacher forcing, in which the hidden states of neurons in adjacent timesteps are decoupled (see image). Instead of the model's own prediction, the ground-truth label y(t-1), together with the current input element x(t), is fed to the network at timestep t. When the only recurrence is this feedback from the output into the next step, teacher forcing eliminates the need for backpropagation through time, so training can be parallelized across timesteps using fewer computational resources. Unfortunately, some empirical results indicate that RNNs trained with teacher forcing generalize worse than free-running "vanilla" RNNs, because at test time the network must condition on its own, possibly erroneous, predictions rather than on ground-truth labels.
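Here is a sketch of that decoupled, teacher-forced training step. All names are illustrative, and the model is deliberately reduced to the output-to-input recurrence described above: because the ground-truth y(t-1) is supplied, every timestep becomes an independent classification problem, so the whole sequence can be trained in one parallel pass with no backpropagation through time. At test time the model's own previous prediction would replace y_prev.

```python
import torch
import torch.nn as nn

vocab, hidden = 32, 64
embed_x = nn.Embedding(vocab, hidden)   # embeds input tokens x(t)
embed_y = nn.Embedding(vocab, hidden)   # embeds previous output tokens y(t-1)
out = nn.Linear(2 * hidden, vocab)      # predicts y(t) from [x(t); y(t-1)]

x = torch.randint(0, vocab, (2, 8))     # batch of input sequences x(1..T)
y = torch.randint(0, vocab, (2, 8))     # ground-truth outputs y(1..T)

# Teacher forcing: shift the ground truth right by one step so that
# position t sees y(t-1); position 1 sees a start-of-sequence token (id 0).
y_prev = torch.roll(y, 1, dims=1)
y_prev[:, 0] = 0

# One parallel forward pass over all timesteps: p(y(t) | x(t), y(t-1)).
# No hidden state is carried between steps, so no BPTT is required.
logits = out(torch.cat([embed_x(x), embed_y(y_prev)], dim=-1))
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), y.reshape(-1))
loss.backward()
```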
Edit: The image also shows the conditional probability distribution that a teacher-forced sequence-to-sequence RNN approximates.