Why does reinforcement learning using a non-linear function approximator diverge when using strongly correlated data as input?

Question

While reading the DQN paper, I found that sampling training examples at random reduces divergence in RL with a non-linear function approximator (e.g., a neural network).
So why does reinforcement learning with a non-linear function approximator diverge when it is trained on strongly correlated data?

David Ireland · Answer

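The issue is less about using reinforcement learning to train the neural network and more about the assumptions that standard neural network training makes about its data. Stochastic gradient descent assumes that the samples in each update are (approximately) independent and identically distributed. Consecutive transitions from an RL trajectory are strongly correlated, so training on them in order tends to give high-variance updates that keep pulling the network toward whatever region of the state space the agent visited most recently, and with a non-linear approximator this can make the value estimates oscillate or diverge. Standard feed-forward networks do not handle this kind of strongly correlated data well, which is one of the motivations for introducing Recurrent Neural Networks, as they are designed to model sequential dependence.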
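To make the random-sampling idea from the question concrete, here is a minimal sketch of a replay buffer that stores transitions and returns uniformly sampled mini-batches. It is only an illustration, not the DQN paper's implementation: the `ReplayBuffer` class, its method names, the capacity, and the toy trajectory (state `t` always leads to state `t + 1`) are all assumptions made up for this example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # deque with maxlen drops the oldest transition once capacity is reached
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement ignores the order of arrival,
        # so a batch mixes transitions from many different time steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Toy trajectory (an assumption for illustration): state t leads to state t + 1,
# so transitions that arrive one after another are almost identical.
buffer = ReplayBuffer(capacity=1000, seed=0)
for t in range(200):
    buffer.push(t, 0, 1.0, t + 1, False)

consecutive = list(buffer.buffer)[-8:]  # what purely online training would see
shuffled = buffer.sample(8)             # what replay-based training would see

print([s[0] for s in consecutive])  # [192, 193, 194, 195, 196, 197, 198, 199]
print([s[0] for s in shuffled])     # states drawn from across the whole trajectory
```

The consecutive batch covers only a tiny slice of the state space, while the sampled batch is spread over the whole stored history, which is closer to the i.i.d. setting that gradient-based training expects.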

Answered by David Ireland on December 13, 2021
