
How can a learning rate that is too large cause the output of the network (and the error) to go to infinity?

Asked by user1477107 (Artificial Intelligence) on August 24, 2021

This happened to my neural network: with a learning rate below 0.2 everything works fine, but with anything above 0.4 I start getting "nan" errors because the output of my network keeps increasing.

From what I understand, if I choose a learning rate that is too large, I overshoot the local minimum. But even so, I still land somewhere, and from there I am again moving in the correct direction. At worst, my output should be random. I don't understand what scenario causes my output and error to approach infinity every time I run my NN with a learning rate that is too large (and it isn't even that large).

[Plot: training error over iterations; the red curve, labeled with the large learning rate, diverges]
How does the red line ever go to infinity? I can see it happening with an absurdly high learning rate, but if the NN works at 0.2 and fails at 0.4, I don't understand it.
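For reference, here is a minimal sketch that reproduces the same pattern with plain gradient descent on a 1-D quadratic loss. This is not my actual network; the loss, the curvature value 6.0, and the helper run are illustrative assumptions chosen so that the two learning rates above behave differently:

    # Minimal sketch, not my actual network: plain gradient descent on a
    # 1-D quadratic loss L(w) = 0.5 * curvature * w**2 (gradient: curvature * w).
    # The curvature value 6.0 is an illustrative assumption.
    def run(lr, curvature=6.0, steps=10, w0=1.0):
        w = w0
        history = [w]
        for _ in range(steps):
            grad = curvature * w   # dL/dw at the current w
            w = w - lr * grad      # gradient descent update
            history.append(w)
        return history

    print("lr = 0.2:", run(0.2))  # each update scales w by (1 - 0.2*6) = -0.2: shrinks
    print("lr = 0.4:", run(0.4))  # each update scales w by (1 - 0.4*6) = -1.4: grows

Running it, the lr = 0.4 iterates alternate sign and grow geometrically rather than wandering randomly, which matches the "output keeps increasing" behavior I see.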
