
Time-series prediction: loss goes down, then stagnates with very high variance

Asked by Johncowk on August 24, 2021

I am trying to design a model based on LSTM cells to do time-series prediction. The output value is an integer in [0, 13]. I have noticed that one-hot encoding it and using a cross-entropy loss gives better results than an MSE loss.
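For reference, here is a minimal sketch of what such a setup could look like; the layer sizes and names are my own assumptions, not the asker's actual model. An LSTM feeds a linear head that produces 14 logits per time step, trained with nn.CrossEntropyLoss. Note that PyTorch's CrossEntropyLoss takes raw logits and integer class indices as targets, so only the input needs to be one-hot encoded, not the target:

```python
import torch
import torch.nn as nn

NUM_CLASSES = 14  # output values are integers in [0, 13]

class SeqClassifier(nn.Module):
    """LSTM that predicts a class in [0, 13] at each time step (hypothetical sketch)."""
    def __init__(self, input_size=NUM_CLASSES, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, NUM_CLASSES)

    def forward(self, x):          # x: (batch, time, input_size)
        out, _ = self.lstm(x)      # out: (batch, time, hidden_size)
        return self.head(out)      # logits: (batch, time, NUM_CLASSES)

model = SeqClassifier()
criterion = nn.CrossEntropyLoss()  # expects raw logits and integer targets

x = torch.randn(8, 20, NUM_CLASSES)              # e.g. one-hot encoded inputs
target = torch.randint(0, NUM_CLASSES, (8, 20))  # class indices, NOT one-hot

logits = model(x)                                # (batch, time, classes)
# CrossEntropyLoss wants the class dimension second: (batch, classes, time)
loss = criterion(logits.permute(0, 2, 1), target)
```

Framing the problem as 14-way classification like this is what makes cross-entropy applicable; MSE on the raw integer would instead impose an ordering on the classes.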

Here is my problem: no matter how deep I make the network or how many fully connected layers I add, I always obtain pretty much the same behavior. Changing the optimizer doesn't really help either.

  1. The loss quickly decreases, then stagnates with very high variance and never goes down again.
  2. The predictions seem to be offset around the value 9; I really do not understand why, since I have one-hot encoded both the input and the output.

Here is an example of the results of a typical training phase, with the total loss:

[Images: results of a typical training phase, and the total loss curve]

Do you have any tips or ideas as to how I could improve this, or where I could have gone wrong? I am a bit of a beginner in ML, so I might have missed something. I can also include the code (in PyTorch) if necessary.

One Answer

I found the issue; I should have done more unit testing. When computing the batch loss before backpropagation, one of the dimensions of the "prediction" tensor did not correspond to the "truth" tensor. The shapes matched, but the contents were not what they were supposed to be. This is due to how the NLL loss is implemented in PyTorch, which I was not aware of...
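To make the pitfall concrete (this is a reconstruction under my own assumptions, not the asker's actual code): for sequence outputs, nn.NLLLoss and nn.CrossEntropyLoss expect the class dimension second, i.e. predictions of shape (batch, classes, time) against targets of shape (batch, time). If the prediction tensor is instead laid out as (batch, time, classes) and the two trailing dimensions happen to be equal, the shapes still line up, the loss computes without error, and the result is silently wrong. A small unit test exposes it:

```python
import torch
import torch.nn as nn

criterion = nn.NLLLoss()

batch, time_steps, num_classes = 4, 14, 14   # time == classes: shapes collide
logits = torch.randn(batch, time_steps, num_classes)
target = torch.randint(0, num_classes, (batch, time_steps))

log_probs = logits.log_softmax(dim=-1)       # (batch, time, classes)

# WRONG: NLLLoss reads dim 1 as the class dimension, which here is time.
# Because time_steps == num_classes this still runs -- silently incorrect.
wrong_loss = criterion(log_probs, target)

# RIGHT: move the class dimension into position 1 -> (batch, classes, time).
right_loss = criterion(log_probs.permute(0, 2, 1), target)

print(wrong_loss.item(), right_loss.item())  # almost surely differ
```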

Answered by Johncowk on August 24, 2021
