MSE relevance as a metric when errors < 1

Question

I'm trying to build my first models for regression after taking MOOCs on deep learning. I'm currently working on a dataset whose labels are between 0 and 2. Again, this is a regression task, not classification.
The low y values imply that the loss for each sample is quite low, always < 1. My question is about the relevance of MSE as a metric in such a case: since the per-sample error is < 1, squaring it gives an even smaller value, so the metric drops very rapidly. In this case, would it be more relevant to use MAE? Or should I multiply the y values so that the order of magnitude of a per-sample loss is > 1?
I found this nice article about regression metrics, but didn't find the answer in it. Thanks for your help.

Suren · Accepted Answer

I'd use relative RMSE
$\sqrt{\frac{1}{n} \sum \frac{(\text{Predicted} - \text{True})^2}{\text{True}^2}}$.
In this case, close to 0 implies a good model, regardless of the scale of the true values.
Similarly, you can try relative MAE.
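
For what it's worth, here is a minimal NumPy sketch of both relative metrics. The function names and the `eps` guard are my own assumptions, not part of the formula above: since the labels in the question can be as low as 0, dividing by the true value would blow up without some floor, and the choice of floor is up to you.

```python
import numpy as np

def relative_rmse(y_true, y_pred, eps=1e-12):
    """Square root of the mean squared relative error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # eps is an assumed guard against division by zero when a label is 0
    rel = (y_pred - y_true) / np.maximum(np.abs(y_true), eps)
    return float(np.sqrt(np.mean(rel ** 2)))

def relative_mae(y_true, y_pred, eps=1e-12):
    """Mean absolute relative error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), eps)
    return float(np.mean(rel))

# Made-up example values: every prediction is off by 10% of the true value
y_true = np.array([0.5, 1.0, 2.0])
y_pred = np.array([0.55, 0.9, 2.2])

print(relative_rmse(y_true, y_pred))
print(relative_mae(y_true, y_pred))
# Rescaling both arrays leaves the relative metrics unchanged
print(relative_rmse(10 * y_true, 10 * y_pred))
```

Labels that are exactly 0 or very close to it will dominate these metrics, so it is worth checking how many of them you have before relying on the number.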

Burger · Answer

If your only concern is small error values, why not simply scale the output by some constant?

The idea would be to multiply all the actual values by some factor, e.g. 10*y_actual.
Next, train your model on the scaled values.
To make a prediction in the original range, you would have to scale the outputs back: y_scale_original = y_prediction / 10
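
A rough sketch of those three steps, assuming synthetic data and a plain least-squares linear model as a stand-in for whatever model you actually train (the data, the weights and the `SCALE` constant are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data with targets roughly in [0, 2]
X = rng.uniform(0.0, 1.0, size=(200, 3))
noise = rng.normal(0.0, 0.05, size=200)
y_actual = np.clip(X @ np.array([0.8, 0.6, 0.4]) + noise, 0.0, 2.0)

SCALE = 10.0  # assumed factor, as in the example above

# Step 1: scale the targets
y_scaled = SCALE * y_actual

# Step 2: train on the scaled targets (least squares with an intercept column)
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y_scaled, rcond=None)

# Step 3: predict, then scale back to the original range
y_prediction = A @ coef
y_scale_original = y_prediction / SCALE

mse_scaled = np.mean((y_prediction - y_scaled) ** 2)
mse_original = np.mean((y_scale_original - y_actual) ** 2)
print(mse_scaled, mse_original, mse_scaled / mse_original)
```

The two MSE values differ only by the constant factor `SCALE ** 2`, so scaling changes the size of the number you look at but not which of two models has the lower error.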

ombk · Answer

MSE and Standard deviation
Mean squared error shows how much error we have over all our points. The goal is indeed to reduce it; however, in your case, the error would already be small.
One way to judge the relevance of your MSE (or RMSE) is to compare the RMSE to the standard deviation of the target values.
Imagine the standard deviation is lower than your learned model's RMSE. Then taking the mean as the prediction for every point in X_test would be a better answer than predicting with your estimator.
In other words, imagine a naive regressor that assigns the mean value to all your points. If this estimator yields a lower RMSE than your model, which should have learned something, then your model is very bad, since the naive estimator beats it.
Start from this logic...
I would love for you to think about what I said; however, if you lose hope of figuring it out, check this.
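
To make the comparison concrete, here is a small sketch with made-up numbers. The arrays and the `rmse` helper are hypothetical; in practice `model_pred` would come from your trained estimator and the baseline mean from your training labels.

```python
import numpy as np

def rmse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Hypothetical test labels and model predictions, labels between 0 and 2
y_test = np.array([0.2, 0.9, 1.4, 1.9])
model_pred = np.array([0.3, 1.0, 1.3, 1.7])

# Naive regressor: predict one constant (an assumed training mean) everywhere
train_mean = 1.0
baseline_pred = np.full_like(y_test, train_mean)

rmse_model = rmse(y_test, model_pred)
rmse_baseline = rmse(y_test, baseline_pred)
print(rmse_model, rmse_baseline)

# Predicting the test mean itself gives an RMSE equal to the
# (population) standard deviation of the test labels
rmse_test_mean = rmse(y_test, np.full_like(y_test, y_test.mean()))
print(rmse_test_mean, np.std(y_test))
```

If `rmse_model` is not clearly below `rmse_baseline`, the model has learned little, no matter how small the raw number looks.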
Why not use MAE
MAE has its own benefits, so choosing it arbitrarily is pointless. MAE is mostly used when the data has outliers or noise and we do not want to give much importance to those spikes in magnitude.
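
A quick illustration of that sensitivity, using made-up predictions where one point is turned into an outlier (all values and helper names here are assumptions for the sake of the example):

```python
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_pred - y_true)))

def mse(y_true, y_pred):
    return float(np.mean((y_pred - y_true) ** 2))

y_true = np.array([1.0, 1.1, 0.9, 1.0, 1.0])

# Every prediction is off by about 0.1
y_pred_clean = np.array([1.1, 1.0, 1.0, 0.9, 1.1])

# Same predictions, but the last one is off by 1.0 (a spike)
y_pred_spike = y_pred_clean.copy()
y_pred_spike[-1] = 2.0

mae_ratio = mae(y_true, y_pred_spike) / mae(y_true, y_pred_clean)
mse_ratio = mse(y_true, y_pred_spike) / mse(y_true, y_pred_clean)

# How much each metric grows because of the single spike
print(mae_ratio, mse_ratio)
```

The single spike inflates MSE far more than MAE. Note also that with errors below 1 the MSE is numerically smaller than the MAE, but that is a matter of units (squared versus original), not of one metric being more lenient.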

MSE vs. MAE (L2 loss vs. L1 loss): In short, using the squared error is easier to solve, but using the absolute error is more robust to outliers. But let’s understand why!

Read here
