# QQ plot comparison of z-normalized datasets

I want to make sure I’m correct in my assumptions.
I’m predicting financial returns by using different ML models. There are 4500 values in each dataset.
The density plots of all models are shown like this:

I clipped the x-axis at [-0.025, 0.025].
Now I wanted to see if the predicted values of the models have the same distribution as my observed values from the “sp500”.
I used z-normalization on the datasets to make them comparable.

For my Non-NN models, these QQ plots were created:

Imho, it seems that all Non-NN models have the same distribution as the observed dataset.

For my NN models, these QQ plots were created:

Here it seems that both NN models, especially the LSTM model, have distributions that differ from the observed dataset.

Is it correct to assume that the NN models have less predictive value than the Non-NN models as their distributions differ from the observed dataset?

Cross Validated Asked by PrinzvonK on December 30, 2020

I don't think a QQ plot is a good way to answer your last question. QQ plots compare quantiles -- which isn't what you want -- and you've normalized your data sets -- which isn't what you want either.
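To make that concrete, here is a minimal sketch using synthetic data (not the asker's returns, and `useless_pred` is a made-up "model"). It builds predictions that are just the observed values in random order, rescaled. After z-normalization the QQ plot lies exactly on the diagonal, yet the predictions carry essentially no information about the actual values.

```python
import random
import statistics

def z_normalize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)

# Assumption: synthetic stand-in for 4500 observed returns.
observed = [rng.gauss(0.0, 0.01) for _ in range(4500)]

# A hypothetical model with no predictive value: the observed values
# in random order, shrunk by a factor of 10.
useless_pred = [0.1 * v for v in rng.sample(observed, len(observed))]

# QQ plot coordinates: sorted z-scores of one series against the other.
qq_x = sorted(z_normalize(observed))
qq_y = sorted(z_normalize(useless_pred))
max_qq_gap = max(abs(a - b) for a, b in zip(qq_x, qq_y))

# Pairwise agreement between prediction and actual value.
corr = pearson(observed, useless_pred)

print(f"largest deviation from the QQ diagonal: {max_qq_gap:.2e}")
print(f"correlation with observed values: {corr:.3f}")
```

The QQ plot only compares the two sorted samples, so it cannot see whether a given prediction is close to the value it was meant to predict, and z-normalization additionally hides any error in location or scale.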

I applaud looking at accuracy graphically. But I would compare predicted to actual values (unnormalized) by using:

a) A scatter plot of one vs. the other for each model

b) A Tukey mean-difference plot (aka Bland-Altman plot - it's such a great idea that several brilliant people invented it)

c) A density plot and maybe a box plot of the differences.
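As a rough sketch of what goes into those plots, the snippet below computes the quantities behind a Tukey mean-difference (Bland-Altman) plot for one model. The data are synthetic and the helper name `bland_altman` is made up for illustration. The 1.96 standard deviation limits of agreement are the usual Bland-Altman convention and assume roughly normal differences, which may not hold for financial returns.

```python
import random
import statistics

def bland_altman(actual, predicted):
    """Return per-point means and differences, the bias, and limits of agreement."""
    means = [(a + p) / 2 for a, p in zip(actual, predicted)]
    diffs = [p - a for a, p in zip(actual, predicted)]
    bias = statistics.fmean(diffs)   # average over- or under-prediction
    sd = statistics.stdev(diffs)     # spread of the prediction errors
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits

rng = random.Random(1)

# Assumption: synthetic stand-ins for unnormalized actual and predicted returns.
actual = [rng.gauss(0.0, 0.01) for _ in range(4500)]
predicted = [0.5 * a + rng.gauss(0.0005, 0.005) for a in actual]

means, diffs, bias, (lower, upper) = bland_altman(actual, predicted)
inside = sum(lower <= d <= upper for d in diffs) / len(diffs)

print(f"bias: {bias:.5f}")
print(f"limits of agreement: [{lower:.5f}, {upper:.5f}]")
print(f"share of differences inside the limits: {inside:.3f}")
```

With any plotting library, `(actual, predicted)` gives the scatter plot in (a), `(means, diffs)` with horizontal lines at `bias`, `lower` and `upper` gives the plot in (b), and `diffs` alone feeds the density or box plot in (c).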

Answered by Peter Flom on December 30, 2020
