
Relation between norm and variance of a random vector

Mathematics Asked on February 4, 2021

I have been normalizing vectors for my work, and there are generally two methods that I follow. I assumed both methods were equivalent until I found out they are not. The two methods are described below.

  1. Take the square of the norm of the vector and divide this value by its length (the number of entries). To normalize, divide the vector by the square root of the value obtained above. This corresponds to the operation given below for any vector $\mathbf{x}$.
    $$
    E_{\mathbf{x}} = \frac{\|\mathbf{x}\|^2}{|\mathbf{x}|}\hspace{2mm} \text{and}\hspace{2mm} \hat{\mathbf{x}} = \frac{\mathbf{x}}{\sqrt{E_{\mathbf{x}}}}
    $$

    $|\cdot|$ refers to the dimension of the argument.
  2. Use the variance function in the math library and divide the vector by the square root of its variance. Equivalent operations for a vector $\mathbf{x}$ are outlined below.
    $$
    \text{var}_{\mathbf{x}} = \mathbb{E}(\mathbf{x} - \mathbb{E}\mathbf{x})^2 \hspace{2mm} \text{and}\hspace{2mm} \hat{\mathbf{x}} = \frac{\mathbf{x}}{\sqrt{\text{var}_{\mathbf{x}}}}
    $$

The reason I believe both are equivalent is that in the norm case we take the origin to be $\mathbf{0}$ and sum the squared distances from the origin to each entry of the vector, while in the variance case we move the origin to the mean of the random variable and then sum the squared distances taking that mean as the origin.
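
For concreteness, here is a minimal sketch of the two normalizations as numpy operations (the function names are my own, not from any library):

import numpy as np

def normalize_method1(x):
    # Method 1: divide by the square root of norm^2 / dimension
    E_x = np.linalg.norm(x) ** 2 / x.size
    return x / np.sqrt(E_x)

def normalize_method2(x):
    # Method 2: divide by the square root of the variance
    return x / np.sqrt(np.var(x))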

Now I implement these two in Python, and the following are the results.

>>> import numpy as np
>>> a = np.random.randn(1000)
>>> np.linalg.norm(a) ** 2 / 1000
1.006560252222734
>>> np.var(a)
1.003290114164144

In these lines of code I generate a length-1000 vector of i.i.d. standard normal samples. Method 1 and method 2 give me nearly equal values in this case. However, when my samples are correlated, this is not the case.

>>> import numpy as np
>>> cov_matrix = 0.3 * np.abs(1 - np.eye(1000)) + np.eye(1000)
>>> a = np.random.multivariate_normal(mean=np.zeros(1000), cov=cov_matrix)
>>> np.linalg.norm(a) ** 2 / 1000
1.036685431728734
>>> np.var(a)
0.6900743017090415

I generate a length-1000 multivariate normal random vector whose covariance matrix has 1s along the diagonal and 0.3 in all off-diagonal entries. This is where I am confused: method 1 and method 2 return different values.

Why is this the case? Why do both methods return the same value in the i.i.d. case but different values when the vector is correlated? Thanks.

One Answer

Your definitions are really vector-based operations as implemented by MATLAB or Python (not the same as a probabilistic variance). So let me define them more clearly (I will use the MATLAB definitions; I assume the Python definitions are the same). You are dealing with $n$-dimensional random vectors $X=(X_1, \ldots, X_n)$. Then:

Definition 1: $$E_X = \frac{\sum_{i=1}^n X_i^2}{n}$$

Definition 2: $$M_X = \frac{1}{n}\sum_{i=1}^n X_i$$

Definition 3: $$V_X = \frac{\sum_{i=1}^n (X_i-M_X)^2}{n-1}$$

Notice that $E_X, M_X, V_X$ are all random variables.
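
As a sketch, these are the same definitions in numpy (one subtlety worth flagging: np.var divides by $n$ by default, so Definition 3 corresponds to np.var(x, ddof=1), while MATLAB's var divides by $n-1$ by default):

import numpy as np

x = np.random.randn(1000)

E_X = np.sum(x ** 2) / x.size                 # Definition 1
M_X = np.mean(x)                              # Definition 2
V_X = np.sum((x - M_X) ** 2) / (x.size - 1)   # Definition 3

# Definition 3 matches np.var with ddof=1, not the default ddof=0:
assert np.isclose(V_X, np.var(x, ddof=1))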

Observation 1:

If $\{X_i\}_{i=1}^{\infty}$ are i.i.d. with mean $E[X_1]$ and second moment $E[X_1^2]$, then by the law of large numbers $$ \lim_{n\rightarrow\infty} E_X = E[X_1^2] \quad \mbox{(with prob 1)}$$ So in the special case when $\{X_i\}_{i=1}^{\infty}$ are i.i.d. Gaussian $N(0,1)$, $E_X\rightarrow 1$ with prob 1. Since $n=1000$ is "large," your numerical value $E_X=1.006560252222734$ makes sense. If you independently repeat the experiment you will get a new number for $E_X$, but, with high probability, it will still be very close to $1$. You would get similar results with $\{X_i\}_{i=1}^{\infty}$ any i.i.d. variables with $E[X_1^2]=1$, not necessarily having Gaussian distribution.
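
A quick simulation of this convergence (a sketch; the sample sizes are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
for n in (10, 100, 1000, 100000):
    x = rng.standard_normal(n)
    print(n, np.sum(x ** 2) / n)   # tends to E[X_1^2] = 1 as n grows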

Observation 2:

In the special case when $\{X_i\}_{i=1}^{\infty}$ are i.i.d. Gaussian $N(0,1)$, a surprising but classic result of statistics says that $(n-1)V_X$ has a "chi-square" distribution with $n-1$ degrees of freedom. In particular its mean is $n-1$ and its variance is $2(n-1)$. So then $V_X = \frac{(n-1)V_X}{n-1}$ has mean $1$ and variance $\frac{2(n-1)}{(n-1)^2} = \frac{2}{n-1} \approx 0$ for large $n$. So for $n=1000$ your numerical value $V_X = 1.003290114164144 \approx 1$ makes sense. If you independently repeat the experiment you will get a different numerical value for $V_X$, but, with high probability, it will still be very close to $1$.
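
This can be checked by simulation (a sketch; the number of trials is arbitrary):

import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 5000
X = rng.standard_normal((trials, n))
V = X.var(axis=1, ddof=1)        # V_X for each independent trial
S = (n - 1) * V                  # should be chi-square with n-1 degrees of freedom
print(S.mean(), n - 1)           # sample mean approx n-1 = 999
print(S.var(), 2 * (n - 1))      # sample variance approx 2(n-1) = 1998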

When you remove the i.i.d. assumption, so that the $\{X_i\}$ values are correlated, $V_X$ no longer has the same distribution. In fact, for your covariance matrix a direct computation gives $E[V_X] = 1 - 0.3 = 0.7$: subtracting the sample mean $M_X$ removes the component shared by the correlated entries, so the variance estimate only sees the remaining $1 - 0.3$ of each entry's variance. Meanwhile $E[E_X] = E[X_1^2] = 1$ is unchanged by correlation. That is why you measured $E_X \approx 1.04$ but $V_X \approx 0.69$.
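
One way to verify this numerically, without building the full $1000\times 1000$ covariance matrix, is the standard construction of equicorrelated Gaussians from a shared component, $X_i = \sqrt{1-\rho}\,Z_i + \sqrt{\rho}\,W$ with $Z_i, W$ i.i.d. $N(0,1)$. (This construction is mine, not from the question, but it yields exactly the covariance matrix used above: $\mathrm{Var}(X_i)=1$ and $\mathrm{Cov}(X_i,X_j)=\rho$.)

import numpy as np

rng = np.random.default_rng(0)
n, rho, trials = 1000, 0.3, 2000
Z = rng.standard_normal((trials, n))
W = rng.standard_normal((trials, 1))
X = np.sqrt(1 - rho) * Z + np.sqrt(rho) * W   # Var(X_i) = 1, Cov(X_i, X_j) = rho

print(np.mean(X.var(axis=1, ddof=1)))         # approx 1 - rho = 0.7
print(np.mean(np.sum(X ** 2, axis=1) / n))    # approx 1, matching E_X on average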

Answered by Michael on February 4, 2021
