
"Lengths" of Random Variables in Infinite Dimensional Spaces

Mathematics Asked by bcf on December 29, 2021

Consider a measure space $(S, \Sigma, \mu)$ and the normed vector space $\mathcal{L}^2(\mu)$. Then for any measurable function $f: S \to \mathbb{R}$ with $f \in \mathcal{L}^2(\mu)$ the norm is defined as
$$
\|f\|_{\mathcal{L}^2(\mu)} := \left(\int_S f^2(x) \, \mu(dx)\right)^{1/2},
$$
and, as I understand it, this is exactly analogous to the “length” of $f$, just as the Euclidean norm is the “length” of a vector in $\mathbb{R}^n$.

Now consider a probability space $(\Omega, \mathcal{F}, P)$ and the normed vector space $\mathcal{L}^2(P)$. Let $X: \Omega \to \mathbb{R}$ be a random variable. Then for any such $X \in \mathcal{L}^2(P)$, we define the norm of $X$ as
$$
\|X\|_{\mathcal{L}^2(P)} := \left( \int_{\Omega} X^2(\omega) \, P(d\omega)\right)^{1/2} = \left(E\left(X^2\right)\right)^{1/2},
$$
but this is not the “length” of $X$ unless $E(X) = 0$. Instead, we usually think of the standard deviation as the length of $X$, obtained by subtracting $\mu := E(X)$ from $X$ first:
$$
\mathrm{std}(X) := \left(E\left(\left(X - \mu\right)^2\right)\right)^{1/2}.
$$
This brings up two questions:

  1. Why must we subtract $\mu$ to interpret this as the “length”?
  2. I’m thinking of $\mathcal{L}^2(P)$ as a vector space over $\mathbb{R}$. Since $\mu \in \mathbb{R}$, what does $X - \mu$ mean? In Euclidean space, it doesn’t make sense to write $\vec{x} - c$ for $\vec{x} \in \mathbb{R}^n$ and $c \in \mathbb{R}$.

Update I want to be able to apply the intuition of Euclidean geometry to visualize things like the correlation coefficient $\rho$ as the cosine of the angle between two random variables $X$ and $Y$, as explained here. In $\mathbb{R}^n$ the cosine of the angle between two vectors $\vec{x}$ and $\vec{y}$ is related to their inner product and lengths by
$$
\cos \theta = \frac{\left\langle \vec{x}, \vec{y} \right\rangle}{\|\vec{x}\| \cdot \|\vec{y}\|}.
$$
If $\|X\|_{\mathcal{L}^2(P)}$ were indeed the “length” of $X$, then I feel like I should just be able to change the norms and inner products, but I can’t because
$$
\cos \theta = \frac{\left\langle X, Y \right\rangle_{\mathcal{L}^2(P)}}{\|X\|_{\mathcal{L}^2(P)} \cdot \|Y\|_{\mathcal{L}^2(P)}} \neq \frac{\left\langle X - \mu_X, Y - \mu_Y \right\rangle_{\mathcal{L}^2(P)}}{\|X - \mu_X\|_{\mathcal{L}^2(P)} \cdot \|Y - \mu_Y\|_{\mathcal{L}^2(P)}} = \frac{\mathrm{Cov}(X,Y)}{\mathrm{std}(X) \cdot \mathrm{std}(Y)} = \rho.
$$
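The gap between these two quantities is easy to see numerically. Below is a small sketch in plain Python (the variable names and the chosen distributions are illustrative, not from the question): the empirical correlation coefficient equals the cosine of the angle between the *centered* sample vectors, while the uncentered cosine is inflated by the shared nonzero means.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # cosine of the angle between vectors u and v
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

random.seed(0)
n = 10_000
x = [random.gauss(5.0, 1.0) for _ in range(n)]   # nonzero mean on purpose
y = [xi + random.gauss(0.0, 1.0) for xi in x]    # correlated with x; true rho = 1/sqrt(2)

mx, my = sum(x) / n, sum(y) / n
xc = [xi - mx for xi in x]                       # centered copies
yc = [yi - my for yi in y]

rho = cosine(xc, yc)   # cosine of the angle between the centered vectors = Pearson rho
raw = cosine(x, y)     # cosine without centering; pushed toward 1 by the means

print(f"rho = {rho:.3f}, raw cosine = {raw:.3f}")
```

With these parameters the centered cosine lands near $1/\sqrt{2} \approx 0.707$, while the raw cosine is close to $1$, illustrating why the two sides of the inequality above differ.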


Update with picture Here’s my intuition for conditional expectations in $\mathcal{L}^2$, as requested.


3 Answers

I think @ian answered the question in the comments on the OP's post, but here I will rephrase it:

Since the covariance map $\operatorname{cov}(X,Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])]$ is actually an inner product, as shown here, it induces a norm, which is precisely the standard deviation map $\operatorname{sd}(X) = (\mathbb{E}[(X - \mathbb{E}[X])^2])^{\frac{1}{2}}$, which in turn induces a metric, etc. As someone already explained in another answer, for your geometric intuition: random variables for which $\mathbb{E}[X] \ne 0$ can be thought of as "translated" versions of their $0$-mean counterparts, which is why the $\mathbb{E}[X]$ "creeps in" to the formula above.
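The inner-product properties carry over to the empirical (sample) covariance, which makes them easy to check numerically. A small sketch in plain Python (sample sizes and distributions are arbitrary choices): symmetry, linearity in the first slot, and the standard deviation as the induced norm.

```python
import math
import random

def cov(x, y):
    # empirical covariance of two equal-length samples
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def sd(x):
    # the norm induced by the covariance inner product
    return math.sqrt(cov(x, x))

random.seed(1)
n = 5_000
x = [random.gauss(2.0, 3.0) for _ in range(n)]
y = [random.gauss(-1.0, 2.0) for _ in range(n)]

a, b = 2.0, -0.5
z = [a * xi + b * yi for xi, yi in zip(x, y)]

sym_gap = abs(cov(x, y) - cov(y, x))                        # symmetry
lin_gap = abs(cov(z, y) - (a * cov(x, y) + b * cov(y, y)))  # linearity in the first slot
print(sym_gap, lin_gap, sd(x))
```

Both gaps are zero up to floating-point error, and `sd(x)` recovers the population standard deviation ($3$ here) up to sampling noise.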

Answered by baibo on December 29, 2021

Let $V$ be a vector space. Examples are $V = \mathbb{R}^n$, the $n$-dimensional space of reals, or $V = L^2(\mu) = \{f : \int f^2 \, d\mu < \infty\}$.

A norm $\|\cdot\|$ is a non-negative real-valued function on $V$ which satisfies the three well-known properties.

For $f \in L^2(\mu)$, a norm is $\|f\|_2 = \left(\int f^2 \, d\mu\right)^{1/2}$.

In Euclidean space, for $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ a norm is defined as $\|x\| = \left(\sum_{i=1}^{n} x_i^2\right)^{1/2}$.

In a normed space there is an induced metric (a “distance” between its elements) defined by $\|v - w\|$ for $v, w \in V$.

For $f, g \in L^2(\mu)$, $\|f - g\|_2 = \left(\int (f - g)^2 \, d\mu\right)^{1/2}$, and for $x, y \in \mathbb{R}^n$ the induced metric is $\|x - y\| = \left(\sum_{i=1}^{n} (x_i - y_i)^2\right)^{1/2}$, which is the well-known Euclidean distance between two vectors.
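To make the induced metric on $L^2$ concrete, here is a small numerical sketch in plain Python (the midpoint-rule grid size is an arbitrary choice) approximating $\|f - g\|_2$ for $f(x) = x$ and $g(x) = x^2$ on $[0, 1]$ with Lebesgue measure; the exact value is $\left(\int_0^1 (x - x^2)^2 \, dx\right)^{1/2} = 1/\sqrt{30}$.

```python
import math

def l2_distance(f, g, a, b, n=100_000):
    # midpoint-rule approximation of ||f - g||_2 = (∫_a^b (f(x) - g(x))^2 dx)^{1/2}
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += (f(x) - g(x)) ** 2 * dx
    return math.sqrt(total)

# distance between f(x) = x and g(x) = x^2 in L^2([0, 1])
dist = l2_distance(lambda x: x, lambda x: x * x, 0.0, 1.0)
print(dist)  # close to 1/sqrt(30) ≈ 0.1826
```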

So, the length of a vector is actually the Euclidean norm on $\mathbb{R}^n$.

The variance is a special case. If $X$ is a real-valued random variable (i.e., a measurable function), $\mu \in \mathbb{R}$ is the constant $E(X)$, and $P$ is a probability measure on the underlying space, then $$\mathrm{Var}(X) = \|X - \mu\|_2^2 = \int (X - \mu)^2 \, dP = E(X - \mu)^2.$$

The standard deviation is then the actual norm: $\mathrm{std}(X) = \|X - \mu\|_2$.

The covariance between two random variables $X, Y$ is the inner product $\langle X - E(X), Y - E(Y) \rangle = \int (X - E(X))(Y - E(Y)) \, dP$.

Notice that this inner product produces the $L^2(P)$ norm as a special case when $X = Y$ and $E(X) = 0$: $$\langle X, X \rangle = \int X^2 \, dP = \|X\|_2^2.$$ This is the familiar formula $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.

In general, norms in $L^p(P)$ spaces cannot be produced from an inner product; $p = 2$ is the exception.

Answered by theoGR on December 29, 2021

Length is not really the best way to think about norms in function spaces. It is much better to think of the norm as offering a vague notion of size. Thinking of the norm in function spaces as length just connotes too much geometric intuition where there is none. I think you are falling into a common trap when starting out in functional analysis, which is to try to interpret everything geometrically. I'll just say that this is a very problematic way to approach the subject and will cause you more headaches than it will insight.

The reason that the norm is taken against the zero vector by default is that the induced metric on a vector space is translation invariant. For example, in $L^{2}(\mathbb{R})$, we take the norm against the zero function by default, $$ \|f\|_{L^{2}} = \|f - 0\| = \left(\int_{E \subseteq \mathbb{R}} |f - 0|^{2} \, d\mu \right)^{1/2}, $$ but we could just as easily pick another function, say $g \in L^{2}(\mathbb{R})$, and write $$ \|f - g\|_{L^{2}} = \left(\int_{E \subseteq \mathbb{R}} |f - g|^{2} \, d\mu \right)^{1/2}, $$ which tells us how far apart these functions are in some sense (and there are many senses which do not coincide with your geometric intuition). Your standard deviation example just follows from the fact that you have the linear functional $\mathbb{E}[\cdot]: L^{2}(\Omega, \mathcal{F}, \mathbb{P}) \to \mathbb{R}$ composed into another functional $\sigma[\cdot]: L^{2}(\Omega, \mathcal{F}, \mathbb{P}) \to \mathbb{R}$ given by $$ \sigma[X] = (\mathbb{E}[(X - \mathbb{E}[X])^{2}])^{1/2}, $$ and this has nothing to do with how vector spaces work; it is just how $\sigma[X]$ is defined. It is the "distance" (or how much "size" is between them) from the constant function whose value is given by the functional $\mathbb{E}[X]$.
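Both points above can be checked numerically. A small sketch in plain Python (empirical norms over samples; the distributions and shift constant are arbitrary choices): the standard deviation is the $L^2$ distance from $X$ to the constant function $\mathbb{E}[X]$, and translating both arguments by the same constant leaves the induced metric unchanged.

```python
import math
import random

def l2_norm(v):
    # empirical L^2 norm: sqrt of the mean of squares (norm w.r.t. the empirical measure)
    return math.sqrt(sum(a * a for a in v) / len(v))

random.seed(2)
n = 10_000
x = [random.gauss(3.0, 2.0) for _ in range(n)]
y = [random.gauss(0.0, 1.0) for _ in range(n)]
mx = sum(x) / n

# std(X) is the distance from X to the constant function with value E[X]
std_x = l2_norm([xi - mx for xi in x])

# translation invariance of the induced metric: ||(X + c) - (Y + c)|| = ||X - Y||
c = 7.0
d1 = l2_norm([a - b for a, b in zip(x, y)])
d2 = l2_norm([(a + c) - (b + c) for a, b in zip(x, y)])
print(std_x, d1, d2)
```

Here `std_x` lands near the population standard deviation ($2$), and `d1` and `d2` agree up to floating-point rounding.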

As for your second question, like I said before, our brains are limited to visualizing in 3 dimensions. Trying to "visualize" in an infinite dimensional Hilbert space simply isn't going to work.

Answered by Wavelet on December 29, 2021
