
What's the probability that the cumulative average of multiple Gaussian variables exceeds a certain value?

Asked by deppep on January 19, 2021

Consider a number $n$ of independent, normally distributed variables $Y_1$, $Y_2$, …, $Y_n$.
Consider also the $n$ variables defined by the average of the last $h$ terms, $X_h = \frac{\sum^{n}_{k = n-h+1} Y_k}{h}$ for $h \in \{1, \dots, n\}$. The variables $\sqrt{h}X_h$ are normally distributed.

I'd like to understand the probability $P\{\sqrt{h}X_h > T \text{ or } \sqrt{h-1}X_{h-1} > T \text{ or } \dots \text{ or } X_1 > T\}$, in words, the probability that one or more of the $\sqrt{h}X_h$ exceeds a threshold value $T$.

Do you have any reference or suggestions about this problem?
I can work out the $n = 2$ case geometrically, working on the expression $P\{Y_1 + Y_2 > \sqrt{2}T \,|\, Y_1 < T\}$. However, I soon get lost in integral hell as the dimension increases.

Thank you


EDIT:

I add my solution for $n = 2$. As remarked by @ConMan, the probability we are looking for equals one minus the probability that all of the $\sqrt{h}X_h$ stay below the threshold $T$, so:

$$ P = 1 - p(Y_1 + Y_2 < \sqrt{2}T \,|\, Y_1 < T)\, p(Y_1 < T) $$

We can calculate the term $p(Y_1 + Y_2 < \sqrt{2}T \,|\, Y_1 < T)$ geometrically (see picture), through:

$$ p(Y_1 + Y_2 < \sqrt{2}T \,|\, Y_1 < T) = \int_{-\infty}^{\sqrt{2}T - T} \mathrm{d}x\, N(x) + \int^{\infty}_{\sqrt{2}T - T} \mathrm{d}x\, N(x)\Big[ 1 - \int^{T}_{\sqrt{2}T - x} \mathrm{d}y\, N(y)\Big] = \\
= 1 - \int^{\infty}_{\sqrt{2}T - T} \mathrm{d}x\, N(x)\Big( \Phi(T) - \Phi(\sqrt{2}T - x) \Big) = \\
= 1 - \Phi(T)\big(1 - \Phi(\sqrt{2}T - T)\big) + \int^{\infty}_{\sqrt{2}T - T} \mathrm{d}x\, N(x)\, \Phi(\sqrt{2}T - x) $$

Denoting by $H(T)$ the last integral term, the probability we are looking for becomes:

$$P = 1 - \Phi(T)\Big( 1 - \Phi(T)\big(1 - \Phi(\sqrt{2}T - T)\big) + H(T) \Big)$$

If $T = 3$, we find a probability $P \approx 0.25\%$, which is larger than the single-variable tail probability $1 - \Phi(T)$ by a factor $\frac{P(T)}{1-\Phi(T)} \approx 1.823$.

Here $\Phi$ and $N$ denote the standard normal cumulative distribution function and probability density function, respectively.
This seems to be confirmed by simulations in Python:

import scipy.stats as sts
import numpy as np

N = 1000000                      # number of Monte Carlo trials
M = 2                            # number of variables per trial (the n = 2 case)
host_matrix = np.zeros((N, M))   # row n holds sqrt(h)*X_h for h = 1, ..., M

for n in range(N):
    arr = sts.norm(0).rvs(M)                              # M i.i.d. standard normal draws
    wind = np.cumsum(arr) / np.sqrt(np.arange(M) + 1)     # (Y_1 + ... + Y_h) / sqrt(h)
    host_matrix[n] = wind

Here is a routine for calculating the value of the formula above:

def fp_prb(thr):
    import scipy.integrate as integrate
    def hard_term_f(x, T):
        # integrand of H(T): N(x) * Phi(sqrt(2)*T - x)
        return sts.norm.pdf(x)*sts.norm.cdf(np.sqrt(2)*T - x)

    # H(T): numerical integration from (sqrt(2) - 1)*T to +infinity
    hard_term = integrate.quad(lambda x: hard_term_f(x, thr), thr*(np.sqrt(2) - 1), +np.inf)[0]
    F_t = sts.norm.cdf(thr)                        # Phi(T)
    F_sqrt = sts.norm.cdf(thr*(np.sqrt(2) - 1))    # Phi(sqrt(2)*T - T)

    prb = 1 - F_t*( 1 - F_t*(1 - F_sqrt) + hard_term )
    return prb
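
For example, evaluating it at the threshold used above gives the analytic prediction (a minimal usage sketch; the exact printed digits depend on the quadrature, but it should reproduce the ratio of about 1.82 quoted earlier):

thr = 3
P_analytic = fp_prb(thr)
print('Analytic P: {:.4f}'.format(P_analytic))
print('Ratio to 1 - Phi(T): {:.3f}'.format(P_analytic / (1 - sts.norm.cdf(thr))))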

The observed frequency and probability ratio can be computed through:

thr = 3

# trials where neither sqrt(1)*X_1 nor sqrt(2)*X_2 crossed the threshold
no_fp = len(np.argwhere((host_matrix[:,0] < thr) & (host_matrix[:,1] < thr)))
fp_freq = 1 - no_fp/N                 # frequency of at least one crossing
norm_freq = 1 - sts.norm.cdf(thr)     # single-variable tail probability 1 - Phi(T)
print('FP frequency: {:.4f}'.format(fp_freq))
print('Normal detections frequency: {:.4f}'.format(norm_freq))
print('Ratio: {:.3f}'.format(fp_freq/norm_freq))

An example output:

FP frequency: 0.0025
Normal detections frequency: 0.0013
Ratio: 1.841

One Answer

I'm going to do all the calculations assuming you're taking the average of the first $h$ variables rather than the last, because it makes the algebra a little neater without really changing the results. Also, I'm going to assume you always look at the full sequence of $n$ variables, since you can always just drop off the ones you're not working with and set $n = h$ to get the same effect.

The first step is to flip that probability around. Instead of looking at "the probability that at least one is greater", instead look at "the probability that all of them are less", i.e. $P(X_1 \leq T \land X_2 \leq T \land \ldots \land X_h \leq T)$, because these two probabilities are complements and hence will add to 1, but this one is easier to work with.

Second, let's write everything in vectors.

$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & \ldots & 0 \\ \frac{1}{2} & \frac{1}{2} & 0 & \ldots & 0 \\ & & & \ddots & \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \ldots & \frac{1}{n} \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \mathbf{MY}$

So our probability becomes $P(\mathbf{X} \leq T\mathbf{1}_n) = P(\mathbf{MY} \leq T\mathbf{1}_n) = P(\mathbf{Y} \leq T\mathbf{M}^{-1}\mathbf{1}_n)$.

So now we just have to invert that matrix. I am going to propose, and suggest that you prove for yourself, that:

$\mathbf{M}^{-1} = \begin{bmatrix} 1 & 0 & 0 & \ldots & 0 & 0 \\ -1 & 2 & 0 & \ldots & 0 & 0 \\ 0 & -2 & 3 & \ldots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \ldots & -(n-1) & n \end{bmatrix}$

Which, neatly, means that $\mathbf{M}^{-1}\mathbf{1}_n = \mathbf{1}_n$, and so our probability is literally just $P(\mathbf{X} \leq T\mathbf{1}_n) = P(\mathbf{Y} \leq T\mathbf{1}_n) = P(Y_1 \leq T \land Y_2 \leq T \land \ldots \land Y_n \leq T) = P(Y_1 \leq T)P(Y_2 \leq T) \ldots P(Y_n \leq T) = \Phi(T)^n$ because the $Y_i$ are all independent.
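
As a quick numerical sanity check of the proposed inverse (a minimal Python sketch; the matrices are built exactly as described above, for an arbitrary choice of $n$):

import numpy as np

n = 6
# M: row h holds 1/h in its first h columns (averages of the first h terms)
M = np.tril(np.ones((n, n))) / np.arange(1, n + 1)[:, None]

# Proposed inverse: diagonal 1, 2, ..., n and sub-diagonal -1, -2, ..., -(n-1)
M_inv = np.diag(np.arange(1, n + 1)).astype(float)
M_inv[np.arange(1, n), np.arange(n - 1)] = -np.arange(1, n)

print(np.allclose(M @ M_inv, np.eye(n)))   # True: it is indeed the inverse of M
print(M_inv @ np.ones(n))                  # a vector of ones, i.e. M^{-1} 1_n = 1_n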

So, the probability that all of the averages are less than the threshold is $Phi(T)^n$, meaning that the probability that at least one crosses the threshold is $1 - Phi(T)^n$.

Answered by ConMan on January 19, 2021
