
What's the probability that the cumulative average of multiple Gaussian variables exceeds a certain value?

Asked by deppep on January 19, 2021

Consider a number $n$ of independent, normally distributed variables $Y_1$, $Y_2$, …, $Y_n$.
Consider also the $n$ variables defined by the average of the last $h$ terms, $X_h = \frac{\sum^{n}_{k = n-h+1} Y_k}{h}$ for $h \in \{1, \dots, n\}$. The variables $\sqrt{h}X_h$ are normally distributed.

I'd like to understand the probability $P\{\sqrt{h}X_h > T \text{ or } \sqrt{h-1}X_{h-1} > T \text{ or } \dots \text{ or } X_1 > T\}$, in words, the probability that one or more of the $\sqrt{h}X_h$ exceeds a threshold value $T$.

Do you have any reference or suggestions about this problem?
I can work out the $n = 2$ case geometrically, working on the expression $P\{Y_1 + Y_2 > \sqrt{2}T \,|\, Y_1 < T\}$. However, I soon get lost in integral hell as the dimension increases.

Thank you


EDIT:

I add my solution for $n = 2$. As remarked by @ConMan, the probability we are looking for equals one minus the probability that all of the $\sqrt{h}X_h$ stay below the threshold $T$, so:

$$ P = 1 - p(Y_1 + Y_2 < \sqrt{2}T \,|\, Y_1 < T)\, p(Y_1 < T) $$

We can calculate the term $p(Y_1 + Y_2 < \sqrt{2}T \,|\, Y_1 < T)$ geometrically (see picture), through:

$$ p(Y_1 + Y_2 < \sqrt{2}T \,|\, Y_1 < T) = \int_{-\infty}^{\sqrt{2}T - T} \mathrm{d}x\, N(x) + \int^{\infty}_{\sqrt{2}T - T} \mathrm{d}x\, N(x)\Big[ 1 - \int^{T}_{\sqrt{2}T - x} \mathrm{d}y\, N(y)\Big] = \\
= 1 - \int^{\infty}_{\sqrt{2}T - T} \mathrm{d}x\, N(x)\Big( \Phi(T) - \Phi(\sqrt{2}T - x) \Big) = \\
= 1 - \Phi(T)\big(1 - \Phi(\sqrt{2}T - T)\big) + \int^{\infty}_{\sqrt{2}T - T} \mathrm{d}x\, N(x)\, \Phi(\sqrt{2}T - x) $$

Denoting by $H(T)$ the last integral term, the probability we are looking for becomes:

$$P = 1 - \Phi(T)\Big( 1 - \Phi(T)\big(1 - \Phi(\sqrt{2}T - T)\big) + H(T) \Big)$$

If $T = 3$, we find a probability $P \approx 0.25\%$, which is larger than the single-variable tail probability $1 - \Phi(T)$ by a factor $\frac{P(T)}{1-\Phi(T)} \approx 1.823$.

Here $\Phi$ and $N$ denote the standard normal cumulative distribution function and probability density function, respectively.
This seems to be confirmed by simulations in Python:

import scipy.stats as sts
import numpy as np

N = 1000000                      # number of Monte Carlo trials
M = 2                            # number of variables per trial (the n = 2 case)
host_matrix = np.zeros((N, M))   # row n holds sqrt(h)*X_h for h = 1, ..., M

for n in range(N):
    arr = sts.norm(0).rvs(M)                              # M i.i.d. standard normal draws
    wind = np.cumsum(arr) / np.sqrt(np.arange(M) + 1)     # (Y_1 + ... + Y_h) / sqrt(h)
    host_matrix[n] = wind

Here is a routine for calculating the value of the formula above:

def fp_prb(thr):
    import scipy.integrate as integrate
    def hard_term_f(x, T):
        # integrand of H(T): N(x) * Phi(sqrt(2)*T - x)
        return sts.norm.pdf(x)*sts.norm.cdf(np.sqrt(2)*T - x)

    # H(T): numerical integration from (sqrt(2) - 1)*T to +infinity
    hard_term = integrate.quad(lambda x: hard_term_f(x, thr), thr*(np.sqrt(2) - 1), +np.inf)[0]
    F_t = sts.norm.cdf(thr)                        # Phi(T)
    F_sqrt = sts.norm.cdf(thr*(np.sqrt(2) - 1))    # Phi(sqrt(2)*T - T)

    prb = 1 - F_t*( 1 - F_t*(1 - F_sqrt) + hard_term )
    return prb
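
For example, evaluating it at the threshold used above gives the analytic prediction (a minimal usage sketch; the exact printed digits depend on the quadrature, but it should reproduce the ratio of about 1.82 quoted earlier):

thr = 3
P_analytic = fp_prb(thr)
print('Analytic P: {:.4f}'.format(P_analytic))
print('Ratio to 1 - Phi(T): {:.3f}'.format(P_analytic / (1 - sts.norm.cdf(thr))))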

The observed frequency and probability ratio can be computed through:

thr = 3

# trials where neither sqrt(1)*X_1 nor sqrt(2)*X_2 crossed the threshold
no_fp = len(np.argwhere((host_matrix[:,0] < thr) & (host_matrix[:,1] < thr)))
fp_freq = 1 - no_fp/N                 # frequency of at least one crossing
norm_freq = 1 - sts.norm.cdf(thr)     # single-variable tail probability 1 - Phi(T)
print('FP frequency: {:.4f}'.format(fp_freq))
print('Normal detections frequency: {:.4f}'.format(norm_freq))
print('Ratio: {:.3f}'.format(fp_freq/norm_freq))

An example output:

FP frequency: 0.0025
Normal detections frequency: 0.0013
Ratio: 1.841

One Answer

I'm going to do all the calculations assuming you're taking the average of the first $h$ variables rather than the last, because it makes the algebra a little neater without really changing the results. Also, I'm going to assume you always look at the full sequence of $n$ variables, since you can always just drop off the ones you're not working with and set $n = h$ to get the same effect.

The first step is to flip that probability around. Instead of looking at "the probability that at least one is greater", instead look at "the probability that all of them are less", i.e. $P(X_1 \leq T \land X_2 \leq T \land \ldots \land X_h \leq T)$, because these two probabilities are complements and hence will add to 1, but this one is easier to work with.

Second, let's write everything in vectors.

$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & \ldots & 0 \\ \frac{1}{2} & \frac{1}{2} & 0 & \ldots & 0 \\ & & & \ddots & \\ \frac{1}{n} & \frac{1}{n} & \frac{1}{n} & \ldots & \frac{1}{n} \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \mathbf{MY}$

So our probability becomes $P(\mathbf{X} \leq T\mathbf{1}_n) = P(\mathbf{MY} \leq T\mathbf{1}_n) = P(\mathbf{Y} \leq T\mathbf{M}^{-1}\mathbf{1}_n)$.

So now we just have to invert that matrix. I am going to propose, and suggest that you prove for yourself, that:

$\mathbf{M}^{-1} = \begin{bmatrix} 1 & 0 & 0 & \ldots & 0 & 0 \\ -1 & 2 & 0 & \ldots & 0 & 0 \\ 0 & -2 & 3 & \ldots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \ldots & -(n-1) & n \end{bmatrix}$

Which, neatly, means that $\mathbf{M}^{-1}\mathbf{1}_n = \mathbf{1}_n$, and so our probability is literally just $P(\mathbf{X} \leq T\mathbf{1}_n) = P(\mathbf{Y} \leq T\mathbf{1}_n) = P(Y_1 \leq T \land Y_2 \leq T \land \ldots \land Y_n \leq T) = P(Y_1 \leq T)P(Y_2 \leq T) \ldots P(Y_n \leq T) = \Phi(T)^n$ because the $Y_i$ are all independent.
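
As a quick numerical sanity check of the proposed inverse (a minimal Python sketch; the matrices are built exactly as described above, for an arbitrary choice of $n$):

import numpy as np

n = 6
# M: row h holds 1/h in its first h columns (averages of the first h terms)
M = np.tril(np.ones((n, n))) / np.arange(1, n + 1)[:, None]

# Proposed inverse: diagonal 1, 2, ..., n and sub-diagonal -1, -2, ..., -(n-1)
M_inv = np.diag(np.arange(1, n + 1)).astype(float)
M_inv[np.arange(1, n), np.arange(n - 1)] = -np.arange(1, n)

print(np.allclose(M @ M_inv, np.eye(n)))   # True: it is indeed the inverse of M
print(M_inv @ np.ones(n))                  # a vector of ones, i.e. M^{-1} 1_n = 1_n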

So, the probability that all of the averages are less than the threshold is $Phi(T)^n$, meaning that the probability that at least one crosses the threshold is $1 - Phi(T)^n$.

Answered by ConMan on January 19, 2021
