TransWikia.com

Bayesian inference on mean of statistic from population

Cross Validated Asked by Helmut on December 31, 2020

Suppose that a collection of time intervals $t_i$ have occurred, for $i=1,\dots,n$. These should be considered as samples from a population governed by some distribution. During these time intervals, some event occurs according to a Poisson process with constant known rate $\lambda$, independently across all time intervals. My data consist of the counts $y_1,\dots,y_n$ of the events. For any $i$, I can calculate a posterior distribution $p(\lambda t_i \mid y_i)$ if I have a gamma prior on $\lambda t_i$, say $\text{Gamma}(\alpha_i,\beta_i)$, by:
$$p(\lambda t_i \mid y_i)=\text{Gamma}(y_i+\alpha_i, \beta_i+1)$$
using standard methods for conjugate priors and then, by scaling of gamma distributions:
$$p(t_i \mid y_i)=\text{Gamma}(y_i+\alpha_i, (\beta_i+1)\lambda)$$
My understanding is that if I happened to have additional sets of count data relating to the same $t_i$, I could use Bayesian updating to improve this posterior distribution with these additional data sets.
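The conjugate update and the rescaling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gamma distribution is taken in the shape/rate parameterisation, and the `GammaDist` class, the helper names and all numbers are hypothetical, not part of the question.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GammaDist:
    """Gamma distribution in the shape/rate parameterisation (an assumption)."""
    shape: float
    rate: float

    @property
    def mean(self) -> float:
        return self.shape / self.rate

    @property
    def variance(self) -> float:
        return self.shape / self.rate ** 2


def update_poisson_mean(prior: GammaDist, counts: Iterable[int]) -> GammaDist:
    """Conjugate update for a Poisson mean: each count adds y to the shape and 1 to the rate."""
    counts = list(counts)
    return GammaDist(prior.shape + sum(counts), prior.rate + len(counts))


def rescale_to_interval(mu_dist: GammaDist, lam: float) -> GammaDist:
    """If mu = lam * t ~ Gamma(a, b), then t = mu / lam ~ Gamma(a, b * lam)."""
    return GammaDist(mu_dist.shape, mu_dist.rate * lam)


# Hypothetical numbers, chosen only for illustration.
lam = 2.0
prior = GammaDist(shape=3.0, rate=1.0)

post_mu = update_poisson_mean(prior, [5])   # a single count y_i = 5
post_t = rescale_to_interval(post_mu, lam)  # posterior for t_i itself

# Updating with counts one at a time matches a single batch update.
sequential = prior
for y in [5, 7, 4]:
    sequential = update_poisson_mean(sequential, [y])
batch = update_poisson_mean(prior, [5, 7, 4])

print(post_mu, post_t, post_t.mean)
print(sequential == batch)
```

The last comparison illustrates the point about extra count data for the same $t_i$: every additional count adds its value to the shape and one to the rate, so the order in which the data arrive does not matter.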

But what I have is data $\{y_i\}$ corresponding to different $\{t_i\}$ and I want to find a posterior distribution for the population mean (and variance) of the $\{t_i\}$. How do I do this? I assume hierarchical modelling comes into play, but I am having trouble applying this. For example, what further prior or priors do I need in relation to the $\{t_i\}$?

@Tim. Thanks for your answer. By the way, I have edited my original question to remove some confusion in the notation, but I will continue with the usage in your answer and use $\mu_i$ for the Poisson parameter we are trying to estimate, so that $\mu_i=\lambda t_i$. My follow-up question, given your answer, is as follows.

The conjugate $\text{Gamma}(\alpha,\beta)$ can be considered as the probability distribution of the Poisson parameter given that $\alpha$ counts are observed in $\beta$ intervals. There is no uncertainty about the number of counts represented by $\alpha$, as they are our data (assumed accurate) plus a pseudo-count for the prior. However, there is uncertainty about the intervals represented by the second gamma parameter because, instead of a number $\beta$ of equal intervals, we have a number of unequal intervals, which we represent by giving $\beta$ a distribution, $\beta \sim \text{Gamma}(c,d)$, as in your answer. The parameters $c$ and $d$ are then chosen to reflect our prior beliefs about the mean and variance of the lengths of these intervals. So it seems to me that a hyperprior on the second gamma parameter should be sufficient to reflect the uncertainty in the problem, without a hyperprior on the $\alpha$ parameter. Is this reasonable and, if so, is there a more rigorous way to justify it? Hopefully this would also support the intuition that, as the number of intervals sampled $n\rightarrow\infty$, this hierarchical posterior distribution approaches the Gamma distribution that you would get if all intervals were of an equal size, itself equal to the mean size of the sample intervals. Such relationships would seem hard to demonstrate from a purely computational solution.

One Answer

From your description, it sounds like you have two kinds of random variables: time intervals $t_1,\dots,t_n$ and the counts of events $y_1,\dots,y_n$. The occurrence of events depends on the length of the time intervals, given a known, constant rate $\lambda$, according to a Poisson distribution

$$ y_i \sim \mathcal{P}(\lambda t_i) $$

This means that the two variables are correlated. You are interested in estimating the global mean of the process.
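As a sketch of why the counts and the interval lengths are correlated, assume only that $Y \mid T$ is Poisson with mean $\lambda T$ and that $T$ has finite variance (the capital letters are introduced here for a generic interval and its count). The laws of total expectation, variance and covariance then give

$$ E(Y) = \lambda E(T), \qquad \operatorname{Var}(Y) = \lambda E(T) + \lambda^2 \operatorname{Var}(T), \qquad \operatorname{Cov}(Y, T) = \lambda \operatorname{Var}(T) $$

so the covariance is positive whenever the interval lengths vary at all.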

What you were considering is to estimate the conditional distribution of $t_i \mid y_i$. However, if you think about it, why wouldn't you simply look at the marginal distribution of the $t_i$'s? The causal relation in this case is that the $y_i$'s are caused by the $t_i$'s (they are limited by the length of the intervals), but the intervals are not influenced in any way by the counts. If you grouped the lengths of the intervals by the counts, would it provide any meaningful information?

I'd say that, for your purpose, it is enough to look at the marginal distribution of the $t_i$'s and model it using the most appropriate distribution, e.g. gamma (as you suggested),

$$ t_i \sim \mathcal{G}(\alpha, \beta) $$

Then the global expected value is

$$ E(\lambda T) = \lambda E(T) = \lambda \frac{\alpha}{\beta} $$
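The expected value above can be checked with a small simulation. The sketch below assumes the gamma distribution is parameterised by shape $\alpha$ and rate $\beta$ (so its mean is $\alpha/\beta$); the function names and the particular values of $\alpha$, $\beta$ and $\lambda$ are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import fmean


def sample_poisson(mu: float, rng: random.Random) -> int:
    """Draw from Poisson(mu) with Knuth's multiplication method (fine for small mu)."""
    threshold = math.exp(-mu)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate(alpha: float, beta: float, lam: float, n: int, seed: int = 0):
    """Simulate t_i ~ Gamma(alpha, rate=beta) and y_i ~ Poisson(lam * t_i)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    t = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(n)]
    y = [sample_poisson(lam * ti, rng) for ti in t]
    return t, y


# Hypothetical parameter values, for illustration only.
alpha, beta, lam = 4.0, 2.0, 3.0
t, y = simulate(alpha, beta, lam, n=20_000)

theoretical = lam * alpha / beta  # lambda * E(T)
print("lambda * alpha / beta :", theoretical)
print("lambda * mean(t)      :", lam * fmean(t))
print("mean(y)               :", fmean(y))
```

With a large sample, both empirical averages should land close to $\lambda\alpha/\beta$, which is also the marginal mean of the counts by the law of total expectation.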

Answered by Tim on December 31, 2020
