Can we estimate the mean of an asymmetric distribution in an unbiased and robust manner?

Cross Validated Asked on December 21, 2020

Suppose I have i.i.d. samples $X_1, \cdots, X_n$ from some unknown distribution $F$ and I wish to estimate the mean $\mu=\mu(F)$ of that distribution and I insist that the estimator be unbiased, i.e., $\mathbb{E}[T(X_1, \cdots, X_n)] = \mu$.

The canonical estimator is the sample mean $\overline{X} = \frac{1}{n} \sum_{i=1}^n X_i$. This is always unbiased and for many families of distributions, such as Gaussians, it is optimal or near-optimal in terms of variance.

However, the sample mean is not robust. In particular, the sample mean can change arbitrarily if a single $X_i$ is changed. This means it has a breakdown point of 0.

A more robust estimator is the sample median. Changing a few data points will not, for most samples, significantly change the median. This has a breakdown point of 0.5, which is the highest possible.
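To make the two breakdown points concrete, here is a small sketch using only Python's standard library. The sample values and the size of the corruption are made up for illustration.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
clean = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3]

# Corrupt a single observation with an arbitrarily large value.
corrupted = clean.copy()
corrupted[-1] = 1e9

# How far does each estimator move when one point is replaced?
mean_shift = abs(statistics.fmean(corrupted) - statistics.fmean(clean))
median_shift = abs(statistics.median(corrupted) - statistics.median(clean))

print(f"shift in mean:   {mean_shift:.3g}")
print(f"shift in median: {median_shift:.3g}")
```

The mean moves by roughly the size of the corruption divided by $n$, so it can be pushed anywhere, while the median of this sample does not move at all.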

For Gaussian data, the sample median has higher variance than the sample mean (asymptotically, by a factor of $\pi/2$). However, for other distributions, such as the Laplace distribution or Student’s $t$-distribution with few degrees of freedom, the median actually has lower variance than the mean.
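The Gaussian variance ratio can be checked roughly by simulation. The sketch below is only a Monte Carlo illustration: the function name `variance_ratio`, the sample size, the number of replications and the seed are all arbitrary choices, and the estimate fluctuates around the asymptotic value.

```python
import random
import statistics

def variance_ratio(n=101, reps=2000, seed=0):
    """Monte Carlo estimate of Var(sample median) / Var(sample mean)
    for standard Gaussian samples of size n."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(medians) / statistics.variance(means)

ratio = variance_ratio()
print(f"estimated ratio: {ratio:.3f}  (pi/2 is about 1.571)")
```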

Furthermore, the median is always unbiased if the distribution is symmetric (about its mean). Many natural distributions are symmetric, but many are not, such as the following examples.

  1. Binomial
  2. Poisson
  3. log-Normal
  4. Gamma
  5. F-distribution
  6. Geometric distribution

My question is: Are there robust and unbiased estimators for the means of natural asymmetric distributions? By robust I simply mean a non-zero breakdown point and by natural I mean something from the above list or similar (just not a concocted example). I can’t find any examples. I would be particularly interested in the Binomial case.

2 Answers

This is not an unbiased estimate, but it is consistent (the bias approaches zero as the sample size grows).

You can take a trimmed sample (remove the highest and lowest values) and use the mean of the trimmed sample as the estimate.

If the distribution is known, you can apply an appropriate scaling to make the estimate less biased (or not biased at all); otherwise, the bias will simply decrease as you take larger samples.
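A minimal sketch of such a trimmed mean, assuming that a fixed number `k` of values is dropped from each end of the sorted sample. The helper name `trimmed_mean`, the parameter `k` and the example data are illustrative, not part of the answer.

```python
import statistics

def trimmed_mean(data, k=1):
    """Mean of the sample after dropping the k smallest and k largest values."""
    if k < 0 or 2 * k >= len(data):
        raise ValueError("k must satisfy 0 <= 2*k < len(data)")
    ordered = sorted(data)
    return statistics.fmean(ordered[k:len(ordered) - k])

# Hypothetical count data; the last value is a gross error.
sample = [1, 0, 2, 1, 3, 1, 0, 2, 1000]
print(trimmed_mean(sample, k=1))   # outlier is discarded
print(statistics.fmean(sample))    # outlier dominates
```

With `k` fixed, the estimator tolerates up to `k` arbitrary values on each side, and the fraction of the sample that is discarded shrinks as the sample grows.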

Answered by Sextus Empiricus on December 21, 2020

As whuber already said, one way to answer your question is to de-bias your estimator. If the robust estimator is biased, you may be able to subtract the theoretical bias (according to a parametric model). There is some work that tries to do this, or to subtract an approximation of the bias (I don't remember a reference, but I could search for one if you are interested). For instance, think about the empirical median in an exponential model: we can compute its expectation and then correct for the resulting bias. I can write out the computation if you want; it is rather simple. This becomes more difficult if the estimator is more complicated than the median, and it works only in parametric models.
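One way to carry out the exponential example is sketched below, assuming an odd sample size $n = 2m+1$. The median is then the $(m+1)$-th order statistic, and for exponential data its expectation is the mean times $\sum_{i=m+1}^{n} 1/i$. Because the bias is proportional to the unknown mean, the correction used here is a rescaling rather than a literal subtraction. The function names are hypothetical.

```python
from fractions import Fraction
import math
import statistics

def median_scale_exponential(n):
    """E[sample median] / mean for n i.i.d. exponential draws, n odd.

    With n = 2m + 1 the median is the (m+1)-th order statistic, whose
    expectation is mean * sum_{i=m+1}^{n} 1/i.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch only handles odd n >= 1")
    m = (n - 1) // 2
    return sum(Fraction(1, i) for i in range(m + 1, n + 1))

def debiased_median_exponential(sample):
    """Rescaled median: unbiased for the mean only if the data are exponential."""
    c = median_scale_exponential(len(sample))
    return statistics.median(sample) / float(c)

print(median_scale_exponential(3))            # 5/6
print(float(median_scale_exponential(1001)))  # approaches log(2) as n grows
print(math.log(2))
```

The rescaled median keeps the breakdown point of the median, but it is unbiased only under the assumed exponential model.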

A perhaps less ambitious question is whether we can construct a consistent robust estimator. We can, but we have to be careful about what we call robust.

If your definition of robust is having a non-zero asymptotic breakdown point, then we can already prove that this is impossible. Suppose that your estimator is called $T_n$ and that it converges to $\mathbb{E}[X]$. $T_n$ has a non-zero breakdown point, which means that a fraction $\varepsilon>0$ of the data can be arbitrarily bad and nonetheless $T_n$ will not become arbitrarily large. But this cannot be: in the limit, a fraction $\varepsilon$ of outliers translates to the following: with probability $1-\varepsilon$, $X$ is sampled from the target distribution $P$, and with probability $\varepsilon$, $X$ is arbitrary. This makes $\mathbb{E}[X]$ arbitrary as well (if you want me to put it formally, I can), which contradicts the non-zero breakdown point of $T_n$.
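The limiting argument can be written out as follows. The contaminating distribution $Q$ and the mixture $P_\varepsilon$ are notation introduced here for illustration:

$$
P_\varepsilon = (1-\varepsilon)\,P + \varepsilon\,Q,
\qquad
\mathbb{E}_{P_\varepsilon}[X] = (1-\varepsilon)\,\mathbb{E}_{P}[X] + \varepsilon\,\mathbb{E}_{Q}[X].
$$

For any fixed $\varepsilon>0$, choosing $Q$ with $\mathbb{E}_{Q}[X]$ arbitrarily large makes $\mathbb{E}_{P_\varepsilon}[X]$ arbitrarily large. An estimator that is consistent for the mean under every $P_\varepsilon$ must follow it, whereas a non-zero asymptotic breakdown point would require its limit to stay bounded.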

Finally, to conclude on this, we can take the non-asymptotic point of view: we don't care about the asymptotic breakdown point, and what matters is either the non-asymptotic breakdown point (something like a breakdown point of $1/\sqrt{n}$) or being efficient on heavy-tailed data.

In this case, there are robust and consistent estimators of $\mathbb{E}[X]$. For instance, we can use Huber's estimator with a parameter that goes to infinity, or we can use the median-of-means estimator with a number of blocks that tends to infinity. References for this line of thought are "Challenging the empirical mean and empirical variance: A deviation study" by Olivier Catoni and "Sub-Gaussian mean estimators" by Devroye et al. (these references come from the theoretical community and may be complicated if you are not familiar with empirical processes and concentration inequalities).
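A minimal sketch of the median-of-means idea, using only the standard library. The function name, the round-robin split, the random shuffle and the `seed` parameter are implementation choices made here for illustration; the cited papers should be consulted for how the number of blocks should grow with $n$.

```python
import random
import statistics

def median_of_means(data, n_blocks, seed=0):
    """Shuffle, split into n_blocks groups, average each group, take the median."""
    if not 1 <= n_blocks <= len(data):
        raise ValueError("n_blocks must be between 1 and len(data)")
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    # Round-robin split so that block sizes differ by at most one.
    blocks = [shuffled[i::n_blocks] for i in range(n_blocks)]
    return statistics.median(statistics.fmean(b) for b in blocks)

# Hypothetical data: 999 exponential draws with mean 1, plus one gross outlier.
rng = random.Random(1)
sample = [rng.expovariate(1.0) for _ in range(999)] + [1e9]

print(median_of_means(sample, n_blocks=10))  # stays near 1
print(statistics.fmean(sample))              # dominated by the outlier
```

A single corrupted point can spoil only the block that contains it, so the estimate stays bounded as long as fewer than half of the blocks are contaminated. With one block the estimator reduces to the sample mean.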

Answered by TMat on December 21, 2020
