# Can we estimate the mean of an asymmetric distribution in an unbiased and robust manner?

Cross Validated Asked on December 21, 2020

Suppose I have i.i.d. samples $$X_1, \cdots, X_n$$ from some unknown distribution $$F$$ and I wish to estimate the mean $$\mu=\mu(F)$$ of that distribution and I insist that the estimator be unbiased – i.e., $$\mathbb{E}[T(X_1, \cdots, X_n)] = \mu$$.

The canonical estimator is the sample mean $$\overline{X} = \frac{1}{n} \sum_{i=1}^n X_i$$. This is always unbiased and for many families of distributions, such as Gaussians, it is optimal or near-optimal in terms of variance.

However, the sample mean is not robust. In particular, the sample mean can change arbitrarily if a single $$X_i$$ is changed. This means it has a breakdown point of 0.

A more robust estimator is the sample median. Changing a few data points will not, for most samples, significantly change the median. This has a breakdown point of 0.5, which is the highest possible.
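The two breakdown points can be illustrated with a small experiment. This is only a sketch: the helper `corrupt`, the Gaussian sample, the sample size of 101 and the outlier value `1e9` are all illustrative assumptions, not part of the question.

```python
import numpy as np

def corrupt(sample, n_bad, value):
    """Return a copy of `sample` with its first `n_bad` entries replaced by `value`."""
    out = np.array(sample, dtype=float)
    out[:n_bad] = value
    return out

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=101)  # clean sample, true mean 0

# A single corrupted point drags the mean arbitrarily far; the median barely moves.
one_bad = corrupt(x, 1, 1e9)
print("1 outlier:   mean =", np.mean(one_bad), " median =", np.median(one_bad))

# Once more than half of the points are corrupted, the median breaks down too.
half_bad = corrupt(x, 51, 1e9)
print("51 outliers: median =", np.median(half_bad))
```

Replacing one value out of 101 is enough to move the mean by about `1e9 / 101`, whereas the median only follows the outliers once they form a majority of the sample.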

For Gaussian data, the sample median has higher variance than the sample mean (by a factor of $$\pi/2$$). However, for other distributions, such as the Laplace distribution or Student’s $$t$$-distribution, the median actually has lower variance than the mean.
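These variance comparisons can be checked by simulation. The following is a rough Monte Carlo sketch; the function name `variance_ratio`, the sample size and the number of replications are arbitrary choices made for illustration, and the $$\pi/2$$ factor is an asymptotic value, so finite-sample ratios will only be close to it.

```python
import numpy as np

def variance_ratio(sampler, n=101, reps=5000):
    """Monte Carlo estimate of Var(sample median) / Var(sample mean).

    `sampler(size)` must return an array of i.i.d. draws with the given shape.
    """
    draws = sampler((reps, n))
    return np.var(np.median(draws, axis=1)) / np.var(np.mean(draws, axis=1))

rng = np.random.default_rng(1)
gauss_ratio = variance_ratio(lambda size: rng.normal(size=size))
laplace_ratio = variance_ratio(lambda size: rng.laplace(size=size))

print("Gaussian ratio:", gauss_ratio, " (asymptotic value pi/2 =", np.pi / 2, ")")
print("Laplace ratio: ", laplace_ratio, " (below 1: the median wins)")
```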

Furthermore, the median is always unbiased if the distribution is symmetric (about its mean). Many natural distributions are symmetric, but many are not, such as the following examples.

My question is: Are there robust and unbiased estimators for the means of natural asymmetric distributions? By robust I simply mean a non-zero breakdown point and by natural I mean something from the above list or similar (just not a concocted example). I can’t find any examples. I would be particularly interested in the Binomial case.

This is not an unbiased estimate, but it is consistent (the bias approaches zero as the sample size grows).

You can take a trimmed sample (remove the highest and lowest values) and use the mean of the trimmed sample as the estimate.

If the distribution is known, you might apply an appropriate scaling to make the estimate less biased (or not biased at all); otherwise, the bias will simply decrease as you take larger samples.
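A minimal sketch of this idea, assuming exponential data with mean 1 as the asymmetric example (the function names `trimmed_mean` and `mc_bias`, the trimming count `k` and the sample sizes are illustrative choices, not part of the answer):

```python
import numpy as np

def trimmed_mean(x, k=1):
    """Mean of x (along the last axis) after dropping the k smallest and k largest values."""
    x = np.sort(np.asarray(x, dtype=float), axis=-1)
    return x[..., k:x.shape[-1] - k].mean(axis=-1)

rng = np.random.default_rng(2)

def mc_bias(n, reps=10000):
    """Monte Carlo bias of the trimmed mean for Exponential(1) samples of size n."""
    draws = rng.exponential(scale=1.0, size=(reps, n))
    return trimmed_mean(draws).mean() - 1.0  # the true mean is 1

bias_small = mc_bias(10)
bias_large = mc_bias(200)
print("bias at n=10: ", bias_small)
print("bias at n=200:", bias_large)
```

On right-skewed data the trimmed mean is biased downward, and with a fixed number of trimmed points the bias shrinks as the sample grows. Note that trimming a fixed number `k` of points from each end tolerates only `k` outliers, so the fraction of tolerated contamination, `k / n`, also shrinks with the sample size.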

Answered by Sextus Empiricus on December 21, 2020

As whuber already said, one way to answer your question is to de-bias your estimator. If the robust estimator is biased, you may be able to subtract its theoretical bias (computed under a parametric model). There is some work that tries to do this, or to subtract an approximation of the bias (I don't remember a reference, but I could search for one if you are interested). For instance, think about the empirical median in an exponential model: we can compute its expectation and then subtract the resulting bias. If you want, I can do the computation; it is rather simple. This becomes more difficult if the estimator is more complicated than the median, and it works only in parametric models.
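As a sketch of the computation for the exponential example, assume (these assumptions are added here for illustration) that $$X_1, \cdots, X_n$$ are i.i.d. exponential with rate $$\lambda$$ and that $$n = 2m+1$$ is odd, so that the sample median is the order statistic $$X_{(m+1)}$$. Using the standard fact that the spacings between exponential order statistics are independent exponentials, one gets

$$
\mathbb{E}\left[X_{(m+1)}\right] = \frac{1}{\lambda} \sum_{j=m+1}^{2m+1} \frac{1}{j} = \frac{c_m}{\lambda},
\qquad
c_m := \sum_{j=m+1}^{2m+1} \frac{1}{j},
\qquad
\mathbb{E}[X] = \frac{1}{\lambda}.
$$

Because the exponential family is a scale family, the bias is proportional to the unknown mean, so the natural correction here is a rescaling rather than a literal subtraction:

$$
\mathbb{E}\left[\frac{X_{(m+1)}}{c_m}\right] = \frac{1}{\lambda} = \mathbb{E}[X].
$$

For example, $$c_1 = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}$$ for $$n = 3$$, and $$c_m \to \ln 2$$ as $$m \to \infty$$. The rescaled median is unbiased only if the data really are exponential.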

A perhaps less ambitious question is whether we can construct a consistent robust estimator. This we can do, but we have to be careful about what we call robust.

If your definition of robust is having a non-zero asymptotic breakdown point, then we can already prove that this is impossible. Suppose that your estimator is called $$T_n$$ and that it converges to $$\mathbb{E}[X]$$. If $$T_n$$ has a non-zero breakdown point, then a fraction $$\varepsilon>0$$ of the data can be arbitrarily bad and nonetheless $$T_n$$ will not be arbitrarily large. But this cannot be: in the limit, having a fraction of outliers in the data means that with probability $$1-\varepsilon$$, $$X$$ is sampled from the target distribution $$P$$, and with probability $$\varepsilon$$, $$X$$ is arbitrary. This makes $$\mathbb{E}[X]$$ arbitrary as well (if you want me to put it formally, I can), which contradicts the non-zero asymptotic breakdown point of $$T_n$$.
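One possible way to write the key step, given only as a sketch (the point-mass contamination $$\delta_x$$ and the notation $$P_{\varepsilon,x}$$ are choices made here, not taken from the answer), is to contaminate $$P$$ with a point mass at $$x$$:

$$
P_{\varepsilon,x} = (1-\varepsilon)\,P + \varepsilon\,\delta_x,
\qquad
\mathbb{E}_{P_{\varepsilon,x}}[X] = (1-\varepsilon)\,\mathbb{E}_P[X] + \varepsilon\,x \xrightarrow[x \to \infty]{} \infty .
$$

An estimator that is consistent for the mean under every such $$P_{\varepsilon,x}$$ has to follow $$\mathbb{E}_{P_{\varepsilon,x}}[X]$$, so in the limit it cannot stay bounded when a fraction $$\varepsilon$$ of the data is moved to $$x$$.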

Finally, to conclude, we can take the non-asymptotic point of view: we do not care about the asymptotic breakdown point, and what matters is either the non-asymptotic breakdown point (something like a breakdown point of $$1/\sqrt{n}$$) or efficiency on heavy-tailed data.

In this case, there are robust and consistent estimators of $$\mathbb{E}[X]$$. For instance, we can use Huber's estimator with a parameter that goes to infinity, or we can use the median-of-means estimator with a number of blocks that tends to infinity. References for this line of thought are "Challenging the empirical mean and empirical variance: A deviation study" by Olivier Catoni and "Sub-Gaussian mean estimators" by Devroye et al. (these references come from the theoretical community, so they may be difficult if you are not familiar with empirical processes and concentration inequalities).
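A minimal sketch of the median-of-means idea follows. The function name, its signature and the optional shuffling step are assumptions made for illustration; how the number of blocks should grow with the sample size is a tuning choice discussed in the references, not something this sketch settles.

```python
import numpy as np

def median_of_means(x, n_blocks, rng=None):
    """Split x into n_blocks groups, average each group, return the median of the averages.

    If `rng` (a numpy Generator) is given, the data are shuffled first so that
    the blocks do not depend on the original ordering.
    """
    x = np.asarray(x, dtype=float)
    if not 1 <= n_blocks <= x.size:
        raise ValueError("n_blocks must be between 1 and len(x)")
    if rng is not None:
        x = rng.permutation(x)
    blocks = np.array_split(x, n_blocks)  # block sizes differ by at most one
    return float(np.median([block.mean() for block in blocks]))

# One wild value ruins a single block mean, but not the median of the block means.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 1e9]
print(median_of_means(data, n_blocks=3))  # blocks: [1, 2], [3, 4], [5, 1e9]
print(median_of_means(data, n_blocks=1))  # one block: the plain sample mean
```

With a single block the estimator is just the sample mean, and with as many blocks as data points it is the sample median, so the number of blocks trades robustness against bias for asymmetric distributions.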

Answered by TMat on December 21, 2020
