Solved – Mean has lower standard error than 5% trimmed mean

bootstrap, mean, robust, trimmed-mean

I'm investigating using a trimmed mean to measure the location of various distributions. The distributions sometimes are heavily contaminated and sometimes not. Usually they follow something similar to a log-normal or possibly mixed log-normal distribution, but often the data is "all over the place".

I've looked at the mean, 5% trimmed mean, 10% trimmed mean and 20% trimmed mean. For each I estimate the standard error using the bootstrap.
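For concreteness, the procedure described above might look roughly like the following. This is only a sketch under assumptions: the post does not say how the trimming or the bootstrap were implemented, so the helper names (`trimmed_mean`, `bootstrap_se`), the number of resamples, and the reading of "5% trimmed" as 5% cut from each tail are all illustrative choices, and the log-normal sample is made-up data.

```python
import numpy as np

def trimmed_mean(x, proportion):
    """Mean after dropping `proportion` of the sorted values from EACH tail.

    Assumption: "5% trimmed" means 5% from each end; the post does not say.
    """
    x = np.sort(np.asarray(x, dtype=float))
    k = int(np.floor(proportion * x.size))  # number of points cut per tail
    return x[k:x.size - k].mean()

def bootstrap_se(x, estimator, n_boot=2000, rng=None):
    """Nonparametric bootstrap estimate of the standard error of `estimator`."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the data with replacement, same size as the original sample
        resample = rng.choice(x, size=x.size, replace=True)
        stats[b] = estimator(resample)
    # The spread of the bootstrap replicates estimates the standard error
    return stats.std(ddof=1)

estimators = {
    "mean": np.mean,
    "5% trimmed": lambda s: trimmed_mean(s, 0.05),
    "10% trimmed": lambda s: trimmed_mean(s, 0.10),
    "20% trimmed": lambda s: trimmed_mean(s, 0.20),
}

# Made-up illustrative data, not the datasets from the question
sample = np.random.default_rng(0).lognormal(mean=0.0, sigma=1.0, size=200)

se = {name: bootstrap_se(sample, f, rng=1) for name, f in estimators.items()}
for name, value in se.items():
    print(f"{name:12s} {value:.4f}")
```

Note that for a skewed distribution such as the log-normal, the mean and the trimmed means estimate different location parameters, so comparing their standard errors compares precision, not closeness to a common target.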

What surprised me is that, according to the bootstrap, the mean often has a lower standard error than the 5% trimmed mean. Across a large number of datasets, the ordering from lowest standard error to highest is:
20% trimmed, 10% trimmed, mean, 5% trimmed.

Is this result atypical, or is it something that is commonly seen? (Note that I am a beginner with respect to robust statistics and the bootstrap, so it is possible I'm making a fundamental conceptual mistake.) Thanks for any hints.


Follow-up results: I reran the exercise with much more data. In total, I applied the bootstrap to around 4000 datasets. The results were as follows:

technique      number of times lowest std error
mean           1867
5% trimmed     263
10% trimmed    430
20% trimmed    787
median         663

In this new data, when the mean has the lowest standard error it is only better by a small amount, whereas when it does badly it does very badly. So when I look at the average standard error across all datasets for the different techniques, the results are more in line with what would be expected:

technique      avg std error
mean           4.51
5% trimmed     4.33
10% trimmed    4.05
20% trimmed    3.78
median         4.36
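The two summaries above (how often each technique wins versus its average standard error) can disagree, and a small sketch shows why. The numbers below are hypothetical toy values chosen only to illustrate the mechanism; they are not taken from the 4000 datasets.

```python
import numpy as np

techniques = ["mean", "20% trimmed"]

# Hypothetical standard errors: one row per dataset, one column per technique.
# The mean is slightly better on three datasets and much worse on one.
se = np.array([
    [1.00, 1.05],
    [2.00, 2.10],
    [1.50, 1.55],
    [9.00, 3.00],
])

# Count how often each technique has the lowest standard error
winners = se.argmin(axis=1)
wins = np.bincount(winners, minlength=len(techniques))

# Average standard error per technique across datasets
avg = se.mean(axis=0)

for name, w, a in zip(techniques, wins, avg):
    print(f"{name:12s} wins={w}  avg std error={a:.3f}")
```

In this toy case the mean wins most often yet has the larger average standard error, because a single badly behaved dataset dominates the average.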

Best Answer

If the underlying population is normally distributed without contamination, then the sample mean is the best unbiased estimator of the centre of the population distribution, in the sense of having the lowest mean square error (equivalently, the lowest variance) among unbiased estimators.

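This is not always the case for other distributions, including contaminated ones: with heavy tails or outliers, a trimmed mean or the median can have a smaller standard error than the mean. So your observation depends on the particular distribution and the contamination in each dataset.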
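A small simulation can illustrate this dependence on the distribution. It is a sketch under assumptions, not part of the original analysis: the sample size, the number of replications, and the contamination model (10% of points drawn from a normal with ten times the standard deviation) are arbitrary illustrative choices.

```python
import numpy as np

def trimmed_mean_rows(samples, proportion):
    """Row-wise trimmed mean, cutting `proportion` of points from each tail."""
    s = np.sort(samples, axis=1)
    n = s.shape[1]
    k = int(np.floor(proportion * n))
    return s[:, k:n - k].mean(axis=1)

def sampling_sd(samples):
    """Standard deviation of each estimator across simulated samples (rows)."""
    return {
        "mean": samples.mean(axis=1).std(ddof=1),
        "20% trimmed": trimmed_mean_rows(samples, 0.20).std(ddof=1),
        "median": np.median(samples, axis=1).std(ddof=1),
    }

rng = np.random.default_rng(42)
reps, n = 5000, 100  # illustrative choices

# Clean case: standard normal data
clean = rng.normal(0.0, 1.0, size=(reps, n))

# Contaminated case (assumed model): about 10% of points come from N(0, 10^2)
is_outlier = rng.random((reps, n)) < 0.10
contaminated = np.where(is_outlier, rng.normal(0.0, 10.0, size=(reps, n)), clean)

clean_sd = sampling_sd(clean)
contaminated_sd = sampling_sd(contaminated)

for label, result in [("normal", clean_sd), ("contaminated", contaminated_sd)]:
    print(label, {name: round(float(v), 4) for name, v in result.items()})
```

With clean normal data the mean should show the smallest spread, while under this contamination model the trimmed mean and the median should be noticeably more stable than the mean.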
