Confidence Interval – Best Way to Find Confidence Interval for a Mean of a Symmetric Leptokurtic Distribution

confidence interval, kurtosis

I have a very symmetric distribution with a kurtosis of 10 and a sample size of more than 100.
Here is the histogram: https://ibb.co/ws7vBjd

This histogram was obtained by asking participants in the control group to rate the difference in sharpness between two identical X-ray images, using a continuous scale from -50 to 50 (including fractional values). The great majority of participants didn't see any difference and gave a score of 0 (this is the largest, center bin), but a few reported seeing a small difference (these are the smaller bins).

The same question was asked of the test group, where the two images did differ in sharpness, and the distribution obtained from the test group was normal.

  1. What is the simplest way (least complicated) to find a confidence interval for the mean of such a leptokurtic distribution?

Best Answer

@Georg (+1) has given one possible answer, but you may not consider it the 'simplest' possibility.

Consider the fictitious data below: 1000 observations from a symmetric, leptokurtic distribution.

summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  42.33   49.30   49.99   50.04   50.77   57.67 
hist(x, prob=T, col="skyblue2")

Suppose we do not know the mean or distribution of the population. Because the sample seems roughly symmetric, we could assume that the population mean and median are the same. Then the confidence interval $(49.97, 50.07)$ from the one-sample Wilcoxon signed-rank procedure in R (strictly an interval for the pseudomedian, which coincides with the mean and median of a symmetric distribution) might be considered a 95% CI for $\mu.$

wilcox.test(x, conf.int=T)$conf.int
[1] 49.96674 50.07208
attr(,"conf.level")
[1] 0.95
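
To make the quantity behind this interval concrete, here is a minimal sketch of the Hodges-Lehmann estimate, the median of all pairwise (Walsh) averages, which is the point estimate that accompanies the signed-rank interval. It regenerates `x` as described in the Note at the end of this answer; the names `s`, `walsh`, `hl`, and `wt` are my own and purely illustrative. For a sample this large, `wilcox.test` uses a normal approximation, so the two numbers should agree closely but need not be identical.

```r
# Regenerate the fictitious Laplace sample described in the Note
set.seed(2022)
y <- rexp(1000)
b <- sample(c(-1, 1), 1000, replace = TRUE)
x <- y * b + 50

# Walsh averages: (x_i + x_j) / 2 for all pairs with i <= j
s <- outer(x, x, "+") / 2
walsh <- s[upper.tri(s, diag = TRUE)]

# Hodges-Lehmann estimate of the pseudomedian
hl <- median(walsh)

# Point estimate reported by the one-sample Wilcoxon procedure
wt <- wilcox.test(x, conf.int = TRUE)
c(walsh_median = hl, wilcox_estimate = unname(wt$estimate))
```

Note that `outer` builds a 1000 x 1000 matrix here, which is fine at this size but grows quadratically with the sample size.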

Also, a reasonably simple 95% nonparametric bootstrap CI $(49.95, 50.12)$ for $\mu,$ similar to the interval above though somewhat wider, can be obtained as shown below. If you know about bootstrap confidence intervals, you may consider this method sufficiently simple for your purposes.

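a.obs = mean(x)                   # observed sample mean
# 2000 resamples of size n, drawn with replacement; d = resample mean minus observed mean
d = replicate(2000, mean(sample(x, length(x), replace=TRUE)) - a.obs)
LU = quantile(d, c(.975,.025))    # upper quantile first, then lower
a.obs - LU                        # lower and upper limits of the basic bootstrap CI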
   97.5%     2.5% 
49.95388 50.12392 
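
The subtraction in the last step follows from the usual "basic" bootstrap argument, sketched here in my own notation (not the original's): $\bar X$ is the observed mean (`a.obs`), $\bar X^*$ is the mean of a resample, and $L$ and $U$ are the 0.025 and 0.975 quantiles of the resampled deviations $\bar X^* - \bar X$ (the vector `d`). Assuming the deviations $\bar X^* - \bar X$ behave approximately like $\bar X - \mu,$

$$0.95 \approx P(L \le \bar X^* - \bar X \le U) \approx P(L \le \bar X - \mu \le U) = P(\bar X - U \le \mu \le \bar X - L).$$

So the interval is $(\bar X - U,\; \bar X - L),$ which is why the value printed under the label `97.5%` is the lower limit and the value under `2.5%` is the upper limit.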

Note: The fictitious sample x used in the illustrations above was sampled from a Laplace (double exponential) distribution with mean $\mu = 50$ and scale 1, using R:

set.seed(2022)
y = rexp(1000);  b = sample(c(-1,1), 1000, rep=T)
x = y*b + 50
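
As a rough check that this sample is leptokurtic, the sketch below computes a moment-based sample kurtosis in base R. The helper `sample_kurtosis` is a hypothetical name of mine, not a built-in function. The Laplace distribution has population kurtosis 6 (excess kurtosis 3), so these fictitious data are heavy-tailed, though less extremely than the kurtosis of 10 mentioned in the question.

```r
# Regenerate the fictitious Laplace sample
set.seed(2022)
y <- rexp(1000)
b <- sample(c(-1, 1), 1000, replace = TRUE)
x <- y * b + 50

# Moment-based sample kurtosis (not excess kurtosis): m4 / m2^2
sample_kurtosis <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2
}

sample_kurtosis(x)             # Laplace population value is 6
sample_kurtosis(rnorm(1000))   # normal population value is 3, for comparison
```

Sample kurtosis is quite variable for heavy-tailed data, so the value for `x` may differ noticeably from 6 even with 1000 observations.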