I would say that since sample mean is $\overline{X}=\frac{\sum{X_i}}{n}$, its convergence rate is 1/n, which is also the convergence rate of sample variance. But the convergence rate of sample standard deviation $ 1/\sqrt{n} $.
I am somewhat pessimistic about a such non-parametric method, at least without the introduction of some sort of constraints on the underlying distribution.
My reasoning for this is that there will always be a distribution that breaks the true coverage probability for any finite $n$ (although as $n \rightarrow \infty$, this distribution will become more and more pathological), or the confidence interval will have to be arbitrarily large.
To illustrate, you could imagine a distribution that looks like a normal up to some value $\alpha$, but after $\alpha$ becomes extremely right skewed. This can have unbounded influence on the distribution's mean and as you push $\alpha$ out as far as possible, this can have arbitrarily small probability of making it into your sample. So you can imagine that for any $n$, you could pick an $\alpha$ to be so large that all points in your sample have extremely high probability of looking like it comes from a normal distribution with mean = 0, sd = 1, but you can also have any true mean.
So if you're looking for proper asymptotic coverage, of course this can be achieved by the CLT. However, your question implies that you are (quite reasonably) interested in the finite coverage. As my example shows, there will always be a pathological case that ruins any finite length CI.
Now, you still could have a non-parametric CI that achieves good finite coverage by adding constraints to your distribution. For example, the log-concave constraint is a non-parametric constraint. However, it seems inadequate for your problem, as log-normal is not log-concave.
Perhaps to help illustrate how difficult your problem could be, I've done unpublished work on a different constraint: inverse convex (if you click on my profile, I have a link to a personal page that has a preprint). This constraint includes most, but not all log-normals. You can also see that for this constraint, the tails can be "arbitrarily heavy", i.e. for any inverse convex distribution up to some $\alpha$, you can have heavy enough tails that the mean will be as large as you like.
Best Answer
First of all, I would check whether the mean is an appropriate index for the task at hand. If you are looking for "a typical/ or central value" of a skewed distribution, the mean might point you to a rather non-representative value. Consider the log-normal distribution:
The mean (red line) is rather far away from the bulk of the data. 20% trimmed mean (green) and median (blue) are closer to the "typical" value.
The results depend on the type of your "non-normal" distribution (a histogram of your actual data would be helpful). If it is not skewed, but has heavy tails, your CIs will be very wide.
In any case, I think that bootstrapping indeed is a good approach, as it also can give you asymmetrical CIs. The
R
packagesimpleboot
is a good start:... gives you following result: