According to your updated question, @onestop's point still stands: it's not OK to call them standard errors. Furthermore, the method seems strange and non-standard. What was actually done in your case is to divide the population in two (values above and below the mean) and calculate the standard error of THAT sub-population, not of your real population, so I personally find it strange to set the lengths of the error bars that way. Apparently the idea was taken from elsewhere. However, IMHO, the idea of dividing the sample and calculating an "upper and lower" standard deviation doesn't make much sense (or at least it bothers me).
In physics (my area and apparently yours), however, it has been fairly standard to show 68% confidence intervals for the sample median or mean (depending on your choice of location statistic; let's call this statistic $\bar{X}$ for the moment) in the following way for non-symmetric distributions (apparently emulating what would be a central credible interval). With your data points, you calculate $\bar{X}$ and then report an upper error bar of length $L_u$, where $L_u$ is chosen to satisfy $P(\bar{X}<\mu<\bar{X}+L_u)= 0.34$ and $\mu$ is the real (unknown) parameter. Then, for your lower error bar of length $L_l$, you repeat the same procedure below the location statistic $\bar{X}$, i.e., $P(\bar{X}-L_l<\mu<\bar{X})= 0.34$. Of course, because the distribution of $\bar{X}$ is usually not known, this is usually done with non-parametric methods (such as the bootstrap or some variant of it).
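As a rough illustration of the bootstrap route, here is a minimal sketch using the percentile bootstrap, which is only one of several variants. It takes the 16th and 84th percentiles of the bootstrap distribution, so the two bars together cover roughly 68%; the split is only approximately 34% on each side. The function names, the default of 2000 resamples, and the example data are my own assumptions.

```python
import random
import statistics


def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list (0 <= p <= 1)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def bootstrap_error_bars(data, statistic=statistics.median, n_boot=2000, seed=0):
    """Return (estimate, lower_length, upper_length) from a percentile bootstrap.

    The bar lengths are measured from the estimate down to the 16th
    percentile and up to the 84th percentile of the bootstrap distribution.
    """
    rng = random.Random(seed)
    n = len(data)
    estimate = statistic(data)
    # Resample with replacement and recompute the statistic each time.
    boot = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_boot))
    lower = quantile(boot, 0.16)
    upper = quantile(boot, 0.84)
    return estimate, estimate - lower, upper - estimate


# Made-up, right-skewed example data (purely illustrative).
example = [0.3, 0.5, 0.6, 0.8, 0.9, 1.1, 1.4, 1.9, 2.7, 4.2, 6.5, 9.8]
est, l_low, l_up = bootstrap_error_bars(example)
print(f"median = {est:.2f}, lower bar = {l_low:.2f}, upper bar = {l_up:.2f}")
```

For skewed data, the two lengths will generally differ, which is the point of reporting asymmetric bars.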
As was also pointed out by @onestop, you can also obtain Bayesian credible intervals, where you actually calculate the probability (density, in the continuous case) of your parameter given your data. Let's call this probability $p(x|D)$. The length of the lower error bar is now calculated in a more "natural" way (at least for me), so as to satisfy $P(\hat{x}-L_l<x<\hat{x}|D)=0.34$, and the length of the upper error bar is calculated so as to satisfy $P(\hat{x}<x<\hat{x}+L_u|D)=0.34$, where $\hat{x}$ is your point estimate of the parameter (usually the median or even the mode).
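If the posterior is available as a set of samples (for example from MCMC), the two lengths can be read off the empirical distribution. The following is a minimal sketch under that assumption; the function name and the Gamma-distributed stand-in "posterior" are hypothetical and only there to make the example runnable.

```python
import random
import statistics


def credible_error_bars(samples, point, mass=0.34):
    """Return (L_l, L_u) so that about `mass` of the posterior samples lie
    between point - L_l and point, and about `mass` between point and
    point + L_u.

    If less than `mass` is available on one side of `point` (for instance
    when the mode sits near a boundary), that bar is clamped to the most
    extreme sample and covers less than `mass`.
    """
    s = sorted(samples)
    n = len(s)
    # Empirical posterior probability of being at or below the point estimate.
    p0 = sum(1 for v in s if v <= point) / n
    p_lo = max(p0 - mass, 0.0)
    p_hi = min(p0 + mass, 1.0)
    lower = s[min(int(p_lo * n), n - 1)]
    upper = s[min(int(p_hi * n), n - 1)]
    return point - lower, upper - point


# Stand-in posterior samples: a skewed Gamma(3, 1), purely illustrative.
rng = random.Random(0)
posterior = [rng.gammavariate(3.0, 1.0) for _ in range(5000)]
x_hat = statistics.median(posterior)
L_l, L_u = credible_error_bars(posterior, x_hat)
print(f"x_hat = {x_hat:.2f}, L_l = {L_l:.2f}, L_u = {L_u:.2f}")
```

With the median as the point estimate this reduces to the 16th and 84th percentiles of the posterior; with the mode, the two bars start from a different cumulative probability.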
All of the above, of course, makes sense only if the distribution of your parameter is unimodal.
You are correct in your assumption that error bars representing the standard error of the mean are inappropriate for within-subject designs. However, the question of overlapping error bars and significance is a separate topic, to which I will come back at the end of this commented reference list.
There is rich literature from Psychology on within-subject confidence intervals or error bars which do exactly what you want. The reference work is clearly:
Loftus, G. R., & Masson, M. E. J. (1994). Using confidence intervals in within-subject designs. Psychonomic Bulletin & Review, 1(4), 476–490. doi:10.3758/BF03210951
However, a limitation of their approach is that it uses the same error term for all levels of a within-subject factor. This does not seem to be a huge problem in your case (2 levels), but there are more modern approaches that solve it. Most notably:
Franz, V., & Loftus, G. (2012). Standard errors and confidence intervals in within-subjects designs: Generalizing Loftus and Masson (1994) and avoiding the biases of alternative accounts. Psychonomic Bulletin & Review, 1–10. doi:10.3758/s13423-012-0230-1
Baguley, T. (2011). Calculating and graphing within-subject confidence intervals for ANOVA. Behavior Research Methods. doi:10.3758/s13428-011-0123-7
Further references can be found in the latter two papers (which I think are both worth a read).
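To make the "same error term" remark concrete, the Loftus and Masson (1994) interval for a one-factor within-subject design can be written as follows. The notation is my own: $n$ subjects, $J$ conditions, $M_j$ the mean of condition $j$, and $MS_{S \times C}$ the subject-by-condition interaction mean square from the repeated-measures ANOVA.

$$
CI_j = M_j \pm t_{1-\alpha/2,\;(n-1)(J-1)} \sqrt{\frac{MS_{S \times C}}{n}}
$$

Because $MS_{S \times C}$ is pooled over all conditions, every level receives the same half-width, which is the property the later papers relax.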
How do researchers interpret CIs? Badly, according to the following paper:
Belia, S., Fidler, F., Williams, J., & Cumming, G. (2005). Researchers Misunderstand Confidence Intervals and Standard Error Bars. Psychological Methods, 10(4), 389–396. doi:10.1037/1082-989X.10.4.389
How should we interpret overlapping and non-overlapping CIs?
Cumming, G., & Finch, S. (2005). Inference by Eye: Confidence Intervals and How to Read Pictures of Data. American Psychologist, 60(2), 170–180. doi:10.1037/0003-066X.60.2.170
One final thought (although this is not relevant to your case): if you have a split-plot design (i.e., within- and between-subject factors) in one plot, you can forget about error bars altogether. I would (humbly) recommend my raw.means.plot function in the R package plotrix.
Best Answer
I assume the quantity you are looking at should never be less than zero. If so, this indicates that, with the limited amount of data you have, your model is a bit problematic and predicts that values could be below zero. Potential ways of dealing with this include transforming the data for analysis. One obvious option is a log-transformation (especially if you never observe an actual zero value). If you then back-transform the numbers in each group after analysis, you get a geometric mean (instead of the arithmetic mean) with CIs/error bars that cannot extend below zero.
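The log-transform and back-transform steps can be sketched as below. This is a minimal illustration, not a full analysis: the function name is hypothetical, and it uses a normal quantile for the interval, which is a large-sample approximation. With only a handful of observations per group, a t quantile with $n-1$ degrees of freedom would give a wider and more appropriate interval.

```python
import math
import statistics


def geometric_mean_ci(values, level=0.95):
    """Return (geometric_mean, ci_low, ci_high) for strictly positive data.

    The interval is computed on the log scale and then exponentiated, so
    both limits are guaranteed to be positive.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    if any(v <= 0 for v in values):
        raise ValueError("log-transformation requires strictly positive values")
    logs = [math.log(v) for v in values]
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(len(logs))
    # Normal approximation (assumption); swap in a t quantile for small n.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    low = math.exp(mean_log - z * se_log)
    high = math.exp(mean_log + z * se_log)
    return math.exp(mean_log), low, high


# Made-up positive measurements for one group (purely illustrative).
group = [0.4, 0.7, 1.1, 1.8, 3.5, 6.0]
gm, low, high = geometric_mean_ci(group)
print(f"geometric mean = {gm:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Note that the back-transformed interval is asymmetric around the geometric mean, so the plotted error bars will have different lengths above and below the point.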