Interpreting Log-transformed Data – Normal Distribution and T-test Insights

data transformation, lognormal distribution, normal distribution, t-test

I have data that is not normally distributed. I can log-transform it to be normally distributed, and then perform, for example, a t-test.

But how do I interpret the results of the t-test?

Do I have to back-transform the p-value before I can say that the difference between the two means is significant? Is there anything else I need to do? Or can I just take the results and interpret them as if the test had been performed on the original data? I'm confused about this.

The data come from a device that measures force in newtons and are ratio-scaled.

Best Answer

If the logs of the data are really drawn from normally distributed populations with constant variance (but possibly different means), then the original data must have come from lognormal distributions that share a shape but possibly differ in scale. A difference in $\mu$ on the log scale is additive there, and adding a constant on the log scale multiplies by a constant on the original scale. Hence those populations will also differ in mean.

So a difference in population mean on the log scale will (if the other assumptions of the usual two-sample equal-variance t-test apply) imply a change in mean on the original (untransformed) scale.
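To make that step explicit, here is a short derivation. It relies on the constant-variance assumption stated above; the symbols $X_1$, $X_2$ (the original-scale variables) and $\sigma^2$ (the common log-scale variance) are introduced here only for illustration. For a lognormal variable with log-scale parameters $\mu_i$ and $\sigma^2$,

$$E[X_i] = \exp\!\left(\mu_i + \tfrac{\sigma^2}{2}\right), \qquad \frac{E[X_1]}{E[X_2]} = \frac{\exp\!\left(\mu_1 + \tfrac{\sigma^2}{2}\right)}{\exp\!\left(\mu_2 + \tfrac{\sigma^2}{2}\right)} = \exp(\mu_1 - \mu_2).$$

The $\sigma^2/2$ terms cancel only because the variance is common to both groups. If the log-scale variances differed, $\exp(\mu_1-\mu_2)$ would still be the ratio of medians (geometric means), but no longer the ratio of means.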


You can convert the estimated difference on the log scale into a ratio (equivalently, a percentage change) on the original scale, and the endpoints of a confidence interval carry back in the same way.

So imagine you have an estimated difference in log-scale means of about 0.7 ($\hat{\delta}=\hat{\mu}_1-\hat{\mu}_2=0.7$) and (say) a 90% confidence interval for that difference of $(0.62, 0.78)$. Then, assuming natural logs were used, the ratio of means of the original populations is estimated to be $\exp(0.7)\approx 2.014$ (i.e. the first group has a mean about twice the size of the second group), and the 90% CI for the ratio of means would be $(e^{0.62}, e^{0.78})\approx (1.86,2.18)$.
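The back-transformation in this example can be sketched in a few lines of Python. This is only an illustration: the helper name `back_transform` is made up here, and it assumes the data were transformed with the natural log (for base-10 logs you would use `10 ** x` instead of `math.exp`).

```python
import math

def back_transform(log_diff, log_ci):
    """Turn a difference in natural-log means into a ratio on the original scale.

    Hypothetical helper for illustration; assumes natural logs were used.
    """
    lo, hi = log_ci
    ratio = math.exp(log_diff)                   # estimated ratio of means
    ratio_ci = (math.exp(lo), math.exp(hi))      # CI endpoints map across directly
    percent_change = (ratio - 1.0) * 100.0       # same estimate as a percentage
    return ratio, ratio_ci, percent_change

# Numbers from the example above: difference 0.7, 90% CI (0.62, 0.78).
ratio, ratio_ci, pct = back_transform(0.7, (0.62, 0.78))
print(f"ratio of means:      {ratio:.3f}")
print(f"90% CI for ratio:    ({ratio_ci[0]:.2f}, {ratio_ci[1]:.2f})")
print(f"percentage increase: {pct:.0f}%")
```

Because `exp` is monotone increasing, the lower and upper endpoints stay in order, so the interval for the ratio has the same 90% coverage as the interval for the log-scale difference.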

Note that p-values carry over directly, with no back-transformation needed; we're still performing inference on the parameter $\mu$ (the mean on the log scale).
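Putting the pieces together, a rough end-to-end sketch might look like the following. Everything specific in it is an assumption for illustration: the force readings are invented, `pooled_log_t` is a hypothetical helper, and the critical value 1.812 is the usual t-table entry for a two-sided 90% interval with 10 degrees of freedom. It uses only the standard library, so it stops at the t statistic; the p-value would come from a t-table or a statistics package, and that p-value is reported as is.

```python
import math
import statistics

def pooled_log_t(x, y):
    """Pooled (equal-variance) two-sample t statistic computed on natural logs.

    Hypothetical helper; returns (difference in log means, standard error,
    t statistic, degrees of freedom).
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n1, n2 = len(lx), len(ly)
    diff = statistics.fmean(lx) - statistics.fmean(ly)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * statistics.variance(lx)
           + (n2 - 1) * statistics.variance(ly)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return diff, se, diff / se, n1 + n2 - 2

# Invented force readings in newtons, for illustration only.
group_a = [12.1, 15.3, 9.8, 20.4, 13.7, 17.9]
group_b = [6.2, 7.9, 5.1, 9.6, 6.8, 8.4]

diff, se, t_stat, df = pooled_log_t(group_a, group_b)

# Two-sided 90% critical value for df = 10, taken from a standard t-table.
T_CRIT_90_DF10 = 1.812

log_ci = (diff - T_CRIT_90_DF10 * se, diff + T_CRIT_90_DF10 * se)
ratio = math.exp(diff)                                 # back-transformed estimate
ratio_ci = (math.exp(log_ci[0]), math.exp(log_ci[1]))  # back-transformed interval

print(f"t = {t_stat:.2f} on {df} degrees of freedom")
print(f"estimated ratio A/B: {ratio:.2f}")
print(f"90% CI for ratio: ({ratio_ci[0]:.2f}, {ratio_ci[1]:.2f})")
```

Only the estimate and the interval endpoints are exponentiated. The t statistic, its degrees of freedom, and the resulting p-value are left exactly as computed on the log scale.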
