It is not valid in general, because the derivation of the distribution of the t-statistic depends on the assumption of normality. In addition, the interpretation of testing the hypothesis of equality of means is not clear in the half-normal case, because the mean depends on both parameters mu and sigma. Therefore, if the variances of the populations differ (even slightly), you will reject the hypothesis more often than you should, even when mu1 = mu2 (that is, you do not obtain the desired Type I error level). Take a look at this code so you can verify it:
rm(list=ls())
# Simulate the rejection rate of the paired t-test at the 5% level when both
# samples are shifted half-normal: x = mu + |N(0, sigma)|.
# n: number of simulated data sets; ns: sample size per group.
power <- function(n, ns, mu1, mu2, s1, s2) {
  count <- rep(0, n)
  for (i in 1:n) {
    x <- mu1 + abs(rnorm(ns, 0, s1))
    y <- mu2 + abs(rnorm(ns, 0, s2))
    if (t.test(x, y, paired = TRUE)$p.value > 0.05) count[i] <- 1
  }
  return(1 - mean(count))  # proportion of rejections
}
power(10000, 30, 1, 1, 1, 1.15)   # equal mu but unequal sigma: rejection rate exceeds 0.05
power(10000, 100, 1, 1, 1, 1.15)  # and it grows with the sample size
power(10000, 30, 1, 1.1, 1, 1)    # power when the location parameters really differ
Perhaps it would be helpful to plot the profile likelihood of the parameters of each population and compare them. You could also use a likelihood ratio test instead; a rough sketch is given below.
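For instance, assuming each sample follows a shifted half-normal distribution x = mu + |N(0, sigma)|, a likelihood ratio test of mu1 = mu2 could be sketched roughly as follows. The function names negloglik and lrt_halfnormal are only illustrative, and the chi-square calibration is only approximate because the support of the distribution depends on mu; a parametric bootstrap of the statistic would be safer.

# Rough sketch: negative log-likelihood of a shifted half-normal sample
negloglik <- function(par, x) {
  mu <- par[1]; sigma <- exp(par[2])       # log-parameterise sigma so it stays positive
  if (any(x < mu)) return(1e10)            # density is zero below mu; large penalty for optim
  -sum(log(2) - 0.5 * log(2 * pi) - log(sigma) - (x - mu)^2 / (2 * sigma^2))
}

lrt_halfnormal <- function(x, y) {
  # Unrestricted fit: separate (mu, sigma) for each sample
  f1 <- optim(c(min(x) - 0.1, log(sd(x))), negloglik, x = x)
  f2 <- optim(c(min(y) - 0.1, log(sd(y))), negloglik, x = y)
  # Restricted fit: common mu, separate sigmas
  f0 <- optim(c(min(c(x, y)) - 0.1, log(sd(x)), log(sd(y))),
              function(p) negloglik(p[c(1, 2)], x) + negloglik(p[c(1, 3)], y))
  stat <- 2 * (f0$value - (f1$value + f2$value))
  pchisq(stat, df = 1, lower.tail = FALSE)  # approximate p-value
}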
I hope this helps.
Best wishes.
You may want to take a look at this question: Is normality testing 'essentially useless'? The answers discuss the Shapiro-Wilk test; the accepted answer in particular includes a simulation-based demonstration.
Your problem may be different from most, though, if you are not concerned with the distribution for the sake of meeting the assumptions of another planned analysis. Fitting a normal distribution to your data may only prompt you to ignore its peculiarities if they are small enough. If there is no other analysis you need to perform that assumes normality, then rather than trying to fit a known distribution to your data, you might consider describing your distribution in terms of its skewness and kurtosis, and adding confidence intervals if you like (but consider the relevant precautions in doing so), as sketched below.
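A minimal sketch of that kind of description, assuming your sample is stored in a vector x and using a simple percentile bootstrap for the intervals (the helper names skewness, kurtosis and boot_ci are illustrative):

# Describe a hypothetical sample 'x' by its skewness and excess kurtosis
skewness <- function(x) { m <- mean(x); mean((x - m)^3) / mean((x - m)^2)^1.5 }
kurtosis <- function(x) { m <- mean(x); mean((x - m)^4) / mean((x - m)^2)^2 - 3 }

# Percentile bootstrap confidence interval for an arbitrary statistic
boot_ci <- function(x, stat, B = 2000, level = 0.95) {
  est <- replicate(B, stat(sample(x, replace = TRUE)))
  quantile(est, c((1 - level) / 2, 1 - (1 - level) / 2))
}

skewness(x); boot_ci(x, skewness)
kurtosis(x); boot_ci(x, kurtosis)

Bootstrap intervals for higher moments can be unreliable in small samples, which is one of the precautions alluded to above.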
Best Answer
It is common to try to apply some kind of transformation to normality (using e.g. logarithms, square roots, ...) when faced with data that isn't normal. While the logarithm yields good results for skewed data reasonably often, there is no guarantee that it will work in this particular case. One should also bear @whuber's comment above in mind when analysing transformed data: "A t-test for the logarithms is neither the same as a t-test for the untransformed data nor a nonparametric test. The t-test on the logs compares geometric means, not the (usual) arithmetic means."
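To make this concrete, here is a small sketch (with hypothetical positive-valued samples x and y) of what the t-test on the logs actually compares:

t.test(log(x), log(y))                  # tests equality of the means of the logs
exp(mean(log(x))); exp(mean(log(y)))    # i.e. a comparison of geometric means
mean(x); mean(y)                        # the arithmetic means are not what is being compared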
Transformations to normality should always be followed by an investigation of the normality assumption, to assess whether the transformed data looks "normal enough". This can be done using, for instance, histograms, QQ-plots and tests for normality. The t-test is particularly sensitive to deviations from normality in the form of skewness, so a test for normality that is directed towards skewed alternatives would be preferable. Pearson's sample skewness $\frac{n^{-1}\sum_{i=1}^n(x_i-\bar{x})^3}{(n^{-1}\sum_{i=1}^n(x_i-\bar{x})^2)^{3/2}}$ is a suitable test statistic in this case.
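As a small illustration, assuming the transformed data are stored in a vector z, the checks above might look like this:

hist(z)                     # overall shape
qqnorm(z); qqline(z)        # points close to the line are consistent with normality
shapiro.test(z)             # an omnibus test for normality

# Pearson's sample skewness, matching the formula above
m <- mean(z)
mean((z - m)^3) / mean((z - m)^2)^1.5   # values far from 0 indicate skewness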
Rather than choosing a transformation (such as logarithms) because it works most of the time, I prefer to use the Box-Cox procedure, which chooses a transformation based on the data at hand. There are, however, some philosophical issues with this; in particular, whether it should affect the number of degrees of freedom in the t-test, since we've used some information from the sample when choosing which transform to use.
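As a sketch (assuming a positive-valued sample x), the Box-Cox profile likelihood can be obtained with boxcox from the MASS package, and the chosen lambda then applied by hand:

library(MASS)

bc <- boxcox(x ~ 1, lambda = seq(-2, 2, 0.05))    # profile log-likelihood over lambda
lambda_hat <- bc$x[which.max(bc$y)]               # lambda with the highest likelihood
x_bc <- if (abs(lambda_hat) < 1e-8) log(x) else (x^lambda_hat - 1) / lambda_hat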
Finally, a good alternative to using either the t-test after a transformation or a classical nonparametric test is to use the bootstrap analogue of the t-test. It does not require the assumption of normality and is a test about the untransformed means (and not about anything else).
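A rough sketch of such a bootstrap test, resampling each group after centring it at the combined mean so that the null hypothesis of equal means holds in the resampling world (x and y are hypothetical samples, and boot_t_test is an illustrative name):

boot_t_test <- function(x, y, B = 9999) {
  obs <- t.test(x, y)$statistic                 # observed Welch t-statistic
  xc <- x - mean(x) + mean(c(x, y))             # centre both groups at the combined mean
  yc <- y - mean(y) + mean(c(x, y))
  stat <- replicate(B, t.test(sample(xc, replace = TRUE),
                              sample(yc, replace = TRUE))$statistic)
  mean(abs(stat) >= abs(obs))                   # two-sided bootstrap p-value
}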