Solved – re-reflecting variable after transformation

back-transformation, interpretation

I have a question about the process of re-reflecting transformed data. I see many questions and answers around this, but I am not sure any of them answers my specific question. Many of my variables were negatively skewed, so I reflected those variables before transforming them (either square root or log 10). I am looking to re-reflect the variables for interpretation purposes as I go into my primary analyses. From what I have researched, my understanding is that I could subtract the maximum value for a given variable, plus 1, and it would "re-reflect" the variable so that it can be interpreted on the original scale (e.g. high scores mean high scores).

However, when I go back and check that the skewness has not changed, I find that the M and SD have changed. How should I understand this? Am I doing something wrong?

Best Answer

Despite a hint of numerous threads (although zero references; please give citations) this sounds like a bizarre procedure that may be popular only in certain sub-fields.

I can readily imagine that it's easier (e.g.) to work with $1 - x$ rather than $x$ if $x$ is a left-skewed fraction, but that doesn't seem to be the issue here.

If I understand this correctly, your transformations are of the form (assuming $x \ge 0$)

$\sqrt{x_{\text{max}} - x + 1}$

$\log_{10} ({x_{\text{max}} - x + 1}).$
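To make the two forms concrete, here is a minimal Python sketch of them and their inverses. The function names and the `scores` data are made up for illustration, and `x_max` is taken to be the sample maximum, which is an assumption about what was done, not something stated in the question.

```python
import math

def reflect_sqrt(x, x_max):
    """Reflected square root: sqrt(x_max - x + 1)."""
    return math.sqrt(x_max - x + 1)

def reflect_sqrt_inverse(y, x_max):
    """Undo reflect_sqrt: x = x_max + 1 - y**2."""
    return x_max + 1 - y ** 2

def reflect_log10(x, x_max):
    """Reflected logarithm: log10(x_max - x + 1)."""
    return math.log10(x_max - x + 1)

def reflect_log10_inverse(y, x_max):
    """Undo reflect_log10: x = x_max + 1 - 10**y."""
    return x_max + 1 - 10 ** y

# Hypothetical left-skewed scores (illustrative only).
scores = [2, 7, 8, 9, 9, 10, 10, 10]
x_max = max(scores)  # assumption: empirical maximum of this sample

roots = [reflect_sqrt(x, x_max) for x in scores]
logs = [reflect_log10(x, x_max) for x in scores]

# Applying the inverses recovers the original scores (up to rounding error).
recovered_from_roots = [reflect_sqrt_inverse(y, x_max) for y in roots]
recovered_from_logs = [reflect_log10_inverse(y, x_max) for y in logs]
```

Note that the ordering is reversed on the transformed scale: the highest raw score gets the lowest transformed value.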

It can be seen that those are reversible transformations, but they seem to me a bad idea. The words ad hoc apply everywhere.

If $x_{\text{max}}$ is empirical, you have no chance of comparability with other studies in the same branch of science. Either other people's $x_{\text{max}}$ is larger, in which case they can't use the same transformation, or it's smaller, in which case it doesn't have the same rationale, or they use their $x_{\text{max}}$, which is different, which leads to loss of comparability. It's poor science to use such arbitrary procedures, regardless of the statistical issues.
If $x_{\text{max}}$ is some definite upper limit, beyond which values are impossible, then there's a partial rationale you can make clear. An example would be that you can't go above all of a total. Odds are you have no such definite upper limit for most variables.
This all seems to be driven by the idea that skewness is a problem that must be solved. Where does that come from? For example, in a regression-like model, there is no requirement that predictor (regressor, explanatory, so-called "independent") variables have any particular marginal distribution or distributional properties. It might often be a good idea to transform because nonlinear relations are involved, but that's a different story.
Adding $1$ is manifestly a fudge so you can take logarithms. For that and other reasons the loss of simple functional form makes interpretation unnecessarily difficult. Back-transforming won't make the beast easier to think about.
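The comparability point can be seen in a small Python sketch. The two "studies" below are invented data, and `reflect_log10` is just a hypothetical helper name for the reflected logarithm with an empirical maximum.

```python
import math

def reflect_log10(x, x_max):
    """Reflected logarithm: log10(x_max - x + 1)."""
    return math.log10(x_max - x + 1)

# Two hypothetical samples of the same variable that differ only in their maximum.
study_a = [3, 6, 8, 9, 10]
study_b = [3, 6, 8, 9, 12]

same_raw_score = 8
in_a = reflect_log10(same_raw_score, max(study_a))  # log10(10 - 8 + 1) = log10(3)
in_b = reflect_log10(same_raw_score, max(study_b))  # log10(12 - 8 + 1) = log10(5)

print(in_a, in_b)
```

The same raw score of 8 lands on different transformed values in the two samples, purely because one sample happened to contain a larger maximum.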

If this is done to response (outcome, dependent) variables, please tell us why you think you are doing it. Odds are there's a better way.

P.S. Your concerns about the mean (assuming that's what M indicates in your question) and SD are not clear, but in general, for any nonlinear transformation, the mean of the transformed values will not equal the transform of the mean, and similarly for the SD. That's what nonlinearity implies, and the square root and logarithm of $x$, or the variants you have used, don't reduce to any $a + bx$.
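A small numerical check may help here. The sketch below uses made-up scores and assumes the "re-reflection" is the linear step $c - t$ with $c = \max(t) + 1$ applied to the transformed values $t$; that reading of the question is an assumption. The `skewness` helper is a plain moment-based estimate written for this example, not a library function.

```python
import math
import statistics as st

def skewness(values):
    """Moment-based skewness (population form, no bias correction)."""
    m = st.fmean(values)
    s = st.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)

# Hypothetical left-skewed scores (illustrative only).
scores = [2, 7, 8, 9, 9, 10, 10, 10]
x_max = max(scores)

# Reflect, then take square roots.
t = [math.sqrt(x_max - x + 1) for x in scores]

# Nonlinear step: the mean of the transformed values differs from
# the transform of the mean.
mean_of_transformed = st.fmean(t)
transform_of_mean = math.sqrt(x_max - st.fmean(scores) + 1)

# Assumed re-reflection: a linear map c - t on the transformed values.
c = max(t) + 1
r = [c - v for v in t]

print("mean of sqrt vs sqrt of mean:", mean_of_transformed, transform_of_mean)
print("SD before/after re-reflection:", st.pstdev(t), st.pstdev(r))
print("skewness raw, transformed, re-reflected:",
      skewness(scores), skewness(t), skewness(r))
```

Under these assumptions the linear re-reflection leaves the SD of the transformed values unchanged, shifts their mean to $c$ minus the old mean, and flips the sign of the skewness. None of those summaries matches the raw variable's mean and SD, because the square root in between is nonlinear.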

UPDATE. Although it may be some distance from what you are doing, there is at least one very useful transformation that reverses order (for positive $x$), namely the reciprocal $1/x$. For example, the reciprocal of a time is often a rate or a kind of speed, and vice versa. So time to complete a task would be indeterminate if the task was not completed, but that maps to $0$ speed. Here if one scale is more useful than the other, then you should stick with that scale and not reverse the transformation unless it's essential substantively or practically.
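As a tiny illustration of the time and speed example, the Python sketch below uses invented completion times and represents an uncompleted task as an infinite time, which is a modelling assumption made only for this example.

```python
import math

# Hypothetical completion times in seconds; math.inf marks a task never completed.
times = [12.0, 30.0, 45.0, math.inf]

# Reciprocal: time per task becomes tasks per second.
speeds = [1.0 / t for t in times]

print(speeds)
```

Longer times map to lower speeds, so the ordering is reversed, and the uncompleted task gets a well-defined speed of zero.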
