Solved – when to use log transformation for income

data transformation, regression

Is it appropriate to use logs on a discrete measure of wealth where the response options are not equally wide (i.e., they cover different wealth ranges, like $\$500,000$-$\$749,999$, which is a narrower bucket than $\$1$ million-$\$1.9$ million)?

Is log only appropriate when you have the actual value of income (not just the category it falls in)?

Best Answer

The answer depends on your analysis. If your goal is to treat the bins as an ordinal variable, then there is no point in transforming the data. However, if you wish to treat the variable as interval or ratio (perhaps you wish to use it as the dependent variable in a regression), you could replace each bin with the midpoint of its range and then log-transform. For example, an observation in the 500,000-749,999 range would become $\log((500{,}000 + 749{,}999)/2)$. In that case, the log transformation might help make the residuals closer to normal, which is an assumption behind the usual regression inference.
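The midpoint-then-log step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bracket labels and the `BRACKETS` and `log_midpoint` names are hypothetical, the bounds are taken literally from the question (including \$1.9 million as the upper bound of the second bracket), and the natural log is used, though any base would do.

```python
import math

# Hypothetical bracket labels mapped to (lower, upper) bounds in dollars.
# The bounds are read literally from the question; the labels are made up.
BRACKETS = {
    "500k-749k": (500_000, 749_999),
    "1m-1.9m": (1_000_000, 1_900_000),
}


def log_midpoint(lower, upper):
    """Return the natural log of the midpoint of a closed bracket."""
    return math.log((lower + upper) / 2)


# Replace each bracket with the log of its midpoint.
log_wealth = {
    label: log_midpoint(lower, upper)
    for label, (lower, upper) in BRACKETS.items()
}

for label, value in log_wealth.items():
    print(label, round(value, 4))
```

The resulting values could then stand in for the bracket codes as the dependent variable. Whether the midpoint is a reasonable stand-in for everyone in a bracket is itself an assumption, particularly for wide brackets.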
