Will log transformation always mitigate heteroskedasticity? Because the textbook states that log transformation often reduces the heteroskedasticity. So, I want to know in which cases it won't lessen heteroskedasticity.
Solved – Will log transformation always mitigate heteroskedasticity
data transformation, heteroscedasticity, logarithm, regression
Related Solutions
Actually, I'd say just the opposite. Multicollinearity is often scoffed at as a concern. The only time this is a real issue is when one variable can be written as an exact linear function of others in the model (a male dummy variable would be exactly equal to a constant/intercept term minus a female dummy variable; hence, you can't have all three in your model). A prime example is Goldberger's comparison to "micronumerosity."
Perfect multicollinearity means that your model cannot be estimated; (imperfect) multicollinearity often leads to large standard errors, but no bias or real problems; heteroskedasticity means that your standard errors are incorrect and your estimates are inefficient.
First, I would build the model so that it yields the parameter estimates as I want to interpret them (level change, percent change, etc.), using logs as appropriate. Then I would test for heteroskedasticity. The most widely accepted option is simply to use robust standard errors, which gives you correct standard errors while leaving the parameter estimates inefficient. Alternatively, you can use weighted least squares to get efficient estimates, but this has become less common unless you know the relationship between the variances of your observations (say, each depends upon the size of the observation, like the population of a country). Indeed, in cross-section econometrics with a data set of any real size, robust standard errors have become required irrespective of the outcome of a BP test; they are applied almost automatically.
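As a sketch of how robust standard errors differ from the classical ones, here is a small numpy simulation (the data-generating process, seed, and the HC0 variant are my own illustrative choices, not from the answer):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1, 10, n)
X = np.column_stack([np.ones(n), x])
# heteroskedastic errors: the spread grows with x
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)

# OLS fit
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# classical covariance assumes one constant error variance
sigma2 = resid @ resid / (n - X.shape[1])
se_classic = np.sqrt(np.diag(sigma2 * XtX_inv))

# HC0 "sandwich" covariance: (X'X)^-1 X' diag(e_i^2) X (X'X)^-1
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_classic, se_robust)
```

With the variance increasing in x, the classical slope standard error tends to be too small; the sandwich estimator corrects it without changing the point estimates at all. In statsmodels the same idea is a one-liner, roughly `sm.OLS(y, X).fit(cov_type="HC0")`.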
There isn't a good test for endogeneity. Your real problem is that the regressor is correlated with the error; OLS will force the regressor to be uncorrelated with the residual, so you won't find any correlation there. Endogeneity is what makes econometrics hard and is a whole topic unto itself.
There are too many questions asked. You are welcome to break it down. And many of the questions are already answered well in this forum.
I will only address your first question here.
There are more variables, and most of them are heavily skewed (mostly right, some left).
It seems you may have some misunderstandings about the linear regression assumptions. Linear regression does not assume that the independent variables / model inputs are Gaussian distributed; it assumes that the residuals are.
Details can be found
In the first link I provided, it also explains normality of residuals is not that important as you may think.
For feature selections see here
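To illustrate that point with a toy simulation (my own setup, numbers chosen for the example): a heavily right-skewed predictor is no problem for OLS as long as the residuals are well behaved.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
x = rng.lognormal(0.0, 1.0, n)               # heavily right-skewed input
y = 3.0 + 0.5 * x + rng.normal(0.0, 1.0, n)  # Gaussian residuals

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# despite the skewed predictor, OLS recovers the coefficients
print(beta)
```

No transformation of x was needed here, because nothing in the model assumes anything about the distribution of x itself.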
Best Answer
No; sometimes it will make it worse.
Heteroskedasticity where the spread is close to proportional to the conditional mean will tend to be improved by taking log(y), but if it's not increasing with the mean at close to that rate (or more), then the heteroskedasticity will often be made worse by that transformation.
Taking logs "pulls in" the more extreme values on the right (high values), while values at the far left (low values) get stretched out: spreads become smaller where the values are large but may grow where the values are already small.
If you know the approximate form of the heteroskedasticity, then you can sometimes work out a transformation that will approximately make the variance constant. This is known as a variance-stabilizing transformation; it is a standard topic in mathematical statistics. There are a number of posts on our site that relate to variance-stabilizing transformations.
If the spread is proportional to the square root of the mean (variance proportional to the mean), then a square root transformation (the variance-stabilizing transformation for that case) will tend to do much better than a log transformation; the log transformation does "too much" in that case. In the second plot, the spread decreases as the mean increases, so taking either logs or square roots would make it worse. (It turns out that the 1.5 power actually does reasonably well at stabilizing variance in that case.)
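A quick simulation of the variance-proportional-to-mean case (Poisson counts; a toy setup of my own, not the answer's plots) shows both effects: on the square-root scale the spread is roughly constant across groups, while the log "does too much" and reverses the pattern.

```python
import numpy as np

rng = np.random.default_rng(1)
means = [4.0, 16.0, 64.0, 256.0]

raw_sd, sqrt_sd, log_sd = [], [], []
for m in means:
    # Poisson: variance equals the mean, so spread grows like sqrt(mean)
    y = rng.poisson(m, 20000).astype(float)
    raw_sd.append(y.std())
    sqrt_sd.append(np.sqrt(y).std())  # sqrt is the VST for this case
    log_sd.append(np.log1p(y).std())  # log overcorrects here

print(raw_sd, sqrt_sd, log_sd)
```

The raw spreads grow with the mean, the square-root spreads hover near 0.5 for every group, and the log spreads shrink as the mean grows, which is exactly heteroskedasticity in the other direction.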