Continuous dependent variable with upper and lower bounds: logit transformation appropriate?

Tags: continuous data, data transformation, logit, residuals

I'm analyzing the relationship between a (log-transformed) continuous independent variable and a continuous dependent variable that has a lower and an upper bound. If I scale the dependent variable to values between 0 and 1 and then take the logit, the relationship becomes linear, with seemingly homogeneous variance.

Is it appropriate to then use ordinary least squares regression?
Can I judge the model fit by R-squared, or can I use a goodness-of-fit test based on deviance?
Is it meaningful to look at deviance residuals to judge individual data points?

Best Answer

I faced a similar problem with probability of loss as my dependent variable (bounded between 0% and 100%). My plan was to use the logit as a smoothing function (a transformation that makes the variable unbounded) and then use OLS to estimate the coefficients of my macroeconomic independent variables.
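To make the idea concrete, here is a minimal sketch of the scale-then-logit step followed by a one-predictor least-squares fit. It uses only the standard library; the function names, the bounds (0 and 100), and the data are all made up for illustration and are not from the question. It also assumes every observation lies strictly inside the bounds, since the logit is infinite at the endpoints.

```python
import math

def scaled_logit(y, lower, upper):
    """Map y in the open interval (lower, upper) to the real line."""
    p = (y - lower) / (upper - lower)          # rescale to (0, 1)
    if not 0.0 < p < 1.0:
        raise ValueError("y must lie strictly between the bounds")
    return math.log(p / (1.0 - p))

def inverse_scaled_logit(z, lower, upper):
    """Map a value on the logit scale back to the bounded scale."""
    p = 1.0 / (1.0 + math.exp(-z))
    return lower + (upper - lower) * p

def ols_fit(x, z):
    """Least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return z_bar - slope * x_bar, slope

# Made-up illustration data: y is bounded between 0 and 100.
raw_x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
y = [5.0, 12.0, 27.0, 50.0, 73.0, 88.0]

x = [math.log(v) for v in raw_x]               # log-transformed predictor
z = [scaled_logit(v, 0.0, 100.0) for v in y]   # logit of the scaled response
intercept, slope = ols_fit(x, z)

# Prediction at raw_x = 8, mapped back to the bounded scale.
pred = inverse_scaled_logit(intercept + slope * math.log(8.0), 0.0, 100.0)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, prediction={pred:.1f}")
```

Note that back-transformed predictions always stay inside the bounds, but they estimate something closer to the median than the mean of the original variable, because the logit is nonlinear.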

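First, make sure that the scatter plot of the transformed dependent variable against the predictor looks reasonably linear. Second, check that the errors are approximately normally distributed (otherwise the OLS estimator is suboptimal: it is still the best linear unbiased estimator, but no longer the most efficient estimator overall, and the usual t and F tests hold only approximately). Third, if the error variance is heteroscedastic, you need a weighting technique such as weighted least squares so that the estimator remains BLUE.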

You will need a different transformation (smoothing function) if the first condition does not hold. You will need a maximum likelihood estimator if the second and third conditions do not hold.
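As a rough sketch of how the second and third checks could be run on the residuals, the snippet below computes a Jarque-Bera statistic (normality) and a studentized Breusch-Pagan statistic in Koenker's form (heteroscedasticity) for a single predictor. These two tests are my choice for illustration, not something prescribed above; both are asymptotic, so with few observations a residual plot is at least as informative. The function names and the simulated data are hypothetical.

```python
import math
import random

def fit_line(x, z):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    return z_bar - slope * x_bar, slope

def r_squared(x, z):
    """R-squared of the one-predictor least-squares fit of z on x."""
    a, b = fit_line(x, z)
    z_bar = sum(z) / len(z)
    ss_res = sum((zi - (a + b * xi)) ** 2 for xi, zi in zip(x, z))
    ss_tot = sum((zi - z_bar) ** 2 for zi in z)
    return 0.0 if ss_tot == 0.0 else 1.0 - ss_res / ss_tot

def jarque_bera(resid):
    """Normality check; p-value from the chi-square(2) approximation."""
    n = len(resid)
    m = sum(resid) / n
    m2 = sum((e - m) ** 2 for e in resid) / n
    m3 = sum((e - m) ** 3 for e in resid) / n
    m4 = sum((e - m) ** 4 for e in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return stat, math.exp(-stat / 2.0)       # chi-square(2) upper tail

def koenker_bp(x, resid):
    """Heteroscedasticity check: n * R^2 from regressing e^2 on x."""
    squared = [e * e for e in resid]
    stat = max(0.0, len(x) * r_squared(x, squared))
    return stat, math.erfc(math.sqrt(stat / 2.0))   # chi-square(1) upper tail

# Simulated stand-in data (not from the post): logit-scale response
# against a log-scale predictor with constant-variance normal noise.
rng = random.Random(0)
x = [math.log(1.0 + i) for i in range(40)]
z = [-3.0 + 1.4 * xi + rng.gauss(0.0, 0.3) for xi in x]

a, b = fit_line(x, z)
resid = [zi - (a + b * xi) for xi, zi in zip(x, z)]

jb_stat, jb_p = jarque_bera(resid)     # small p-value: evidence against normality
bp_stat, bp_p = koenker_bp(x, resid)   # small p-value: evidence of heteroscedasticity
print(f"Jarque-Bera p = {jb_p:.3f}, Breusch-Pagan p = {bp_p:.3f}")
```

A small p-value from the second test is the signal to consider weighting; a small p-value from the first suggests that inference based on normal errors should be treated with caution.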

If I were you, I would use R-squared as the goodness-of-fit indicator, since the model is estimated by OLS rather than MLE. And instead of deviance residuals, you could look at Cook's D to judge individual data points.
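For a single predictor, both quantities can be computed by hand, as in this minimal sketch. The data are made up (the last point is deliberately pushed off the line), the function name is hypothetical, and the 4/n cutoff is only a common rule of thumb, not a formal test.

```python
def ols_diagnostics(x, z):
    """Return (r_squared, cooks_d) for a one-predictor OLS fit of z on x."""
    n, p = len(x), 2                         # p = intercept + slope
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    intercept = z_bar - slope * x_bar

    resid = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    sse = sum(e * e for e in resid)
    sst = sum((zi - z_bar) ** 2 for zi in z)
    r_squared = 1.0 - sse / sst

    s2 = sse / (n - p)                       # residual variance estimate
    # Leverage of each point in simple linear regression.
    lev = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's D combines residual size and leverage.
    cooks_d = [e * e / (p * s2) * h / (1.0 - h) ** 2
               for e, h in zip(resid, lev)]
    return r_squared, cooks_d

# Made-up logit-scale responses; the last point is pushed off the line.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
z = [-2.1, -1.4, -1.05, -0.45, 0.1, 0.45, 1.1, 1.45, 2.05, 5.5]

r2, cooks = ols_diagnostics(x, z)
threshold = 4.0 / len(x)                     # rule-of-thumb cutoff
flagged = [i for i, d in enumerate(cooks) if d > threshold]
print(f"R-squared = {r2:.3f}, flagged points: {flagged}")
```

Both numbers describe the fit on the logit scale, so a high R-squared here says the transformed relationship is close to linear, not that predictions on the original bounded scale are equally good.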
