I am currently running a multiple regression model using imputed data and have a few questions.
Background:
I am using SPSS 18. My data appear to be MAR. Listwise deletion leaves me with only 92 cases; multiple imputation leaves 153 cases for analysis. All assumptions are met, with one variable log transformed. I have 9 IVs (5 categorical, 3 scale, 1 interval) and a scale DV. I am using the enter method of standard multiple regression.
- My DV is the difference between a pre-score and a post-score measure, and both of these variables are missing a number of cases. Should I impute the missing values for each of them and then compute the difference to get my DV (and if so, how do I go about doing this), or can I just impute the DV directly? Which is the more appropriate approach?
- Should I run imputations on transformed data or skewed untransformed data?
- Should I enter all variables into the imputation process, even if they are not missing data, or should I just impute data for the variables missing more than 10% of cases?
I have run the regression on the listwise-deleted cases, and my IVs account for very little of the variance in my DV. I then ran the regression on the complete file following multiple imputation. The results are very similar, in that my 9 IVs still predict only approximately 12% of the variance in my DV. However, one of my IVs now makes a significant contribution (it happens to be the log-transformed variable).
- Should I report the original (listwise-deleted) results if there is little difference between my conclusions, i.e. my IVs poorly predict the DV, or should I report the results from the imputed data?
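To make the first option concrete (impute the pre and post scores, compute the difference within each imputed dataset, then pool the regression results), here is a minimal sketch in Python with NumPy rather than SPSS. Everything in it is an assumption for illustration: the toy data, the single predictor `x`, the helper names `ols`, `impute_once` and `pool`, and the choice of m = 20. The imputation step is plain stochastic regression imputation, a simplified stand-in for a full multiple imputation procedure (it does not redraw the regression parameters for each imputation). The pooling step follows Rubin's rules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: one complete predictor, pre/post scores with gaps.
n = 153
x = rng.normal(size=n)
pre = 50 + 2 * x + rng.normal(scale=5, size=n)
post = pre + 1.5 * x + rng.normal(scale=5, size=n)
pre_obs, post_obs = pre.copy(), post.copy()
pre_obs[rng.random(n) < 0.2] = np.nan
post_obs[rng.random(n) < 0.2] = np.nan


def ols(X, y):
    """Return OLS coefficients, their sampling variances, and residual variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.diag(cov), sigma2


def impute_once(target, predictors, rng):
    """Fill missing values with a regression prediction plus random noise."""
    miss = np.isnan(target)
    X = np.column_stack([np.ones(len(target))] + predictors)
    beta, _, sigma2 = ols(X[~miss], target[~miss])
    filled = target.copy()
    noise = rng.normal(scale=np.sqrt(sigma2), size=miss.sum())
    filled[miss] = X[miss] @ beta + noise
    return filled


def pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance across imputations."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean(axis=0)            # pooled point estimate
    within = variances.mean(axis=0)           # average within-imputation variance
    between = estimates.var(axis=0, ddof=1)   # between-imputation variance
    return q_bar, within + (1 + 1 / m) * between


m = 20  # number of imputed datasets (an arbitrary choice here)
design = np.column_stack([np.ones(n), x])
ests, variances = [], []
for _ in range(m):
    pre_i = impute_once(pre_obs, [x], rng)           # impute pre from x
    post_i = impute_once(post_obs, [x, pre_i], rng)  # impute post from x and pre
    diff_i = post_i - pre_i                          # DV derived after imputation
    beta, var, _ = ols(design, diff_i)
    ests.append(beta)
    variances.append(var)

pooled_beta, pooled_var = pool(ests, variances)
pooled_se = np.sqrt(pooled_var)
print("pooled coefficients:", pooled_beta)
print("pooled standard errors:", pooled_se)
```

The point of the sketch is only the order of operations: the difference score is computed inside the loop, once per imputed dataset, and the regression estimates are pooled at the end rather than averaging the imputed data first.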
Best Answer
References
Edwards, J. R. (1994). Regression analysis as an alternative to difference scores. Journal of Management, 20, 683-689.
Enders, C. K. (2010). Applied Missing Data Analysis. New York, NY: Guilford Press.