In my dataset there is a lot of multicollinearity between the independent variables, so I cannot include them all in the same regression, especially because my sample size is small. However, my supervisor said that if I categorize some of the variables, I might be able to use more independent variables in the same regression.
Why is this? Does it affect the degrees of freedom, or is there some other reason?
Best Answer
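If you take continuous independent variables and slice them into categories, you don't save degrees of freedom. A continuous variable entered linearly uses one degree of freedom, while a variable with k categories uses k - 1, so the more categories you split into, the more degrees of freedom you lose.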
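To make the degrees-of-freedom count concrete, here is a minimal Python sketch. The helper names `design_columns` and `residual_df`, the toy data, and the cut points are all hypothetical; it assumes an ordinary regression with an intercept, reference-level dummy coding, and no empty categories.

```python
def design_columns(x, cutpoints=None):
    """Return the design-matrix columns (excluding the intercept) for one predictor."""
    if cutpoints is None:
        # Continuous predictor entered linearly: one column, one slope.
        return [list(x)]
    cuts = sorted(cutpoints)
    # Bin index = number of cut points at or below the value.
    bins = [sum(v >= c for c in cuts) for v in x]
    n_categories = len(cuts) + 1
    # Reference-level coding: one dummy per category except the first.
    return [[1 if b == level else 0 for b in bins]
            for level in range(1, n_categories)]


def residual_df(n, predictor_columns):
    """Residual degrees of freedom: n minus the intercept minus one per column.

    Assumes the columns are linearly independent (no empty categories).
    """
    return n - 1 - len(predictor_columns)


x = list(range(20))  # toy predictor, n = 20
n = len(x)

df_continuous = residual_df(n, design_columns(x))                 # 1 column
df_two_bins = residual_df(n, design_columns(x, [10]))             # 1 dummy
df_four_bins = residual_df(n, design_columns(x, [5, 10, 15]))     # 3 dummies

print(df_continuous, df_two_bins, df_four_bins)  # prints: 18 18 16
```

Even a two-way split costs the same single degree of freedom as the continuous variable, and every extra category costs one more.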
Worse, you bias your results, and the fewer categories you use, the worse it is.
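One facet of this, the information thrown away by coarse categories, can be seen in a small simulation. This is only a sketch under assumed conditions: the data are simulated with a truly linear relationship, the seed, sample size, and noise level are arbitrary, and `r_squared` is a hypothetical helper for a one-predictor regression.

```python
import random


def r_squared(x, y):
    """R-squared of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


rng = random.Random(0)  # arbitrary seed, for reproducibility only
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [xi + rng.gauss(0, 1) for xi in x]  # assumed linear truth plus noise

# Median split: the coarsest possible categorization (two categories).
median = sorted(x)[len(x) // 2]
x_binary = [1.0 if xi > median else 0.0 for xi in x]

r2_continuous = r_squared(x, y)
r2_binary = r_squared(x_binary, y)

print(f"R^2 with continuous x:   {r2_continuous:.3f}")
print(f"R^2 with median-split x: {r2_binary:.3f}")
```

Both models spend one degree of freedom on the predictor, yet the median-split version explains noticeably less of the variation in y, because everything that distinguishes observations within each half has been discarded.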
Numerous posts here on Cross Validated advise against this practice.