Solved – Do dummy variables count as independent variables when calculating degrees of freedom in a multiple regression?

categorical data, degrees of freedom, multiple regression

The degrees of freedom in a multiple regression equal N−k−1, where k is the number of variables.
Does k include dummy variables?

For example, I have the model:
Y=B1+B2D2+B3D3+B4D4+B5X1+B6X2+B7(D2D3)+B8(D2D4)+B9(D3D4)+B10(D2D3D4)+u
(there are 2 X variables and 3 dummies)

N=14780
To get the degrees of freedom, would I use 14780-2-1 or 14780-5-1 or something else entirely?

Best Answer

In a multiple regression, $df$ is $N-k-1$, where $N$ is the sample size and $k$ is the number of variables. Why $N-k-1$ and not just $N-k$, then?

The residual $df$ is the sample size minus the number of parameters estimated. In the simplest case, $Y \sim X$, we estimate a regression of the form $Y=\hat{\beta}_0+\hat{\beta}_1 X$. This model has two parameters, $\hat{\beta}_0$ and $\hat{\beta}_1$, but only one variable, so $k=1$ and $df = N-k-1=N-2$. The extra $-1$ accounts for the intercept.

When dealing with a qualitative variable with $\nu$ levels, we create $\nu-1$ dummies and estimate a parameter for each of them. As such, a qualitative variable with $\nu$ levels reduces the $df$ by $\nu-1$; for a dichotomous variable this is 1, so each dummy costs one degree of freedom. In other words, $k$ counts every estimated slope coefficient, including dummies and any interaction terms built from them.
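To make the counting concrete for the model in the question, here is a minimal sketch in Python with NumPy. It is not part of the original answer: the data are simulated stand-ins, and the helper name `residual_df` and the seed are my own assumptions. It builds a design matrix with one column per coefficient (intercept, three dummies, two continuous variables, three two-way interactions, and one three-way interaction) and takes the residual degrees of freedom as the number of rows minus the rank of that matrix.

```python
import numpy as np


def residual_df(design: np.ndarray) -> int:
    """Residual degrees of freedom: number of rows minus rank of the design matrix."""
    n_obs = design.shape[0]
    return n_obs - int(np.linalg.matrix_rank(design))


rng = np.random.default_rng(0)  # arbitrary seed; the data below are simulated
n = 14780                       # sample size from the question

# Three 0/1 dummies and two continuous regressors (hypothetical stand-ins
# for D2, D3, D4, X1, X2 in the question).
d2, d3, d4 = (rng.integers(0, 2, size=n) for _ in range(3))
x1, x2 = rng.normal(size=n), rng.normal(size=n)

# One column per estimated coefficient in the question's model.
X = np.column_stack([
    np.ones(n),                 # intercept
    d2, d3, d4,                 # dummies
    x1, x2,                     # continuous variables
    d2 * d3, d2 * d4, d3 * d4,  # two-way interactions
    d2 * d3 * d4,               # three-way interaction
])

n_params = X.shape[1]           # number of coefficients, intercept included
df_resid = residual_df(X)
print(n_params, df_resid)
```

Assuming none of the columns is an exact linear combination of the others, the model has 10 coefficients (so $k=9$ plus the intercept), and the residual degrees of freedom are $14780-9-1=14770$, not $14780-2-1$ or $14780-5-1$.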