Solved – Fixed effects in a cross-sectional data set

cross-section, fixed-effects-model, least-squares, multivariate-analysis, stata

I am conducting research with a cross-sectional data set (1 year, multiple countries). I am investigating how the likelihood of supportive leadership relates to firm size. Firm size is a categorical variable (1-4), and my teacher has advised me to create binary variables (0 or 1) for each category, excluding one (adding firm-size fixed effects), but I don't understand how to do this.
Should I add i.firmsize? Or is there a better way to do this?

Thank you in advance!

Best Answer

Your teacher is giving you a correct suggestion.

For categorical predictors, one usually defines dummy variables that encode the fact that an observation belongs to one category or another. This is a general approach, so I'll show you a simple model with one fixed effect. Let's say that we have one categorical predictor $X$ that may assume 3 values, $X \in \{X_1, X_2, X_3\}$, and we want to fit a linear model between $X$ and a continuous random variable $Y$. Let's say that we have $N$ observations, $i = 1, \ldots, N$.
Then a linear model would be represented by the equation

$$ Y=X\beta+\epsilon $$

or

$$ y_i = \beta_0 + \beta_1 x_i + \epsilon, \quad i=1,\ldots,N $$

Now, since $X$ is categorical, using it directly in the regression doesn't make sense: $X_1, X_2, X_3$ may be numbers or just labels, and we don't know how to multiply categories by slope coefficients.
Instead, since we are interested in modelling the difference in the average of $Y$ between the different categories, we can use a mathematical trick and introduce dummy variables.

In our case, since we have 3 possible categories, we set one as the reference (it will be absorbed into the intercept of the model) and encode the other two as 0/1 indicator columns:

$$ X_{dummy} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} $$

where the first column is always equal to 0, the second column is equal to 1 for the samples (rows) with $x_i=X2$, and the third column is equal to 1 for the samples with $x_i=X3$. With this encoding the model becomes:

$$ y_i=\beta_0 + \beta_1 * x^{(dummy)}_{i,1} + \beta_2 * x^{(dummy)}_{i,2} + \beta_3 * x^{(dummy)}_{i,3} + \epsilon $$

which means that if $x_i=X1$, then $x^{(dummy)}_{i,1}=0, x^{(dummy)}_{i,2}=0, x^{(dummy)}_{i,3}=0$, giving the equation:

$$ y_i^{(X1)}=\beta_0 + \beta_1 * 0 + \beta_2 * 0 + \beta_3 * 0 + \epsilon = \beta_0 + \epsilon $$

if $x_i=X2$, then $x^{(dummy)}_{i,1}=0, x^{(dummy)}_{i,2}=1, x^{(dummy)}_{i,3}=0$, giving the equation:

$$ y_i^{(X2)}=\beta_0 + \beta_1 * 0 + \beta_2 * 1 + \beta_3 * 0 + \epsilon = \beta_0 + \beta_2 + \epsilon $$

and if $x_i=X3$, then $x^{(dummy)}_{i,1}=0, x^{(dummy)}_{i,2}=0, x^{(dummy)}_{i,3}=1$, giving the equation:

$$ y_i^{(X3)}=\beta_0 + \beta_1 * 0 + \beta_2 * 0 + \beta_3 * 1 + \epsilon = \beta_0 + \beta_3 + \epsilon $$

We can simplify by dropping the term $\beta_1 * x^{(dummy)}_{i,1}$, because $x^{(dummy)}_{i,1}$ is always equal to 0 (and then renaming $\beta_2$ to $\beta_1$ and $\beta_3$ to $\beta_2$), getting the linear model

$$ y_i=\beta_0 + \beta_1 * x^{(dummy)}_{i,2} + \beta_2 * x^{(dummy)}_{i,3} + \epsilon $$
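If you want to build these 0/1 dummies by hand (this is exactly what your teacher means by creating binary variables for each category and excluding one), here is a minimal sketch in R, with made-up variables x and y:

set.seed(1)
x <- sample(c("X1", "X2", "X3"), 100, replace=TRUE)  # categorical predictor
y <- rnorm(100) + 0.5*(x == "X2") + 1.0*(x == "X3")  # continuous response

d2 <- as.integer(x == "X2")  # 1 if the observation is in category X2, else 0
d3 <- as.integer(x == "X3")  # 1 if the observation is in category X3, else 0
                             # X1 is the omitted (reference) category

lm(y ~ d2 + d3)              # fits y_i = beta0 + beta1*d2_i + beta2*d3_i + eps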

After seeing how the model encodes the 3 categories, it becomes easy to see how the parameters can be interpreted:

$$ y_i^{(X1)} = \beta_0 + \epsilon $$

the intercept $\beta_0$ represents the average $Y$ for the samples belonging to category $X_1$.

$$ y_i^{(X2)} = \beta_0 + \beta_1 + \epsilon $$

the first coefficient $\beta_1$ represents the average difference in $Y$ between category $X_2$ samples and category $X_1$ samples.
And finally,

$$ y_i^{(X3)} = \beta_0 + \beta_2 + \epsilon $$

the second coefficient $\beta_2$ represents the average difference in $Y$ between category $X_3$ samples and category $X_1$ samples.
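You can check this interpretation numerically (reusing the made-up x, y, d2 and d3 from the sketch above): with a single categorical predictor, the OLS coefficients reproduce the group means exactly.

fit <- lm(y ~ d2 + d3)
group_means <- tapply(y, x, mean)        # average of y within each category

coef(fit)                                # (Intercept), d2, d3
c(group_means["X1"],                     # should equal the intercept beta0
  group_means["X2"] - group_means["X1"], # should equal beta1 (d2 coefficient)
  group_means["X3"] - group_means["X1"]) # should equal beta2 (d3 coefficient)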

Important: is this the only way to encode the categories in a linear model? No. There are other ways, each of them requiring an appropriate change in the interpretation of the model parameters.
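For example (a minimal sketch using base R's built-in contr.sum; it reuses the made-up x from above, but any factor works), sum-to-zero coding makes the intercept the unweighted average of the three category means, and each coefficient a category's deviation from that average:

x_factor <- factor(x)                # turn the categorical variable into a factor
contrasts(x_factor) <- contr.sum(3)  # switch from treatment coding to sum coding
head(model.matrix(~ x_factor))       # columns are no longer simple 0/1 dummies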

Practical aspects:

If you use R, this is done automatically by setting the categorical variable as a factor:

X <- sample(c("c1", "c2", "c3"), 20, replace=TRUE)  # 20 random category labels
Y <- rnorm(20)                                      # continuous response

X <- factor(X)  # tell R that X is categorical

model.matrix(Y~X, data.frame(X, Y))  # design matrix with the dummy (treatment) coding

gives you:

   (Intercept) Xc2 Xc3
1            1   1   0
2            1   0   0
3            1   0   1
4            1   0   0
5            1   0   0
6            1   0   0
7            1   1   0
8            1   0   0
9            1   0   0
10           1   1   0
11           1   0   1
12           1   1   0
13           1   0   1
14           1   0   1
15           1   1   0
16           1   0   1
17           1   1   0
18           1   0   1
19           1   1   0
20           1   0   0
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$X
[1] "contr.treatment"

with the intercept column always equal to 1, matching the formulation $Y=X\beta$ with $\beta=(\beta_0, \beta_1, \beta_2)$.
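Fitting the model itself is then a one-liner; lm() applies this treatment coding automatically, so the intercept estimates the mean of Y in the reference category c1 and the coefficients Xc2 and Xc3 estimate the differences from it:

fit <- lm(Y ~ X, data.frame(X, Y))  # same dummy coding as the design matrix above
summary(fit)

To answer your Stata question: yes, using i.firmsize in your regression does exactly this expansion; Stata creates the indicator variables for firmsize and omits the base category for you, which is the same encoding as described above.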
