# Solved – Fixed effects in a cross-sectional data set

cross-section, fixed-effects-model, least-squares, multivariate-analysis, stata

I am conducting research with a cross-sectional data set (1 year, multiple countries), studying how the likelihood of supportive leadership relates to firm size. Firm size is a categorical variable (1-4), and my teacher has advised me to create binary variables (0 or 1) for each category, excluding one (adding firm-size fixed effects), but I don't understand how to do this.
Should I add i.firmsize? Or is there a better way to do this?

Your teacher is giving you a correct suggestion (and yes, in Stata adding `i.firmsize` to your regression does exactly this: it expands firmsize into 0/1 indicator variables, omitting the base category).

For categorical predictors, one usually defines dummy variables that encode which category each observation belongs to. This is a general approach, so I'll show it on a simple one-factor model. Let's say that we have one categorical predictor $$X$$ that may assume 3 values, $$X \in \{X_1,X_2,X_3\}$$, and we want to fit a linear model between $$X$$ and a continuous random variable $$Y$$. Let's say that we have $$N$$ observations $$(x_i, y_i), \, i=1,\ldots,N$$.
Then a linear model would be represented by the equation

$$Y=X\beta+\epsilon$$

or

$$y_i = \beta_0 + \beta_1 x_i + \epsilon, \quad i=1,\ldots,N$$

Now, since we have three possible categories for $$X$$, using the category values directly doesn't make sense: $$X_1, X_2, X_3$$ may be labels rather than numbers, and we don't know how to multiply categories by slope coefficients.
Instead, since we are interested in modelling the difference in the average of $$Y$$ between the categories, we can use a mathematical trick: introducing dummy variables.

In our case, since we have 3 possible categories, we set one as the reference (it will be absorbed into the intercept of the model) and encode the other two as 0/1 indicator variables:

$$X_{dummy} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$

where the first column is always equal to 0, the second column is equal to 1 for the samples (rows) with $$x_i=X_2$$, and the third column is equal to 1 for the samples with $$x_i=X_3$$. With this encoding the model becomes

$$y_i=\beta_0 + \beta_1 \cdot x^{(dummy)}_{i,1} + \beta_2 \cdot x^{(dummy)}_{i,2} + \beta_3 \cdot x^{(dummy)}_{i,3} + \epsilon$$

which means that if $$x_i=X_1$$, then $$x^{(dummy)}_{i,1}=0, x^{(dummy)}_{i,2}=0, x^{(dummy)}_{i,3}=0$$, giving the equation:

$$y_i^{(X_1)}=\beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \beta_3 \cdot 0 + \epsilon = \beta_0 + \epsilon$$

if $$x_i=X_2$$, then $$x^{(dummy)}_{i,1}=0, x^{(dummy)}_{i,2}=1, x^{(dummy)}_{i,3}=0$$, giving the equation:

$$y_i^{(X_2)}=\beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \beta_3 \cdot 0 + \epsilon = \beta_0 + \beta_2 + \epsilon$$

and if $$x_i=X_3$$, then $$x^{(dummy)}_{i,1}=0, x^{(dummy)}_{i,2}=0, x^{(dummy)}_{i,3}=1$$, giving the equation:

$$y_i^{(X_3)}=\beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \beta_3 \cdot 1 + \epsilon = \beta_0 + \beta_3 + \epsilon$$

We can simplify by dropping $$\beta_1$$, since its dummy column is always equal to 0, and renaming $$\beta_2$$ to $$\beta_1$$ and $$\beta_3$$ to $$\beta_2$$. This gives the linear model

$$y_i=\beta_0 + \beta_1 \cdot x^{(dummy)}_{i,2} + \beta_2 \cdot x^{(dummy)}_{i,3} + \epsilon$$
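As a minimal sketch, the same reference-category encoding can be built by hand in R (the observations and category labels here are hypothetical):

```r
# Hypothetical observations of a 3-level categorical variable
x <- c("X1", "X1", "X2", "X2", "X3", "X3")

# One 0/1 indicator column per non-reference category (X1 is the reference)
d2 <- as.integer(x == "X2")   # equals 1 only for X2 observations
d3 <- as.integer(x == "X3")   # equals 1 only for X3 observations

cbind(d2, d3)
```

Together with a column of 1s for the intercept, these two columns form exactly the design matrix of the simplified model.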

After seeing how the model encodes the 3 categories, it becomes easy to see how the parameters can be interpreted:

$$y_i^{(X_1)} = \beta_0 + \epsilon$$

the intercept $$\beta_0$$ represents the average $$Y$$ for the samples belonging to category $$X_1$$.

$$y_i^{(X_2)} = \beta_0 + \beta_1 + \epsilon$$

the first coefficient $$\beta_1$$ represents the average difference in $$Y$$ between category $$X_2$$ and category $$X_1$$.
And finally,

$$y_i^{(X_3)} = \beta_0 + \beta_2 + \epsilon$$

the second coefficient $$\beta_2$$ represents the average difference in $$Y$$ between category $$X_3$$ and category $$X_1$$.
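We can check this interpretation numerically. A minimal sketch in R (the group means 1, 3, 5 are made up for illustration):

```r
set.seed(42)  # reproducibility of the simulated data
x <- factor(rep(c("X1", "X2", "X3"), each = 30))   # categorical predictor
y <- c(rnorm(30, mean = 1),                        # category X1, mean 1
       rnorm(30, mean = 3),                        # category X2, mean 3
       rnorm(30, mean = 5))                        # category X3, mean 5

fit <- lm(y ~ x)  # R builds the dummy encoding internally

# intercept = average y in the reference category X1
isTRUE(all.equal(unname(coef(fit)[1]), mean(y[x == "X1"])))
# slope for X2 = difference between the X2 and X1 group averages
isTRUE(all.equal(unname(coef(fit)[2]),
                 mean(y[x == "X2"]) - mean(y[x == "X1"])))
```

Both checks return TRUE: with this dummy (treatment) coding, the fitted coefficients reproduce the group means exactly.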

Important: is this the only way to encode the categories in a linear model? No. There are other ways, each requiring a corresponding change in the interpretation of the model parameters.
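For example, R also offers sum-to-zero ("deviation") coding, in which the intercept becomes the mean of the category means rather than the reference-category mean. A quick sketch (the factor values are hypothetical):

```r
# Hypothetical 3-level factor, two observations per level
x <- factor(rep(c("X1", "X2", "X3"), each = 2))

# Request sum-to-zero contrasts instead of the default treatment coding
M <- model.matrix(~ x, contrasts.arg = list(x = "contr.sum"))
M
# The last level (X3) is now encoded as (-1, -1) instead of having its own column
```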

Practical aspects:

If you use R, this encoding is done automatically when you declare the categorical variable as a factor:

X <- sample(c("c1", "c2", "c3"), 20, replace=TRUE)  # categorical predictor
Y <- rnorm(20)                                      # continuous response

X <- factor(X)  # declare X as categorical

model.matrix(Y~X, data.frame(X, Y))  # inspect the dummy-encoded design matrix


gives you:

   (Intercept) Xc2 Xc3
1            1   1   0
2            1   0   0
3            1   0   1
4            1   0   0
5            1   0   0
6            1   0   0
7            1   1   0
8            1   0   0
9            1   0   0
10           1   1   0
11           1   0   1
12           1   1   0
13           1   0   1
14           1   0   1
15           1   1   0
16           1   0   1
17           1   1   0
18           1   0   1
19           1   1   0
20           1   0   0
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$X
[1] "contr.treatment"


with the intercept column all equal to 1, matching the matrix formulation $$Y=X\beta+\epsilon$$ with $$\beta=(\beta_0, \beta_1, \beta_2)$$; R has taken the first level, c1, as the reference category.
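One last practical note: if you want a different category as the reference (the one absorbed into the intercept), you can relevel the factor before fitting. A small sketch with hypothetical labels:

```r
X <- factor(c("c1", "c2", "c3", "c1"))

# Make c2 the reference (omitted) category instead of the default c1
X <- relevel(X, ref = "c2")

colnames(model.matrix(~ X))
# dummy columns are now Xc1 and Xc3; c2 is absorbed into the intercept
```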