Fixed effects at industry & year level for firm-level data

fixed-effects-model, multiple-regression

Various papers that study firm-level effects include dummy variables at the industry & year level.

From what I understand, calculating fixed effects requires panel data, i.e. (for firm-level data) unique firm-to-year matches.

But unless there is exactly one firm per industry, there can be no unique matches for the fixed effects we are looking at, i.e. no unique industry-to-year matches.

My questions are:

(1) Is the common way to deal with this to obtain the average (of the dependent variable) across all firms per industry for each year? This average could then be included as a regressor.

(2) Or should I obtain de-meaned values of the dependent variable per industry and year, and include these as a regressor? That is, for each observation, create one new variable that contains the observation minus the industry mean (across all years in the sample) and one that contains the observation minus the year mean (across all industries in the sample)?

Or (3) should I just include industry and year dummy variables instead?

Best Answer

As Jesper Hybel mentioned in his comment, you can go with option (3). The least-squares dummy variable (LSDV) estimator and the de-meaning approach (1) give you the same results. LSDV has the advantage that it allows you to easily extract the values of the fixed effects (useful if you are interested not only in controlling for industry and year heterogeneity but also in seeing, for example, which industries have particularly large values of the dependent variable).
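To make the setup concrete, here is one way to write the model behind option (3). The notation is an assumption of this sketch and not taken from the question: $i$ indexes firms, $t$ years, $j(i)$ is the industry of firm $i$, and $x_{it}$ stands for a single regressor.

$$
y_{it} = \beta\, x_{it} + \alpha_{j(i)} + \lambda_t + \varepsilon_{it}
$$

The industry effect $\alpha_{j(i)}$ is shared by all firms in the same industry and the year effect $\lambda_t$ by all observations in the same year. Each effect is estimated from the many observations that share it, so unique industry-to-year matches are not needed.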

There is a (theoretical) issue with LSDV: it might be inconsistent if the number of industries or years goes to infinity. This is called the incidental parameters problem; in that case you would have to estimate an infinite number of parameters (dummies). This is not a problem with the de-meaning approach. In my view, however, this is not a concern for applied researchers.

There are excellent answers regarding the equivalence of the least-squares dummy variable estimator and the de-meaning approach, see here or here
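The equivalence can also be checked numerically. The following is a minimal sketch on simulated data, not a recommended workflow: all variable names, sample sizes and the true slope of 2.0 are assumptions made for illustration. Because firm-level data usually has unequal numbers of observations per industry and year, the de-meaning step here alternates between removing industry means and year means until nothing changes, and does not rely on a single pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical firm-year panel: 60 firms, 8 industries, 10 years.
n_firms, n_ind, n_years = 60, 8, 10
firm_industry = np.arange(n_firms) % n_ind       # each firm belongs to one industry
industry = np.repeat(firm_industry, n_years)     # industry id per observation
year = np.tile(np.arange(n_years), n_firms)      # year id per observation

# Drop about 20% of the observations at random to make the panel unbalanced.
keep = rng.random(n_firms * n_years) < 0.8
industry, year = industry[keep], year[keep]

# Simulated data: true slope 2.0 plus industry and year effects.
ind_eff = rng.normal(size=n_ind)
year_eff = rng.normal(size=n_years)
x = rng.normal(size=industry.size) + 0.5 * ind_eff[industry]
y = 2.0 * x + ind_eff[industry] + year_eff[year] + rng.normal(size=x.size)

# LSDV: regress y on x, all industry dummies, and year dummies (first year omitted).
D_ind = (industry[:, None] == np.arange(n_ind)).astype(float)
D_year = (year[:, None] == np.arange(1, n_years)).astype(float)
X_lsdv = np.column_stack([x, D_ind, D_year])
coef, *_ = np.linalg.lstsq(X_lsdv, y, rcond=None)
beta_lsdv = coef[0]
industry_effects = coef[1:1 + n_ind]             # estimated industry fixed effects


def demean_two_way(v, g1, g2, n_iter=1000, tol=1e-12):
    """Remove group means for g1 and g2 in turn until the values stop changing."""
    v = np.array(v, dtype=float)
    for _ in range(n_iter):
        before = v.copy()
        for g in (g1, g2):
            means = np.bincount(g, weights=v) / np.bincount(g)
            v -= means[g]
        if np.max(np.abs(v - before)) < tol:
            break
    return v


# De-meaning: regress de-meaned y on de-meaned x (no dummies needed).
x_w = demean_two_way(x, industry, year)
y_w = demean_two_way(y, industry, year)
beta_within = (x_w @ y_w) / (x_w @ x_w)

print(beta_lsdv, beta_within)
```

The two slope estimates agree up to numerical precision. Only the LSDV regression returns the industry effects directly (here in `industry_effects`), which is the practical advantage mentioned above.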

(2) is a related but somewhat different approach called the Mundlak approach (there are similar alternatives with other names), which allows you to estimate the effect of, for example, time-invariant variables, which would cancel out in the LSDV or de-meaning fixed-effects approach.
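As a rough illustration of the Mundlak idea, the sketch below adds the industry mean of the regressor as an extra regressor in a pooled regression. To keep it short it assumes industry-level effects only (no year effects), and all names and numbers, including the industry-level variable `z_ind`, are hypothetical. In this one-way setting the coefficient on `x` coincides with the within (de-meaning) estimate, while a coefficient on the industry-level variable can still be estimated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 600 observations spread over 12 industries.
n_ind, n_obs = 12, 600
industry = rng.integers(0, n_ind, size=n_obs)

z_ind = rng.normal(size=n_ind)    # observed variable that varies only across industries
a_ind = rng.normal(size=n_ind)    # unobserved industry effect, correlated with x
x = rng.normal(size=n_obs) + a_ind[industry]
y = 1.5 * x + 0.7 * z_ind[industry] + a_ind[industry] + rng.normal(size=n_obs)

# Industry mean of x, expanded back to one value per observation.
x_bar = (np.bincount(industry, weights=x) / np.bincount(industry))[industry]

# Mundlak-style pooled regression: constant, x, industry mean of x, industry-level z.
X = np.column_stack([np.ones(n_obs), x, x_bar, z_ind[industry]])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_mundlak, gamma_z = coef[1], coef[3]

# Within estimator for comparison: de-mean x by industry.
x_w = x - x_bar
beta_within = (x_w @ y) / (x_w @ x_w)

print(beta_mundlak, beta_within, gamma_z)
```

In a regression with a full set of industry dummies, the column for `z_ind` would be perfectly collinear with the dummies, so its coefficient could not be estimated. Here it can, although whether that estimate is reliable depends on assumptions this sketch does not check.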
