Solved – When should I use contrast coding

categorical data, categorical-encoding, contrasts, interaction, stata

I have what I think is a simple question. I am using Stata 13 to run a Tobit model to understand differences in firm performance. Among other things, I am controlling for firm type $T_i$, i.e. Single-Owner Firms ($SOF$) vs. Multiple-Owner Firms ($MOF$).

So far I have dummy-coded $T_i$ so that $MOF=0$ and $SOF=1$. Here is why:

Given a simple relationship such as

$y_i= a + \beta_1x_i + \beta_2T_i +\beta_3x_iT_i$

We can show that

$y_i= (a + \beta_1x_i) + T_i(\beta_2+\beta_3x_i)$

The lower-order coefficient $\beta_1$ shows the simple effect of $x_i$ on $y_i$ for $T_i=0$, since $y_i= a + \beta_1x_i$ when $T_i=0$. Similarly, $\beta_2$ shows the simple effect of $T_i$ on $y_i$ at $x_i=0$, since $y_i= a + \beta_2T_i$ when $x_i=0$.

Finally, as far as I understand it, $\beta_3$ shows how the slope of $x_i$ differs between the firm types $T_i=0$ and $T_i=1$.
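
To check this reading numerically, here is a minimal sketch in Python (NumPy) rather than Stata. It assumes a plain linear model fitted by least squares, not a Tobit, and the data and the coefficient values `a`, `b1`, `b2`, `b3` are made up for illustration only.

```python
import numpy as np

# Simulated, noise-free data; all numbers here are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
T = rng.integers(0, 2, size=n)          # dummy coding: 0 = MOF, 1 = SOF
a, b1, b2, b3 = 1.0, 2.0, 0.5, -1.5
y = a + b1 * x + b2 * T + b3 * x * T

# Least-squares fit of y on [1, x, T, x*T].
X = np.column_stack([np.ones(n), x, T, x * T])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Slopes obtained by fitting each firm type separately.
slope_mof = np.polyfit(x[T == 0], y[T == 0], 1)[0]
slope_sof = np.polyfit(x[T == 1], y[T == 1], 1)[0]

print("beta_1 vs. slope for T=0:         ", coef[1], slope_mof)
print("beta_1 + beta_3 vs. slope for T=1:", coef[1] + coef[3], slope_sof)
```

If the interpretation above is right, the first pair of numbers should agree (the slope for $T_i=0$ is $\beta_1$) and so should the second (the slope for $T_i=1$ is $\beta_1+\beta_3$), which makes $\beta_3$ the difference in slopes.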

However, I have now been advised to contrast-code $T_i$ so that $MOF=-1$ and $SOF=1$.

I have three questions:

  1. Is the interpretation scheme I put forward above correct?
  2. Why and when should one use dummy and contrast (effect) coding?
  3. And if I use contrast coding, how would I interpret the resulting coefficients?

Best Answer

  1. Yes, with the caveat that Tobit models have multiple marginal effects and it is not clear which ones you care about here.
  2. Contrast coding is typically used when you have a categorical variable with more than two levels (something like education: dropout, high school, college, and graduate) and you are interested in comparing the marginal effects to each other, not just to the omitted base level. However, with Stata's "margins, contrast" command, this is less useful than it once was. Since you only have two levels, contrast coding does not seem very useful here.
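
To illustrate why the choice matters little with two levels, here is a rough sketch in Python (NumPy), not Stata. It assumes an ordinary linear model fitted by least squares on simulated data rather than a Tobit, and the variable names and numbers are hypothetical. Writing the dummy as $D_i\in\{0,1\}$ and the contrast code as $C_i = 2D_i-1\in\{-1,1\}$, substituting $D_i=(C_i+1)/2$ into the model gives a new intercept $a+\beta_2/2$, a slope $\beta_1+\beta_3/2$ on $x_i$, a coefficient $\beta_2/2$ on $C_i$, and a coefficient $\beta_3/2$ on $x_iC_i$.

```python
import numpy as np

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
D = rng.integers(0, 2, size=n)          # dummy coding:    MOF = 0,  SOF = 1
C = 2 * D - 1                           # contrast coding: MOF = -1, SOF = 1
y = 1.0 + 2.0 * x + 0.5 * D - 1.5 * x * D + rng.normal(scale=0.1, size=n)


def ols(design, response):
    """Return least-squares coefficients for the given design matrix."""
    return np.linalg.lstsq(design, response, rcond=None)[0]


X_dummy = np.column_stack([np.ones(n), x, D, x * D])
X_contrast = np.column_stack([np.ones(n), x, C, x * C])

a, b1, b2, b3 = ols(X_dummy, y)
a_c, g1, g2, g3 = ols(X_contrast, y)

# Both codings span the same column space, so the fitted values coincide.
fit_dummy = X_dummy @ np.array([a, b1, b2, b3])
fit_contrast = X_contrast @ np.array([a_c, g1, g2, g3])

print("intercept:  ", a_c, "vs a + b2/2  =", a + b2 / 2)
print("x:          ", g1, "vs b1 + b3/2 =", b1 + b3 / 2)
print("type:       ", g2, "vs b2/2      =", b2 / 2)
print("interaction:", g3, "vs b3/2      =", b3 / 2)
```

Under these assumptions the two fits are the same model written two ways: with $\pm 1$ coding the coefficient on $x_i$ is the unweighted average of the two firm types' slopes instead of the slope for $MOF$, and the coefficients on $T_i$ and on the interaction are half of their dummy-coded counterparts. In a Tobit the same algebra applies to the coefficients on the latent index, but the marginal effects mentioned in point 1 still have to be computed separately.
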