Test for differences in coefficients across groups in panel data

fixed-effects-model, panel data, regression coefficients, t-test

I am wondering how to test for differences in regression coefficients across groups in panel data (after a fixed-effects regression).

In particular, I can't work out how to construct interaction terms when the groups you are interested in are not the same as the groups at which you set your fixed effects.

Example:

You have two types of survey respondents (2 groups): adults and children. They answer questions about different sorts of ice-cream (fixed effects at ice-cream level).
The panel data set for a simple fixed-effects regression would look like this (all variables aggregated over adults and children):

   +----------------------------------------+
   | day | ice-cream | rating | cost  | size| 
   |-----+-----------+--------+-------+-----+
1. |  1  |  choco    |  1     |  4    |  3  | 
2. |  1  |  hazelnut |  9     |  2    |  1  | 
3. |  2  |  hazelnut |  3.5   |  1    |  1  | 
4. |  3  |  berry    |  3.5   |  5    |  2.5| 
5. |  3  |  vanilla  |  4     |  3    |  2  | 
etc |  . |  ......   |  .     |  .    |  ...| 
   +----------------------------------------+

How could you include interaction terms for adults and children in your regression?

I could run the regressions separately on two datasets (one for adults, one for children), but how could I test for significant differences then?

Thank you very much!

I asked a similar question here, but maybe it was too detailed or not clear enough.

EDIT in response to @vinnief:
I want to outline how I understand @vinnief's suggestions by illustrating the potential structure of the resulting data set. I forgot to mention that the dependent variable (rating) is not derived from the respondents' answers (imagine a gourmet magazine that rates the ice-cream types; the rating still varies over time).

The first suggestion was

making a new "individuals" index, which is ice-cream type*child/adult,
so separate the data into twice as many rows

    +-----+-----------------+--------+------+------+---------+---------+
    | day | ice-cream_GROUP | rating | cost | size | d_adult | d_child |
    |-----+-----------------+--------+------+------+---------+---------|
 1. |  1  | choco_adult     |  1     |  2   |  3   |    1    |    0    |
 2. |  1  | hazelnut_adult  |  9     |  3   |  4   |    1    |    0    |
 3. |  1  | hazelnut_child  |  9     |  4   |  1.4 |    0    |    1    |
 4. |  3  | choco_adult     |  3.5   |  5   |  1.5 |    1    |    0    |
 5. |  3  | choco_child     |  3.5   |  3   |  2   |    0    |    1    |
etc |  .  | ......          |  .     |  .   |  ... |    1    |    0    |
    +-----+-----------------+--------+------+------+---------+---------+

Is this what you mean? If so, how would I set up the fixed-effects regression to test for differences in coefficients between adults and children?

The second suggestion was

what if you leave the index as is, but separate child and adult
ratings into separate columns? Would that help you?

    +-----+-----------+--------+------------+------------+------------+------------+
    | day | ice-cream | rating | cost_adult | cost_child | size_adult | size_child |
    |-----+-----------+--------+------------+------------+------------+------------|
 1. |  1  | choco     |  1     |     2      |     1      |     0      |     0      |
 2. |  1  | hazelnut  |  9     |     3      |     4      |     4      |     1.4    |
 3. |  3  | choco     |  1     |     5      |     3      |     1.5    |     2      |
etc |  .  | ......    |  .     |     .      |     .      |     .      |     .      |
    +-----+-----------+--------+------------+------------+------------+------------+

Are you suggesting running a fixed-effects regression with all the independent variables (cost_adult, cost_child, size_adult, size_child)? How would I then test for differences in coefficients between adults and children? Again, a big thank you for helping me!

Best Answer

A simple but probably imperfect option for separate datasets (or even ones you don't actually have, provided you at least have the necessary sample statistics) is to construct confidence intervals for the two coefficient estimates and find the highest confidence level at which the intervals still do not overlap. I've used this method once myself. It ought to give you roughly the same information as a more formal significance test you might wish to use instead (I've found such a test elusive, personally, as has a colleague of mine with lots of experience in regression).
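To make the comparison concrete, here is a minimal Python sketch of the interval-overlap check next to one commonly used large-sample z statistic for the difference of two coefficients, z = (b1 - b2) / sqrt(se1^2 + se2^2). That formula assumes the two regressions were fitted on independent samples, which may not hold if adults and children rate the same ice-creams on the same days. The function names and all the numbers are hypothetical, chosen only for illustration.

```python
import math

def coef_difference_z(b1, se1, b2, se2):
    """Large-sample z test for b1 - b2, ASSUMING independent samples."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (b1 - b2) / se_diff
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

def intervals_overlap(b1, se1, b2, se2, crit=1.96):
    """True if the two normal-approximation intervals overlap (95% by default)."""
    lo1, hi1 = b1 - crit * se1, b1 + crit * se1
    lo2, hi2 = b2 - crit * se2, b2 + crit * se2
    return lo1 <= hi2 and lo2 <= hi1

# Hypothetical estimates: slope on cost for adults and for children.
b_adult, se_adult = 0.80, 0.15
b_child, se_child = 0.30, 0.15

z, p = coef_difference_z(b_adult, se_adult, b_child, se_child)
overlap = intervals_overlap(b_adult, se_adult, b_child, se_child)
print(f"z = {z:.3f}, p = {p:.4f}, 95% intervals overlap: {overlap}")
```

With these made-up numbers the two 95% intervals overlap even though the z statistic exceeds 1.96, so the overlap check is the more conservative of the two and is best read as a rough guide.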

I think @DLDahly answered your question about interaction terms fairly well in his comment on your last post (which one might argue you've mostly duplicated here, which isn't accepted practice). Why not dummy-code your two age groups (if each is homogeneous enough and the split is not a spurious dichotomization), include this binary variable and its product with the other predictor (assuming that predictor is coded correctly too, if it's nominal data), and investigate the interaction term's contribution to the model that way?
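As a rough illustration of the dummy-plus-interaction idea, the sketch below simulates a stacked data set (one row per ice-cream, group and day), removes the ice-cream fixed effects by demeaning within ice-cream, and looks at the t statistic of the `cost * child` term. Everything here is an assumption made for the example: the variable names, the simulated outcome that varies by group, the true slopes, and the use of classical (homoskedastic) standard errors, where real panel data would often call for clustered ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stacked layout: 6 ice-creams, 2 groups, 40 days each.
n_items, n_days = 6, 40
item = np.repeat(np.arange(n_items), n_days * 2)
child = np.tile(np.repeat([0, 1], n_days), n_items)  # 0 = adult, 1 = child
cost = rng.uniform(1, 5, size=item.size)

# Simulated outcome: adult slope -0.5, child slope -1.0 (interaction -0.5).
item_effect = rng.normal(0, 2, size=n_items)[item]
noise = rng.normal(0, 0.5, size=item.size)
y = item_effect + 0.3 * child + (-0.5 - 0.5 * child) * cost + noise

def demean_by(group, x):
    """Within transformation: subtract each group's mean from its rows."""
    means = np.bincount(group, weights=x) / np.bincount(group)
    return x - means[group]

# Regressors: cost, group dummy, and their product (the interaction term).
X = np.column_stack([cost, child, cost * child]).astype(float)
Xw = np.column_stack([demean_by(item, X[:, j]) for j in range(X.shape[1])])
yw = demean_by(item, y)

beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# Classical standard errors; degrees of freedom account for the item means.
resid = yw - Xw @ beta
dof = y.size - n_items - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xw.T @ Xw)))

t_interaction = beta[2] / se[2]
print("coefficients (cost, child, cost*child):", beta)
print("t statistic for cost*child:", t_interaction)
```

The coefficient on `cost * child` estimates how much the cost slope differs between children and adults, so its t statistic serves as the test for a difference in coefficients across the two groups while the fixed effects stay at the ice-cream level.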

This page gives an example of dummy-coded variables and interactions toward the bottom. If both of your variables are nominal, and you have repeated measures from certain subjects, you might consider dropping the regression approach and using a mixed-design ANOVA too.
