I am doing research on the 50 most active companies over a period of 5 years. The set of the 50 most active companies changes every year. I want to determine how my dependent variable is affected by the explanatory variables, without regard to time or company. Can I just pool all the data into one group and run my regression?
Solved – Pooled data in regression analysis
pooling regression
Best Answer
Pooled OLS regressions on panel data are usually frowned upon. First, if the conditional exogeneity condition $$ E[\alpha_{i}+u_{it} \mid X_{it}]=0 $$ holds, then you might as well use a random effects estimator. (In the above expression, $\alpha_{i}$ is the time-invariant, individual-specific nuisance parameter and $u_{it}$ is the general error term.) The random effects estimator is a GLS-type estimator and is more efficient than the pooled OLS estimator. In many cases, however, this condition does not hold. People then turn to a fixed effects estimator, which effectively removes the nuisance parameter and relies on within-subject, over-time variation. You can actually 'test' the condition by conducting a Hausman test, which is based on the weighted "squared" difference between the fixed effects and random effects estimators. If you reject the null, you are better off using a fixed effects estimator. In any case, it is weakly better to use a random effects estimator than a pooled OLS one.
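To make the difference concrete, here is a minimal simulation sketch in Python (NumPy only) comparing pooled OLS with the fixed effects (within) estimator when the exogeneity condition fails. Everything in it is an assumption made for illustration, not something taken from the question: the data are simulated as a balanced panel of 50 firms over 5 years, the true slope is set to 2, and the firm effect is deliberately made to correlate with the regressor. The helper names `ols_slope` and `demean_by_group` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_years, beta_true = 50, 5, 2.0  # assumed sizes and slope, for illustration only

# Firm-specific effect alpha_i, deliberately correlated with the regressor x,
# so that E[alpha_i + u_it | X_it] = 0 does NOT hold in this simulation.
alpha = rng.normal(size=n_firms)
firm = np.repeat(np.arange(n_firms), n_years)  # firm index for each observation
x = 0.8 * alpha[firm] + rng.normal(size=n_firms * n_years)
u = rng.normal(scale=0.5, size=n_firms * n_years)
y = beta_true * x + alpha[firm] + u


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))


def demean_by_group(v, groups, n_groups):
    """Within transform: subtract each group's own mean from its observations."""
    sums = np.bincount(groups, weights=v, minlength=n_groups)
    counts = np.bincount(groups, minlength=n_groups)
    return v - (sums / counts)[groups]


# Pooled OLS ignores the panel structure entirely.
beta_pooled = ols_slope(x, y)

# Fixed effects: demeaning within each firm removes alpha_i before running OLS.
beta_fe = ols_slope(
    demean_by_group(x, firm, n_firms),
    demean_by_group(y, firm, n_firms),
)

print(f"true slope: {beta_true}")
print(f"pooled OLS: {beta_pooled:.3f}")
print(f"fixed effects (within): {beta_fe:.3f}")
```

In this setup the pooled estimate absorbs part of the firm effect and drifts away from the true slope, while the within estimate stays close to it. With real data you would more likely use a panel package that also provides random effects and standard errors; the hand-rolled version here is only meant to show what the within transformation does.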