This answer comes in two parts. The first addresses the issue of the standard errors and why that implies the models are not identical, as @ndoogan observed in comments, and the second addresses, partially, the issue of when the coefficient estimates might be substantially different.
Consider, for example, a hypothesis test on the coefficient of log(Vmaj) where the null is that the coefficient equals 0.5. The two sets of model estimates would result in rejection of the null in the Poisson case (unless testing at a very high confidence level) and failure to reject the null in the NB case.
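To make that concrete, here is a minimal R sketch of such a Wald test. The coefficient estimate and the two standard errors below are made-up numbers, chosen only to illustrate how the same point estimate can lead to opposite conclusions:

```r
# Illustrative Wald test of H0: beta = 0.5
# (estimate and SEs are hypothetical, for illustration only)
b       <- 0.56                    # common coefficient estimate
se_pois <- 0.02                    # Poisson-based std. error
se_nb   <- 0.05                    # NB-based std. error

z_pois <- (b - 0.5) / se_pois      # z = 3.0
z_nb   <- (b - 0.5) / se_nb        # z = 1.2

2 * pnorm(-abs(z_pois))            # ~0.003: reject H0
2 * pnorm(-abs(z_nb))              # ~0.23: fail to reject H0
```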
More generally, there is more to a collection of estimates than just the point estimates. In the presence of (Negative binomial-like) overdispersion, the standard errors produced by the Negative Binomial-based model will be more accurate estimates of the underlying standard deviation of the coefficient estimates. Thus, the NB model as a whole is more accurate.
For example, a simple model with $\mu_i = \exp\{1 + x_i\}$, and $y_i \sim NB$ with mean $\mu_i$ and variance $5\mu_i$:
library(MASS)  # for glm.nb

N <- 1000
mu <- exp(1 + (x <- rnorm(N)))
# rnbinom(n, size, prob) has mean size*(1-p)/p and variance mean/p,
# so size = mu*p/(1-p) with p = 0.2 gives mean mu and variance 5*mu
p <- 0.2
r <- mu * p / (1 - p)
y <- rnbinom(N, r, p)
poisson_summary <- summary(glm(y ~ x, family = "poisson"))
nb_summary <- summary(glm.nb(y ~ x))

# Parametric bootstrap calculation of the s.e. of the coefficient of x
coef_x <- rep(0, 1000)
for (i in 1:length(coef_x)) {
  mu <- exp(1 + (x <- rnorm(N)))
  r <- mu * p / (1 - p)
  y <- rnbinom(N, r, p)
  coef_x[i] <- summary(glm(y ~ x, family = "poisson"))$coefficients["x", "Estimate"]
}

data.frame("Sim. SE" = sd(coef_x - 1),
           "Poisson SE" = poisson_summary$coefficients["x", "Std. Error"],
           "N.B. SE" = nb_summary$coefficients["x", "Std. Error"])
Sim..SE Poisson.SE N.B..SE
1 0.03492467 0.01273354 0.03984743
where you can see that the simulated standard deviation of the coefficient estimate is roughly 3x the Poisson-based standard error, and is much better matched by the NB-based standard error.
The coefficient estimates are pretty much the same, as one might expect with this sample size and the std. errors above, so I won't take up space by displaying them.
Additionally, although this is often a minor effect when overdispersion is low, by weighting the observations with more accurate weights (i.e., better estimates of observation-specific variances), the parameter estimates themselves will be more accurate, well, asymptotically at any rate. The rule of thumb I learned long ago was that heteroskedasticity adjustments (for that is what they are, in essence) don't buy you much unless the differences between weights are on the order of 5x or more.
Note, however, that in small samples you may well get more accurate (in MSE terms) point estimates with the Poisson model if there isn't much overdispersion, because you are reducing the variability induced by estimating the dispersion parameter. Of course, this is almost certainly more than offset by the loss of accuracy in the standard errors of the coefficient estimates.
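If you want to check that claim in your own setting, a quick simulation along these lines will do it. This is a sketch only; the sample size, degree of overdispersion, and seed are arbitrary choices, and glm.nb can fail to converge in very small samples, which the tryCatch handles:

```r
library(MASS)

set.seed(3)
n_sim <- 200
n <- 30                                # small samples
err_pois <- err_nb <- numeric(n_sim)
for (i in 1:n_sim) {
  x <- rnorm(n)
  y <- rnbinom(n, mu = exp(1 + x), size = 10)  # mild overdispersion
  err_pois[i] <- coef(glm(y ~ x, family = "poisson"))["x"] - 1
  err_nb[i]   <- tryCatch(coef(glm.nb(y ~ x))["x"] - 1,
                          error = function(e) NA)
}
# Compare mean squared error of the two point estimates
c(MSE_poisson = mean(err_pois^2),
  MSE_nb      = mean(err_nb^2, na.rm = TRUE))
```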
The Poisson model assumes equal mean and variance. If that doesn't hold, then the Poisson model isn't correct. Quasi-poisson is one possibility when there is overdispersion. Others include:
Negative binomial regression (NBR) - similar to the Poisson model, but using the negative binomial distribution instead, which has a dispersion parameter. Available in the MASS package in R, and also integrated into Stata.
Hurdle regression - for circumstances with more 0s than would be expected under the Poisson/NB model. It combines a logit/probit with a Poisson/NB: the logit/probit estimates y=0 vs. y>0, and a truncated Poisson/NB is used for the cases where y>0. Available in the pscl package in R, and as a separate Stata add-on whose name I cannot remember.
Zero-inflated - zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models are similar to the hurdle model, but assume two mechanisms at work generating the 0s: never-takers, and potential takers who didn't take in this instance. Available in the pscl package in R, and also available in Stata.
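For concreteness, here is a rough sketch of what fitting these alternatives looks like in R, using the MASS and pscl packages. The data are simulated with arbitrary parameters, just enough to make the calls run:

```r
library(MASS)   # glm.nb
library(pscl)   # hurdle, zeroinfl

set.seed(42)
x <- rnorm(200)
y <- rnbinom(200, mu = exp(0.5 + x), size = 1)   # overdispersed counts

fit_pois   <- glm(y ~ x, family = "poisson")
fit_nb     <- glm.nb(y ~ x)
fit_hurdle <- hurdle(y ~ x, dist = "negbin")     # binomial hurdle for y = 0 vs y > 0
fit_zinb   <- zeroinfl(y ~ x, dist = "negbin")   # zero-inflated NB
```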
If your data are over-dispersed, try NBR and compare the log-likelihoods (e.g., via AIC/BIC); you can also get the statistical significance of the dispersion parameter from the NBR fit. From what I can tell, there is no disadvantage to using the NBR model relative to the Poisson model, so when I have overdispersion (in my limited experience, this has always been the case!) I simply use NBR and don't think twice. I could be wrong and would welcome others' thoughts. One of the downsides of quasi-Poisson is that it doesn't allow you to get likelihood-based statistics like AIC/BIC; NBR is fit by MLE, so it does.
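A sketch of that comparison in R, again on simulated data. One caveat: under the null of no overdispersion, the dispersion parameter sits on the boundary of its parameter space, so the naive chi-square p-value for the likelihood-ratio test is conservative:

```r
library(MASS)

set.seed(7)
x <- rnorm(500)
y <- rnbinom(500, mu = exp(1 + 0.5 * x), size = 2)  # overdispersed

fit_pois <- glm(y ~ x, family = "poisson")
fit_nb   <- glm.nb(y ~ x)

AIC(fit_pois, fit_nb)   # NB should come out well ahead here

# Likelihood-ratio test for the dispersion parameter
lr <- 2 * (logLik(fit_nb) - logLik(fit_pois))
pchisq(as.numeric(lr), df = 1, lower.tail = FALSE)
```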
This is a wonderful reference walking through how these models are used; it's an R vignette, but even if you don't use R it should be very useful.
When in doubt, simulate:
See "How to obtain those nasty standard errors from transformed data - and why they should not be used" regarding obtaining standard errors on the original scale. The answers there also warn that confidence intervals calculated from these under an assumption of symmetry may deviate strongly from the real confidence intervals if the transformation is strongly non-linear.
In our example:
If you don't (or can't) exclude the intercept, things get messy and you'd need some kind of error propagation. But then, it would be better to obtain confidence intervals by other means anyway.