ANOVA – Interpreting NaN Values in Statsmodels Results

anova, statsmodels

I am trying to compare two models using statsmodels.stats.anova.anova_lm. The output table I get is:

   df_resid         ssr  df_diff   ss_diff         F  Pr(>F)
0      72.0  113.319956      0.0       NaN       NaN     NaN
1      74.0  115.497953     -2.0 -2.177997  0.697726     NaN

I understand that the 0th row will always contain NaNs, but I don't understand the NaN in the Pr(>F) column of the second row. Is it caused by running out of floating-point precision?

Best Answer

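This looks like it comes from how statsmodels produces the p-value rather than from floating-point precision. The p-value of an F-test is taken from an F distribution, and both of its degrees of freedom must be positive. Here df_diff is -2, so the p-value is undefined and comes back as NaN. The F statistic itself is still finite because the negative ss_diff and the negative df_diff cancel. statsmodels could in principle take the absolute value of the degrees of freedom and sums of squares, but it evidently doesn't. Try switching the order of the models so that the model with fewer parameters (the larger df_resid) comes first. That should make df_diff and ss_diff positive and give a valid p-value. The F statistic may change slightly, since by default the scale estimate appears to be taken from the last model passed.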
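To see where the NaN comes from, here is a rough sketch that redoes the arithmetic by hand using only the numbers from the table in the question. It is not the statsmodels source. It assumes the scale is the residual variance of the last model passed (that assumption reproduces the reported F of 0.697726), and it uses scipy.stats.f.sf for the upper-tail probability. The variable names are made up for illustration.

```python
from scipy import stats

# Values copied from the anova_lm table in the question.
df_resid_big, ssr_big = 72.0, 113.319956      # row 0: model with more parameters
df_resid_small, ssr_small = 74.0, 115.497953  # row 1: model with fewer parameters

# Differences as they appear in row 1. The bigger model was passed first,
# so both differences are negative.
df_diff = df_resid_big - df_resid_small
ss_diff = ssr_big - ssr_small

# Assumption: the scale is the residual variance of the last model passed.
scale = ssr_small / df_resid_small

# The two negative signs cancel, so the F statistic is finite and positive.
f_stat = (ss_diff / df_diff) / scale

# A negative numerator df is outside the F distribution's parameter range,
# so SciPy returns NaN instead of a probability.
p_as_passed = stats.f.sf(f_stat, df_diff, df_resid_small)

# Same statistic with the sign of df_diff flipped: a valid p-value.
p_positive_df = stats.f.sf(f_stat, -df_diff, df_resid_small)

print(f"F = {f_stat:.6f}")
print(f"p with df_diff = {df_diff}: {p_as_passed}")
print(f"p with df_diff = {-df_diff}: {p_positive_df:.3f}")
```

This only flips the sign of df_diff to show that the NaN goes away. If you actually swap the models in the anova_lm call, the scale and the denominator degrees of freedom come from the other model, so expect the F statistic and p-value to differ a little from these.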
