You might check out David Freedman's paper, "A Note on Screening Regression Equations." (ungated)
Using completely uncorrelated data in a simulation, he shows that, if there are many predictors relative to the number of observations, a standard screening procedure will produce a final regression containing more significant predictors than would be expected by chance, along with a highly significant F statistic. The final model suggests it is effective at predicting the outcome, but this success is spurious. He also illustrates these results using asymptotic calculations. Suggested solutions include screening on a sample and assessing the model on the full data set, and using at least an order of magnitude more observations than predictors.
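To make this concrete, here is a minimal R sketch of the kind of simulation Freedman describes (the settings -- 100 observations, 50 pure-noise predictors, screening at p < 0.25 -- are my recollection of his design and are illustrative only):

```r
## All predictors and the outcome are independent noise, so any
## "significance" in the screened model is an artifact of the screening.
set.seed(1)
n <- 100
p <- 50
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

## First pass: fit the full model and keep predictors with p < 0.25
fit_full <- lm(y ~ X)
pvals    <- summary(fit_full)$coefficients[-1, 4]  # drop the intercept row
keep     <- which(pvals < 0.25)

## Second pass: refit using only the screened predictors
fit_screen <- lm(y ~ X[, keep])
summary(fit_screen)  # typically several "significant" t-tests and a
                     # respectable overall F statistic, all spurious
```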
You posed two questions, so I will simply comment on them in order:
Question 1: High degrees of freedom:
Such high degrees of freedom are normal with `pool.compare`. The function implements the procedure by Meng & Rubin (1992), in which the denominator degrees of freedom for the test statistic $D_m$ are derived under the assumption that the complete-data degrees of freedom are infinite (see also Rubin, 1987).
Thus, the procedure estimates the degrees of freedom to be smaller than in the hypothetical complete data (i.e., smaller than infinity), which often results in relatively large denominator degrees of freedom in MI. This can be inappropriate, especially in smaller samples.
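To see the scale of these degrees of freedom concretely, here is a hedged sketch using the `nhanes2` data shipped with `mice` (this assumes mice 2.x, where `pool.compare` exists, and the result component names `Dm`/`df2` as I remember them; inspect the result with `str()` if yours differ):

```r
## Illustrative only: compare two nested models fitted to multiply
## imputed data and look at the denominator df of the pooled Wald test.
library(mice)
imp  <- mice(nhanes2, m = 20, printFlag = FALSE, seed = 1)
fit1 <- with(imp, lm(chl ~ age + bmi + hyp))  # full model
fit0 <- with(imp, lm(chl ~ age))              # restricted model
res  <- pool.compare(fit1, fit0, method = "wald")
res$Dm      # pooled test statistic
res$df2     # denominator df -- often strikingly large
```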
Question 2: Correction formula of Barnard & Rubin:
The correction formula in Barnard & Rubin (1999) addresses the aforementioned problem, but for tests of scalar estimands (e.g., a single regression coefficient), not for multiparameter tests (as performed by `pool.compare`).
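For reference, the adjusted degrees of freedom in Barnard & Rubin (1999), as I read the paper, are

$$\tilde{\nu} = \left(\frac{1}{\nu_m} + \frac{1}{\hat{\nu}_{\mathrm{obs}}}\right)^{-1}, \qquad \hat{\nu}_{\mathrm{obs}} = \frac{\nu_{\mathrm{com}}+1}{\nu_{\mathrm{com}}+3}\,\nu_{\mathrm{com}}\,(1-\hat{\gamma}_m),$$

where $\nu_m$ is Rubin's (1987) classical degrees of freedom, $\nu_{\mathrm{com}}$ the complete-data degrees of freedom, and $\hat{\gamma}_m$ the estimated fraction of missing information, all for a single scalar estimand.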
Therefore, this correction formula is not the way to go here. Luckily, there is also a correction formula available for multiparameter tests. That formula was proposed by Reiter (2007) and was originally developed for the procedure by Li, Raghunathan, and Rubin (1991).
However, these two procedures are asymptotically identical in many cases, and the expression for the degrees of freedom is the same in $D_1$ and $D_3$. Therefore, I would suggest you apply Reiter's correction formula to the results of `pool.compare`. The formula is not much more difficult to apply than that of Barnard & Rubin, and it is also implemented in a couple of R packages.
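If you prefer not to do the arithmetic by hand: to the best of my knowledge, `mice` >= 3.0 replaces `pool.compare` with `D1()`, a wrapper around `mitml::testModels` that applies Reiter's correction when you pass the complete-data degrees of freedom (argument names per my reading of the docs; please verify against your versions). A hedged sketch:

```r
## Hedged sketch (mice >= 3.0): D1() compares two nested sets of pooled
## models and, when dfcom is supplied, uses Reiter's (2007) small-sample
## denominator df rather than the asymptotic value.
library(mice)
imp  <- mice(nhanes2, m = 20, printFlag = FALSE, seed = 1)
fit1 <- with(imp, lm(chl ~ age + bmi + hyp))
fit0 <- with(imp, lm(chl ~ age))
## nhanes2 has n = 25; the full model uses 5 parameters, so dfcom = 20
D1(fit1, fit0, dfcom = 20)
```

If you are on an older `mice`, `mitml::testModels(..., method = "D1", df.com = ...)` exposes the same correction directly.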
You can find some very readable applications of Reiter's correction formula in the article by van Ginkel and Kronenberg (2014), who apply the procedure of Li et al. (1991) with Reiter's correction to ANOVA (recall that Meng & Rubin, 1992, and Li et al., 1991, can be regarded as interchangeable in this case).
Edit:
However, it is quite possible that you will not see a big difference: the outcome of your hypothesis test will likely remain the same.
Do you disagree with @FrankHarrel's answer that parsimony comes with some ugly scientific trade-offs anyway?
I love the link provided in @MikeWiezbicki's comment to Doug Bates' rationale. If someone disagrees with your analysis, they can do it their way, and this is a fun way to start a scientific discussion about your base assumptions. A p-value does not make your conclusion an "absolute truth".
If the decision of whether or not to include a parameter in your model comes down to splitting hairs over what are, for scientifically meaningful samples, relatively small discrepancies in the df -- and you are not dealing with $n<p$ problems that justify more nuanced inference anyway -- then you have a parameter so close to meeting your cutoffs that you should be transparent either way: just include it, or analyze the model with and without it, but definitely discuss your decision transparently in the final analysis.