Solved – Large sample size and partial F-test for multiple regression makes adding a variable always significant

f-test, multiple regression

I am developing a two-variable multiple regression model, i.e.
$$ Y = b_0 - b_1 X_1 + b_2 X_2 $$

I am using the following partial F-test formula from http://luna.cas.usf.edu/~mbrannic/files/regression/Reg2IV.html, under the section "Testing Incremental R2". The resulting F-statistic is supposed to tell me whether adding the second variable is significant (more details at that link).

$$ F = \frac{(R_L^2 - R_S^2)/(k_L-k_S)}{(1-R_L^2)/(N-k_L-1)} $$

My first variable has a strong correlation:
regression_coeff_string: b1 = 0.664, b0 = 0.035
R2_val: 0.564

My second variable has a weak correlation:
regression_coeff_string: b1 = -25.026, b0 = 0.469,
R2_val: 0.027

Adding my second variable only marginally improves the R2 value:
regression_coeff_string: b0 = 0.0559, b1 = 0.6633, b2 = -5.2222,
R2_val: 0.565

However, my sample size is 2949. With
$$ R_L^2 = 0.565, \quad R_S^2 = 0.564, $$
$k_L = 2$ the number of predictors in the full set, and
$k_S = 1$ the number of predictors in the subset, I get
$$ F = \frac{(0.565 - 0.564)/(2-1)}{(1-0.565)/(2949-2-1)} = 6.77 $$

With the critical value of F(1, 2946) at the 0.05 significance level being about 3.84, the result is significant. But it seems that this is only because the sample size is large. If I sort the second variable X2 in ascending order in Excel and leave the order of the Y and X1 variables unchanged, I still get a significant F score.
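To make the dependence on sample size concrete, here is a minimal Python sketch of the partial F calculation above. The function names are hypothetical, and the p-value uses a large-sample shortcut (treating F(1, df) as a chi-square with 1 degree of freedom) rather than the exact F distribution, so it is only an approximation for a single added predictor.

```python
import math

def partial_f(r2_full, r2_sub, k_full, k_sub, n):
    """Partial F-statistic for the R^2 gain of the full model over the nested one."""
    df_num = k_full - k_sub          # number of added predictors
    df_den = n - k_full - 1          # residual degrees of freedom of the full model
    f_stat = ((r2_full - r2_sub) / df_num) / ((1.0 - r2_full) / df_den)
    return f_stat, df_num, df_den

def approx_p_value_one_restriction(f_stat):
    """Approximate p-value when exactly one predictor is added.

    Assumption: df_den is large, so F(1, df_den) is close to chi-square(1),
    the square of a standard normal Z, and P(Z^2 > f) = erfc(sqrt(f / 2)).
    """
    return math.erfc(math.sqrt(f_stat / 2.0))

f_stat, df_num, df_den = partial_f(0.565, 0.564, 2, 1, 2949)
p_approx = approx_p_value_one_restriction(f_stat)
print(f"F({df_num}, {df_den}) = {f_stat:.2f}, approximate p = {p_approx:.4f}")

# Same R^2 values, different sample sizes: F grows linearly with N - k_L - 1.
for n in (100, 500, 1000, 2949, 10000):
    f_n, _, _ = partial_f(0.565, 0.564, 2, 1, n)
    print(n, round(f_n, 2))
```

Holding both R2 values fixed, the loop shows that the statistic is proportional to the residual degrees of freedom, which is why any nonzero R2 gain eventually crosses the critical value as N grows.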

Question: How can I do a fair incremental R2 test for the addition of a new variable in multiple regression when the sample size becomes large?

Simply looking at the R2 of each X variable individually does not take into account that they may be cross-correlated. That is why I turned to the incremental R2 test, to see how the overall R2 improves when a new variable is added.

EDIT1:

The context of my example is predicting solar radiation. The first variable is a solar radiation variable from NWP (numerical weather prediction) software (i.e. high correlation). The other variables are other NWP output variables, and we are trying to improve our prediction.

Best Answer

The test you are doing is "fair"; it's just that p-values don't answer the question you want to ask (they often don't). The way to proceed is to figure out what change in effect size is substantively meaningful and base decisions on that.

This is entirely dependent on your field and, indeed, on your question. To illustrate: If 1 in 1000 children misunderstand a question on a test, that is a very small proportion, and won't affect the validity of the test much. But if 1 in 1000 airplane trips end in a crash, that is a very large proportion and would end aviation.

Is there any context in which a change of $R^2$ from 0.564 to 0.565 is important? I can't think of one, offhand, but I haven't had all my coffee :-). Perhaps some variation on the plane crash scenario.
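One way to put numbers on "effect size" here, as a sketch rather than a prescription: Cohen's f² for an added set of predictors, and the relative reduction in residual standard deviation implied by the two R2 values. Both function names are hypothetical, and the choice of these two measures is an assumption about what might be meaningful in a forecasting context.

```python
import math

def cohens_f2(r2_full, r2_sub):
    """Cohen's f^2 for the predictors added to the nested model."""
    return (r2_full - r2_sub) / (1.0 - r2_full)

def relative_rmse_reduction(r2_full, r2_sub):
    """Fractional drop in residual standard deviation from the smaller to the
    full model (ignores degrees-of-freedom corrections)."""
    return 1.0 - math.sqrt((1.0 - r2_full) / (1.0 - r2_sub))

f2 = cohens_f2(0.565, 0.564)
rmse_gain = relative_rmse_reduction(0.565, 0.564)
print(f"Cohen's f^2 = {f2:.4f}")
print(f"Residual SD shrinks by about {100 * rmse_gain:.2f}%")
```

For reference, Cohen's conventional benchmark for a "small" f² is 0.02. The partial F-statistic is just f² multiplied by (N - k_L - 1)/(k_L - k_S), so the test statistic mixes a fixed effect size with a factor that grows with N.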

Related Solutions

Solved – Sample size formula for an F-test

I am wondering if there is a sample size formula like Lehr's formula that applies to an F-test?

The webpage "Power Tools for Epidemiologists" explains:

Difference Between Two Means (Lehr):

Say, for example, you want to demonstrate a 10 point difference in IQ between two groups, one of which is exposed to a potential toxin, the other of which is not. Using a mean population IQ of 100, and a standard deviation of 20:

$$n_{group}=\frac{16}{\left(\frac{100-90}{20}\right)^2}$$

$$n_{group}=\frac{16}{(.5)^2}=64$$
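The IQ example can be reproduced with a few lines of Python. This is only an illustrative sketch with a hypothetical function name; the constant 16 in Lehr's rule corresponds roughly to 80% power at a two-sided 5% significance level.

```python
def lehr_n_per_group(mean_0, mean_1, sd):
    """Lehr's rule of thumb: n per group = 16 / delta^2,
    where delta is the difference in means in standard-deviation units."""
    delta = (mean_0 - mean_1) / sd
    return 16.0 / delta ** 2

n_group = lehr_n_per_group(100, 90, 20)
print(n_group)  # 64.0
```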

Percentage Change in Means

Clinical researchers may be more comfortable thinking in terms of percentage changes rather than differences in means and variability. For example, someone might be interested in a 20% difference between two groups in data with about 30% variability. Professor van Belle presents a neat approach to these kinds of numbers that uses the coefficient of variation (c.v.) and translates the percentage change into a ratio of means.

Variance on the log scale (see chapter 5 in van Belle) is approximately equal to the squared coefficient of variation on the original scale, so Lehr's formula can be translated into a version that uses c.v.

$$n_{group}=\frac{16(c.v.)^2}{(\ln(\mu_0)-\ln(\mu_1))^2}$$

We can then express the ratio of means, $r.m.=\mu_1/\mu_0$, through the percentage change $p.c.$, where

$$p.c.=\frac{\mu_0-\mu_1}{\mu_0}=1-\frac{\mu_1}{\mu_0}=1-r.m.$$

to formulate a rule of thumb:

$$n_{group}=\frac{16 (c.v.)^2}{(\ln(r.m.))^2}$$

In the example above, a 20% change translates to a ratio of means of 1−.20=.80. (A 5% change would result in a ratio of means of 1−.05=.95; a 35% change 1−.35=.65, and so on.) So, the sample size for a study seeking to demonstrate a 20% change in means with data that varies about 30% around the means would be

$$n_{group}=\frac{16(.3)^2}{(\ln(.8))^2}=29$$

An R function based on this rule would be:

    # Sample size per group for a proportional change `pc` in means,
    # given a coefficient of variation `cv` (vectorized over `cv`).
    nPC <- function(cv, pc) {
      16 * cv^2 / (log(1 - pc))^2
    }

Say you were interested in a 15% change from one group to another, but were uncertain about how the data varied. You could look at a range of values for the coefficient of variation:

    a <- c(.05, .10, .15, .20, .30, .40, .50, .75, 1)
    nPC(a, .15)

You could use this to graphically display your results:

    plot(a, nPC(a, .15), ylab = "Number in Each Group",
         xlab = "By Varying Coefficient of Variation",
         main = "Sample Size Estimate for a 15% Difference")

See also: iSixSigma "How to Determine Sample Size" and RaoSoft "Online Sample Size Calculator".

Solved – Does the F-test for multivariable regression work with non-normal residuals but large sample size

I am not entirely sure this answers your question, but here we go. Asymptotically, $q\times F$, with $F$ an F-distributed r.v. with $q$ "numerator degrees of freedom" (corresponding to the number of restrictions tested in the F-test), converges in distribution to a $\chi^2_q$ r.v. as the denominator degrees of freedom (a function of $n$, the sample size) tend to infinity.

Rearranging this result gives that $F$ itself converges to a $\chi^2_q/q$-distributed r.v., or a $F_{q,\infty}$ r.v. This r.v. can be well approximated by a $F_{q,n}$ r.v. for $n$ "sufficiently large" (as with most asymptotic approximations, there is no general answer to what "sufficiently large" precisely is), just as a $t$-distributed r.v. can be well approximated by a standard normal when degrees of freedom of $t$ are sufficiently large.

The figure below shows that even for 25 denominator degrees of freedom, the approximation is already quite close, for $q=3$.
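The same comparison can be explored numerically. The following is a rough Monte Carlo sketch using only the Python standard library; the function names, the number of draws, the seed, and the choice of the 95% quantile as the quantity to compare are all assumptions made for illustration.

```python
import random

def chi2_draw(rng, df):
    """One chi-square(df) draw, using chi-square(df) = Gamma(shape=df/2, scale=2)."""
    return rng.gammavariate(df / 2.0, 2.0)

def f_quantile_mc(q, n, prob=0.95, draws=100_000, seed=0):
    """Simulated quantile of F(q, n), built as (chi2_q / q) / (chi2_n / n)."""
    rng = random.Random(seed)
    sims = sorted(
        (chi2_draw(rng, q) / q) / (chi2_draw(rng, n) / n) for _ in range(draws)
    )
    return sims[int(prob * draws)]

def chi2_over_q_quantile_mc(q, prob=0.95, draws=100_000, seed=0):
    """Simulated quantile of the limiting chi2_q / q distribution."""
    rng = random.Random(seed)
    sims = sorted(chi2_draw(rng, q) / q for _ in range(draws))
    return sims[int(prob * draws)]

limit_q95 = chi2_over_q_quantile_mc(3)
print("chi2_3 / 3:", round(limit_q95, 3))
for n in (25, 100, 1000, 10000):
    print("F(3, %d):" % n, round(f_quantile_mc(3, n), 3))
```

As the denominator degrees of freedom grow, the simulated quantile of $F_{3,n}$ should approach that of $\chi^2_3/3$, up to Monte Carlo noise.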

EDIT:

To (hopefully) address Glenb's valid comment: why would $F$ be approximately $\chi^2_q/q$? We may write $F$ as $$ F=\frac{(R'\hat{\beta}-r)'\left\{R'(X' X)^{-1}R\right\}^{-1}(R'\hat{\beta}-r)/q}{\hat{\sigma}^{2}}, $$ where $R$ and $r$ specify the null $H_0:R'\beta=r$ to be tested.

Let us take the leading case of $R=I$, so testing restrictions on the coefficients as opposed to, say, testing restrictions on the sum of coefficients. If the null is true, $r=\beta$, so that we may write $$ qF=\frac{(\hat{\beta}-\beta)'\left\{(X' X)^{-1}\right\}^{-1}(\hat{\beta}-\beta)}{\hat{\sigma}^{2}}=\sqrt{n}(\hat{\beta}-\beta)'\left\{\hat{\sigma}^{2}(X' X/n)^{-1}\right\}^{-1}\sqrt{n}(\hat{\beta}-\beta) $$ Now, we know that $\sqrt{n}(\hat{\beta}-\beta)$ is asymptotically normal with an asymptotic variance that may be estimated consistently by $\hat{\sigma}^{2}(X' X/n)^{-1}$. So, the middle term $\bigl\{\hat{\sigma}^{2}(X' X/n)^{-1}\bigr\}^{-1}$, loosely speaking, "standardizes" both asymptotic multivariate normal random vectors in the quadratic form by their "standard deviation", so that we (again asymptotically) obtain standard normal vectors.

In large but finite samples, this vector will only approximately be standard normal. Hence, the resulting sum of squares of the standard normal independent entries of the vector will also only approximately be $\chi^2_q$.

But for $n$ reasonably large, the approximation will be pretty good.
