Solved – $F$-test for hypothesis $\beta_1+\beta_2=2\beta_3$ in a regression

f-test, hypothesis testing, least squares, linear, regression

In a regression $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon$, how do I use an $F$-test to test the hypothesis $\beta_1+\beta_2=2\beta_3$? The standard $F$-test would test the joint hypothesis $H_0: \beta_1 = \beta_2 = \beta_3 = 0$, not a single linear restriction like this one.

Best Answer

@Glen_b already provided a link to the discussion containing the theoretical aspects.

Here is a quick practical example of how one would do it in R. Please also have a look at these documents, which contain the theory as well as examples: Simultaneous Inference in General Parametric Models and Additional multcomp Examples.

We will use the mtcars dataset and build a linear regression model containing three variables: cyl (Number of cylinders), disp (Displacement) and hp (Horsepower) to predict the variable mpg (Miles/Gallon).

Then, we test the following hypothesis: $\beta_{\mathrm{cyl}}+\beta_{\mathrm{disp}}-2\cdot\beta_{\mathrm{hp}} = 0$.

Using the multcomp package, there are two ways of specifying the hypothesis:

  1. As a contrast matrix
  2. As a symbolic description

I included both versions in the code below. In our example, the matrix is simply a row vector: $\mathbf{K} = (0, 1, 1, -2)$. The leading zero is necessary because our regression model includes an intercept, whose coefficient does not enter the hypothesis.
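Written out, this is the general linear hypothesis $H_0\colon \mathbf{K}\boldsymbol{\beta} = 0$. With the coefficient vector ordered as $(\beta_0, \beta_{\mathrm{cyl}}, \beta_{\mathrm{disp}}, \beta_{\mathrm{hp}})^\top$, the row vector picks out exactly the linear combination we want to test:

$$\mathbf{K}\boldsymbol{\beta} = 0\cdot\beta_0 + 1\cdot\beta_{\mathrm{cyl}} + 1\cdot\beta_{\mathrm{disp}} - 2\cdot\beta_{\mathrm{hp}} = \beta_{\mathrm{cyl}} + \beta_{\mathrm{disp}} - 2\beta_{\mathrm{hp}}.$$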

"By symbolic description" means that you can state your hypothesis directly as a character string, in this case "cyl + disp - 2*hp = 0".

In this example, the estimate of our linear combination is $-1.2169$, with little evidence that it differs from $0$ ($p = 0.141$). The function confint generates a confidence interval for the estimate: $(-2.86,\ 0.43)$.
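As a sanity check, the reported estimate is simply the corresponding linear combination of the fitted coefficients from the model summary below:

$$\hat\beta_{\mathrm{cyl}} + \hat\beta_{\mathrm{disp}} - 2\hat\beta_{\mathrm{hp}} = -1.22742 - 0.01884 - 2\cdot(-0.01468) \approx -1.2169.$$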

#---------------------------------------------------------------------------------------
# Load "multcomp" package
#---------------------------------------------------------------------------------------

require(multcomp)

#---------------------------------------------------------------------------------------
# Load "mtcars" dataset
#---------------------------------------------------------------------------------------

data(mtcars)

#---------------------------------------------------------------------------------------
# Build linear regression model with three variables
#---------------------------------------------------------------------------------------

lm.mod <- lm(mpg~cyl+disp+hp, data = mtcars)

summary(lm.mod)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.18492    2.59078  13.195 1.54e-13 ***
cyl         -1.22742    0.79728  -1.540   0.1349    
disp        -0.01884    0.01040  -1.811   0.0809 .  
hp          -0.01468    0.01465  -1.002   0.3250 

#---------------------------------------------------------------------------------------
# Define the general hypothesis
#---------------------------------------------------------------------------------------

K <- c("cyl + disp - 2*hp = 0") # As a symbolic description (character string)

# K <- rbind(c(0, 1, 1, -2)) # As a contrast matrix
# rownames(K) <- c("cyl + disp - 2hp")
# colnames(K) <- names(coef(lm.mod))

#---------------------------------------------------------------------------------------
# Evaluate the general hypothesis and calculate confidence intervals
#---------------------------------------------------------------------------------------

glht.mod <- glht(lm.mod, linfct = K)

summary(glht.mod)

     Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = mpg ~ cyl + disp + hp, data = mtcars)

Linear Hypotheses:
                         Estimate Std. Error t value Pr(>|t|)
cyl + disp - 2 * hp == 0  -1.2169     0.8036  -1.514    0.141
(Adjusted p values reported -- single-step method)

confint(glht.mod)

     Simultaneous Confidence Intervals

Fit: lm(formula = mpg ~ cyl + disp + hp, data = mtcars)

Quantile = 2.0484
95% family-wise confidence level

Linear Hypotheses:
                         Estimate lwr     upr    
cyl + disp - 2 * hp == 0 -1.2169  -2.8631  0.4293
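Since the question asks specifically for an $F$-test: with a single linear restriction, the $F$-statistic is just the square of the $t$-statistic reported above ($(-1.514)^2 \approx 2.29$), with the same $p$-value. As a sketch (assuming the car package is installed), you can obtain the $F$-test directly with linearHypothesis, or equivalently by substituting the constraint $\beta_{\mathrm{cyl}} = 2\beta_{\mathrm{hp}} - \beta_{\mathrm{disp}}$ into the model and comparing the restricted and full fits with anova:

```r
# F-test for the single linear restriction via car::linearHypothesis
require(car)

lm.mod <- lm(mpg ~ cyl + disp + hp, data = mtcars)
linearHypothesis(lm.mod, "cyl + disp - 2*hp = 0")

# Equivalently: under H0, beta_cyl = 2*beta_hp - beta_disp, so the
# restricted model only contains the combined regressors
lm.restr <- lm(mpg ~ I(disp - cyl) + I(hp + 2 * cyl), data = mtcars)
anova(lm.restr, lm.mod) # F on 1 numerator df; equals the squared t from glht
```

Both routes give the same $F$-statistic and $p$-value as the glht $t$-test, since $t^2 = F$ when the hypothesis has one degree of freedom.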