The difference between the coefficients comes from which variable is treated as the response and which as the predictor; the roles of $x$ and $y$ are reversed in one case relative to the other.
Note that

- in your R case the coefficient relates to 'suva',
- while in your Excel case the coefficient relates to 'heather'.

The following code shows how R can produce both cases:
```
> lm(suva ~ heather, data = as.data.frame(data))

Call:
lm(formula = suva ~ heather, data = as.data.frame(data))

Coefficients:
(Intercept)      heather
      14.65       -13.60

> lm(heather ~ suva, data = as.data.frame(data))

Call:
lm(formula = heather ~ suva, data = as.data.frame(data))

Coefficients:
(Intercept)         suva
    0.32524     -0.01276
```
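As a quick check, the product of the two slopes equals the squared correlation, $\hat\beta_{y \sim x} \, \hat\beta_{x \sim y} = r_{xy}^2$, and indeed $(-13.60)\times(-0.01276)\approx 0.174$. A minimal sketch (it runs once the `data` matrix defined below exists):

```r
# The product of the two slopes is the squared correlation
df <- as.data.frame(data)
b_yx <- coef(lm(suva ~ heather, data = df))["heather"]
b_xy <- coef(lm(heather ~ suva, data = df))["suva"]
unname(b_yx * b_xy)         # ~ 0.174
cor(df$suva, df$heather)^2  # the same value
```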
The rest of the code:

```r
data <- c(
12.880545, 0.061156645, 0.15 , 0.525, 0,
7.098873327, 0.026878039, 0.2275, 0 ,0,
8.660688381, 0.04037841 , 0.425 , 0.25 , 0,
7.734546932, 0.021618446, 0.225 , 0.3875, 0,
16.70696048, 0.103626684, 0.15 , 0.075, 0,
9.763315183, 0.013387158, 0.25 , 0.075, 0,
12.91735434, 0.008076468, 0.22 , 0.22 , 0,
19.94153851, 0.150798057, 0.0375, 0.35 , 0.225,
17.25115559, 0.052229596, 0.0625, 0.2625, 0.225,
15.38596941, 0.05429447 , 0.1125, 0.45 , 0.025,
15.53714185, 0.05933884 , 0.1625, 0.525, 0.0625,
14.11551229, 0.064579437, 0.1875, 0.35 , 0.1375,
14.88575569, 0.0189853 , 0.3375, 0.3, 0,
12.32229733, 0.043085602, 0.0875, 0.1375, 0,
17.23861185, 0.071705699, 0.15 , 0.1375, 0,
11.50832463, 0.1125 , 0.0875, 0.075, 0,
14.4810484, 0.078476821, 0.0375, 0.125, 0.0625,
9.110262652, 0.077306938, 0.145 , 0.35 , 0.0125,
10.8571733, 0.02681341 , 0.0375, 0.525, 0,
9.589339421, 0.01892435 , 0.2275, 0 , 0,
7.260373588, 0.014538237, 0.425 , 0.25 , 0,
11.11099161, 0.022802578, 0.225 , 0.3875 , 0,
10.81488848, 0.047587818, 0.15 , 0.075 , 0,
8.224131957, 0.031126904, 0.25 , 0.075 , 0,
8.818607863, 0.002855409, 0.22 , 0.22 , 0,
11.53999863, 0.031465613, 0.0375, 0.35 , 0.225,
14.92784964, 0.069998663, 0.0625, 0.2625 , 0.225,
9.666480932, 0.02387741 , 0.1125, 0.45 , 0.025,
12.51000758, 0.016960259, 0.1625, 0.525 , 0.0625,
13.32611463, 0.033670382, 0.1875, 0.35 , 0.1375,
16.76535191, 0.029613698, 0.3375, 0.3 ,0,
11.24615281, 0.008440059, 0.0875, 0.1375, 0,
10.60564875, 0.003930792, 0.15 , 0.1375, 0,
11.82909125, 0.036017582, 0.1125, 0.0875 , 0.075,
18.2337185, 0.143451512, 0.0375, 0.125 , 0.0625,
10.6226222, 0.020561242, 0.145 , 0.35 , 0.0125
)
data <- matrix(data, nrow = 36, ncol = 5, byrow = TRUE)
colnames(data) <- c("suva", "Std dev", "heather", "sedge", "sphagnum")
```
Why, then, is $R^2$ still the same?
There is a certain symmetry in the situation. In simple linear regression, the slope coefficient is the correlation coefficient scaled by the ratio of the standard deviations of the $y$ and $x$ data:
$$\hat\beta_{y \sim x} = r_{xy} \frac{s_y}{s_x}$$
The standard deviation of the model (fitted) values is then:
$$s_{mod} = \hat\beta_{y \sim x} s_x = r_{xy} s_y$$
and the ratio of the model variance to the variance of the data is:
$$R^2 = \left( \frac{s_{mod}}{s_y} \right)^2= r_{xy}^2$$
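A quick empirical check of this identity with the data above (a minimal sketch; all three lines should print the same value):

```r
# Both regressions report the same R^2, equal to the squared correlation
df <- as.data.frame(data)
summary(lm(suva ~ heather, data = df))$r.squared
summary(lm(heather ~ suva, data = df))$r.squared
cor(df$suva, df$heather)^2
```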
Best Answer
Polynomial regression is in effect multiple linear regression: consider $X_1=X$ and $X_2=X^2$; then $E(Y) = \beta_1 X + \beta_2 X^2$ is the same as $E(Y) = \beta_1 X_1 + \beta_2 X_2$.
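For example, a quadratic fit in R is just `lm` with the two constructed predictors (a minimal sketch; `x` and `y` here are made-up illustrative data, not from the question):

```r
# Quadratic regression as multiple regression on x and x^2
set.seed(1)  # made-up illustrative data
x <- runif(30, 0, 10)
y <- 2 + 0.5 * x - 0.1 * x^2 + rnorm(30)
fit <- lm(y ~ x + I(x^2))  # equivalent to lm(y ~ poly(x, 2, raw = TRUE))
```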
As such, methods for constructing confidence intervals for parameters (and for the conditional mean) in multiple regression carry over directly to the polynomial case. Most regression packages will compute these for you. Yes, it can be done using the formula you suggest, provided the assumptions needed for the t-interval hold and the right d.f. are used for the $t$ (the residual d.f., which in R are available from the summary output).
The R function `confint` can be used to construct confidence intervals for the parameters of a regression model; see `?confint`.
In the case of a confidence interval for the conditional mean, let $X$ be the matrix of predictors, whether for polynomial regression or any other multiple regression model, and let $x_i$ be its $i$-th row (including a leading $1$ if the model has an intercept). Let the estimated variance of the mean at $x_i$ be $v_i=\hat{\sigma}^2 x_i(X'X)^{-1}x_i'$, let $s_i=\sqrt{v_i}$ be the corresponding standard error, and let $t$ be the upper $\alpha/2$ critical value of the $t$ distribution with $n-p-1$ d.f. Then the pointwise confidence interval for the mean at $x_i$ is $\hat{y}_i\pm t\cdot s_i$.
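A sketch of that computation, continuing the `fit` example above (the expression for `v` computes $x_i(X'X)^{-1}x_i'$ row by row):

```r
# Pointwise 95% CI for the conditional mean, from the formula above
X      <- model.matrix(fit)                 # rows x_i, including the intercept column
sigma2 <- summary(fit)$sigma^2              # estimated residual variance
v      <- sigma2 * rowSums((X %*% solve(crossprod(X))) * X)  # v_i
s      <- sqrt(v)                           # standard errors s_i
tcrit  <- qt(0.975, df = df.residual(fit))  # upper alpha/2 critical value
cbind(lower = fitted(fit) - tcrit * s,
      upper = fitted(fit) + tcrit * s)
```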
Also, the R function `predict` can be used to construct CIs for $E(Y \mid X)$; see `?predict.lm`.
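These should agree with the manual computation above:

```r
# The same intervals via predict()
predict(fit, interval = "confidence", level = 0.95)
```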
[At least when doing polynomial regression with an intercept, it makes sense to use orthogonal polynomials; but if the spread of $X$ is large compared to its mean, and the degree is low (such as quadratic), it won't be so critical. I tend to do so anyway, because it makes the linear and quadratic terms easier to interpret.]
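A minimal sketch of the orthogonal version, continuing the same example (the fitted values, and hence the CIs for the mean, match the raw fit):

```r
# Orthogonal polynomials: poly() without raw = TRUE
fit_orth <- lm(y ~ poly(x, 2))
all.equal(fitted(fit_orth), fitted(fit))  # TRUE
```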