Solved – Manually calculate the parameters (Std. Error) of lm output in R

r, regression

I'm trying to expand my understanding of linear regression, and to that end I'm working through a linear regression exercise by hand.

Using some dummy data:

x <- c(17,13,12,15,16,14,16,16,18,19)
y <- c(94,73,59,80,93,85,66,79,77,91)
model.test <- lm(y ~ x)
summary(model.test)

The output gives me:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   30.104     23.824   1.264    0.242  
x              3.179      1.514   2.100    0.069

Residual standard error: 9.859 on 8 degrees of freedom
Multiple R-squared:  0.3553,    Adjusted R-squared:  0.2747 

F-statistic: 4.409 on 1 and 8 DF, p-value: 0.06895

I can:

  • manually calculate the estimates of the intercept and x
  • calculate the t value (i.e. estimate / Std. Error)
  • perform the hypothesis test to obtain a p-value for the Pr(>|t|) column
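
A minimal sketch of those three steps in R, for reference. The variable names (sxx, sxy, a, b, t_val, p_val) are my own, and the standard errors are simply read from summary() here, since computing them by hand is the open question:

```r
x <- c(17,13,12,15,16,14,16,16,18,19)
y <- c(94,73,59,80,93,85,66,79,77,91)
n <- length(x)

# Least-squares estimates from the centred sums
sxx <- sum((x - mean(x))^2)
sxy <- sum((x - mean(x)) * (y - mean(y)))
b <- sxy / sxx                  # slope
a <- mean(y) - b * mean(x)      # intercept

# Standard errors taken from lm() for now (assumption: not yet computed by hand)
se <- coef(summary(lm(y ~ x)))[, "Std. Error"]

# t value = estimate / Std. Error; two-sided p-value on n - 2 degrees of freedom
t_val <- c(a, b) / se
p_val <- 2 * pt(abs(t_val), df = n - 2, lower.tail = FALSE)
```

The values of a, b, t_val and p_val should line up with the Estimate, t value and Pr(>|t|) columns of the summary output above.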

My question is: how can I calculate the two values in the Std. Error column?

For good measure I had a look at the lm.R code and I can see:

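rdf <- df[2L]                   # residual degrees of freedom
rss <- sum(w * r^2)             # (weighted) residual sum of squares
resvar <- rss/rdf               # residual variance
se <- sqrt(diag(R) * resvar)    # standard errors of the coefficients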

It looks like the formula is contained within se; however, I can't quite figure out what is happening in the terms resvar (residual variance?) and the square root involving a matrix R, from what I can gather.

Thanks in advance
Jonathan

PS: I found a related post, but it does not contain an answer: "how to manually calculate SE of coeficient from regress data outputs"

PPS: Manual workings below

observation            X            Y   (x-x_mean)    (y-y_mean)  (x-x_mean)*(y-y_mean)  (x-x_mean)^2  (y-y_mean)^2        y_hat     y_hat - y  (y_hat - y)^2
1                     17           94          1.4          14.3                  20.02          1.96        204.49   84.0024336  -9.997566399     99.9513339
2                     13           73         -2.6          -6.7                  17.42          6.76         44.89  71.32039595  -1.679604051    2.821069768
3                     12           59         -3.6         -20.7                  74.52         12.96        428.49  68.14988654   9.149886536    83.72042362
4                     15           80         -0.6           0.3                  -0.18          0.36          0.09  77.66141478  -2.338585225    5.468980855
5                     16           93          0.4          13.3                   5.32          0.16        176.89  80.83192419  -12.16807581     148.062069
6                     14           85         -1.6           5.3                  -8.48          2.56         28.09  74.49090536  -10.50909464    110.4410701
7                     16           66          0.4         -13.7                  -5.48          0.16        187.69  80.83192419   14.83192419    219.9859751
8                     16           79          0.4          -0.7                  -0.28          0.16          0.49  80.83192419   1.831924188    3.355946231
9                     18           77          2.4          -2.7                  -6.48          5.76          7.29  87.17294301   10.17294301    103.4887696
10                    19           91          3.4          11.3                  38.42         11.56        127.69  90.34345243  -0.656547573    0.431054716

Total                156          797  3.55271E-15  -2.84217E-14                  134.8          42.4        1206.1  795.6372042  -1.362795772    777.7266929
Mean                15.6         79.7  3.55271E-16  -2.84217E-15                  13.48          4.24        120.61  79.56372042  -0.136279577    77.77266929
Std Dev      2.170509413  11.57631682  2.170509413   11.57631682            26.01030565   4.845341405   135.6597115  6.881620524   9.294807222    74.37605266
Variance     4.711111111  134.0111111  4.711111111   134.0111111                676.536   23.47733333   18403.55733  47.35670104   86.39344129    5531.797209
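
The column totals in these workings can be cross-checked with a few lines of R. This is only a sketch: the names dx, dy, sxx, sxy, syy, y_hat, res and rss are my own, and the fitted values are recomputed from unrounded least-squares coefficients rather than copied from the table:

```r
x <- c(17,13,12,15,16,14,16,16,18,19)
y <- c(94,73,59,80,93,85,66,79,77,91)

dx <- x - mean(x)               # (x - x_mean) column
dy <- y - mean(y)               # (y - y_mean) column

sxy <- sum(dx * dy)             # matches the table total 134.8
sxx <- sum(dx^2)                # matches the table total 42.4
syy <- sum(dy^2)                # matches the table total 1206.1

b <- sxy / sxx                  # slope
a <- mean(y) - b * mean(x)      # intercept

y_hat <- a + b * x              # fitted values
res <- y_hat - y                # same sign convention as the table
rss <- sum(res^2)               # residual sum of squares
```

With unrounded coefficients the residuals sum to zero (up to floating-point error) and rss comes out near 777.54, slightly below the 777.73 in the table, whose y_hat column totals 795.64 rather than 797. That suggests the table's fitted values were computed from slightly different coefficients.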

Best Answer

After some additional digging, I found that the standard error of the slope b can be obtained using the following formula [1]:

$$\mathrm{SE}(b) = s_b = \frac{\sqrt{\sum_i (y_i - \hat{y}_i)^2 / (n - 2)}}{\sqrt{\sum_i (x_i - \bar{x})^2}}$$
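
As a rough sketch, the slope formula can be evaluated in R on the data from the question. The names rss, s_e and se_b are my own, and the residuals are taken from lm() for brevity rather than recomputed by hand:

```r
x <- c(17,13,12,15,16,14,16,16,18,19)
y <- c(94,73,59,80,93,85,66,79,77,91)
n <- length(x)

fit <- lm(y ~ x)

# Numerator: residual standard error, sqrt(RSS / (n - 2))
rss <- sum(residuals(fit)^2)
s_e <- sqrt(rss / (n - 2))

# Denominator: square root of the sum of squared deviations of x from its mean
se_b <- s_e / sqrt(sum((x - mean(x))^2))
```

Here s_e should correspond to the "Residual standard error" line of the summary output, and se_b to the Std. Error entry for x.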

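The standard error of the intercept a can be found using the following formula, where $s_e = \sqrt{\sum_i (y_i - \hat{y}_i)^2 / (n - 2)}$ is the residual standard error:

$$\mathrm{SE}(a) = s_a = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_i (x_i - \bar{x})^2}}$$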
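
A similar sketch for the intercept, this time computing the coefficients and residuals by hand instead of calling lm(). The names sxx, a, b, s_e and se_a are my own:

```r
x <- c(17,13,12,15,16,14,16,16,18,19)
y <- c(94,73,59,80,93,85,66,79,77,91)
n <- length(x)

# Least-squares slope and intercept
sxx <- sum((x - mean(x))^2)
b <- sum((x - mean(x)) * (y - mean(y))) / sxx
a <- mean(y) - b * mean(x)

# Residual standard error on n - 2 degrees of freedom
s_e <- sqrt(sum((y - (a + b * x))^2) / (n - 2))

# Standard error of the intercept
se_a <- s_e * sqrt(1 / n + mean(x)^2 / sxx)
```

Keeping a and b unrounded matters here, because mean(x)^2 / sxx is fairly large for this data set and magnifies any small error in s_e.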

The standard error for b computed manually is 1.514, which matches the R regression output. The standard error for a computed manually is 23.827, which differs from R's 23.824 by 0.003; I can only put this down to a rounding error on my part.
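
To connect this back to the lm.R excerpt in the question, here is a sketch of the same calculation in matrix form. It assumes, as far as I can tell from the source, that R in the excerpt is the inverse of X'X (which lm obtains from its QR decomposition) and that the weights w are all 1 for an unweighted fit. The names X, XtX_inv and beta are my own:

```r
x <- c(17,13,12,15,16,14,16,16,18,19)
y <- c(94,73,59,80,93,85,66,79,77,91)

X <- cbind(1, x)                    # design matrix: intercept column plus x
XtX_inv <- solve(t(X) %*% X)        # assumed to play the role of R in the excerpt

beta <- XtX_inv %*% t(X) %*% y      # coefficient estimates
r <- y - X %*% beta                 # residuals

rdf <- length(y) - ncol(X)          # residual degrees of freedom (n - 2 here)
rss <- sum(r^2)                     # residual sum of squares (w = 1)
resvar <- rss / rdf                 # residual variance

se <- sqrt(diag(XtX_inv) * resvar)  # standard errors: intercept first, then x
```

Computed this way, both standard errors agree with the R output to the printed precision, so the 0.003 gap for the intercept most likely comes from rounded intermediate values rather than from the formula itself.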

References:

[1] http://stattrek.com/regression/slope-confidence-interval.aspx?Tutorial=AP

[2] http://courses.ncssm.edu/math/Talks/PDFS/Standard%20Errors%20for%20Regression%20Equations.pdf
