- Root mean square error
- residual sum of squares
- residual standard error
- mean squared error
- test error
I thought I understood these terms, but the more statistics problems I do, the more I confuse myself and second-guess my understanding. I would like some reassurance and a concrete example.
I can find the equations easily enough online, but I am having trouble finding an 'explain like I'm 5' explanation of these terms, so that I can crystallize in my head the differences and how one leads to another.
If anyone can take the code below and point out how I would calculate each of these terms, I would appreciate it. R code would be great.
Using this example below:
summary(lm(mpg~hp, data=mtcars))
Show me in R code how to find:
rmse = ____
rss = ____
residual_standard_error = ______ # I know it's in the summary output, but I need to understand it
mean_squared_error = _______
test_error = ________
Bonus points for explaining like I'm 5 the differences and similarities between these. Example:
rmse = squareroot(mse)
Best Answer
As requested, I illustrate using the simple regression from the question, `lm(mpg ~ hp, data = mtcars)`, on the `mtcars` data.
The mean squared error (MSE) is the mean of the squared residuals:
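A minimal sketch in R of the fit and the MSE. The object names `fit` and `mse` are illustrative choices, not anything fixed by R:

```r
# Fit the regression from the question (mtcars ships with base R)
fit <- lm(mpg ~ hp, data = mtcars)

# MSE: average of the squared residuals (divides by n, not by degrees of freedom)
mse <- mean(residuals(fit)^2)
mse
```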
Root mean squared error (RMSE) is then the square root of MSE:
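As a sketch (the model is refit so the block runs on its own; `rmse` is an illustrative name):

```r
fit <- lm(mpg ~ hp, data = mtcars)   # refit so this block is self-contained
mse <- mean(residuals(fit)^2)        # mean squared error

# RMSE: square root of MSE, so it is back on the scale of mpg
rmse <- sqrt(mse)
rmse
```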
Residual sum of squares (RSS) is the sum of the squared residuals:
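A sketch of that sum, again refitting so the block stands alone. For an unweighted `lm` fit, `deviance(fit)` should return the same number:

```r
fit <- lm(mpg ~ hp, data = mtcars)   # refit so this block is self-contained

# RSS: sum (not mean) of the squared residuals
rss <- sum(residuals(fit)^2)
rss

# Cross-check: for an lm fit, the deviance is the residual sum of squares
deviance(fit)
```

Note that MSE is just `rss / nrow(mtcars)`, which is how the two quantities connect.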
Residual standard error (RSE) is the square root of (RSS / degrees of freedom):
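A sketch, assuming the residual degrees of freedom are the number of observations minus the number of estimated coefficients (here 32 - 2 = 30), which is what `df.residual()` reports for this fit:

```r
fit <- lm(mpg ~ hp, data = mtcars)   # refit so this block is self-contained

# RSE: like RMSE, but divides by residual degrees of freedom instead of n
rse <- sqrt(sum(residuals(fit)^2) / df.residual(fit))
rse

# Cross-check against the "Residual standard error" reported by summary()
summary(fit)$sigma
```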
The same calculation, simplified because we have previously calculated `rss`:
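A sketch of the shorter form, with `rss` recomputed here only so the block runs on its own:

```r
fit <- lm(mpg ~ hp, data = mtcars)   # refit so this block is self-contained
rss <- sum(residuals(fit)^2)         # residual sum of squares

# RSE from RSS: divide by the residual degrees of freedom, then take the root
rse <- sqrt(rss / df.residual(fit))
rse
```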
The term test error in the context of regression (and other predictive analytics techniques) usually refers to calculating a test statistic on test data, distinct from your training data.
In other words, you estimate a model using a portion of your data (often an 80% sample) and then calculate the error using the hold-out sample. Again, I illustrate using `mtcars`, this time with an 80% training sample.
Estimate the model, then predict with the hold-out data:
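One way this could look. The seed value and the names `train_idx`, `train`, `test`, `fit_train` and `pred` are assumptions made for illustration; any random 80/20 split would do:

```r
set.seed(42)  # assumed seed, only so the split is reproducible

# Draw 80% of the row indices for training; the rest is the hold-out set
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.8 * n))
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Estimate the model on the training rows only
fit_train <- lm(mpg ~ hp, data = train)

# Predict mpg for the hold-out rows
pred <- predict(fit_train, newdata = test)
```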
Combine the original data and prediction in a data frame
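A sketch of that step, repeating the assumed split from before so the block is runnable by itself. The data frame name `results` and its column names are hypothetical:

```r
set.seed(42)  # assumed seed, only so the split is reproducible
train_idx <- sample(nrow(mtcars), size = floor(0.8 * nrow(mtcars)))
test      <- mtcars[-train_idx, ]
fit_train <- lm(mpg ~ hp, data = mtcars[train_idx, ])

# Put observed and predicted mpg for the hold-out rows side by side
results <- data.frame(
  actual = test$mpg,
  pred   = predict(fit_train, newdata = test)
)
head(results)
```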
Now compute your test statistics in the normal way. I illustrate MSE and RMSE:
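A sketch of the hold-out versions, using the same assumed split. The formulas are the same as before; only the data they are applied to changes:

```r
set.seed(42)  # assumed seed, only so the split is reproducible
train_idx <- sample(nrow(mtcars), size = floor(0.8 * nrow(mtcars)))
test      <- mtcars[-train_idx, ]
fit_train <- lm(mpg ~ hp, data = mtcars[train_idx, ])
pred      <- predict(fit_train, newdata = test)

# Test MSE: mean squared prediction error on the hold-out rows
test_mse <- mean((test$mpg - pred)^2)

# Test RMSE: square root of the test MSE
test_rmse <- sqrt(test_mse)

c(test_mse = test_mse, test_rmse = test_rmse)
```

The exact values depend on which rows land in the hold-out sample, so a different seed will give different test errors.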
Note that this answer ignores weighting of the observations.