Certainly the residuals are estimators of $\epsilon$ of a sort (to be clear, the residual as defined is the estimator; the observed residual is an estimate). If the model is correct, they can be fairly good estimates.
Indeed
$e = y - \hat y = X\beta + \epsilon - X(X'X)^{-1} X'(X\beta + \epsilon) = (I - H)\epsilon$,
where $H = X(X'X)^{-1} X'$ is the
hat-matrix (because it 'puts the hat on' $y$) -- also sometimes called the projection matrix.
http://en.wikipedia.org/wiki/Hat_matrix
That is, each $e_i$ is a linear combination of the $\epsilon_j$'s; if $1-h_{ii}$ is reasonably big relative to $\sum_{j\neq i}|h_{ij}|$ (i.e., if the off-diagonal part of $H$ is 'small' relative to $I$), then most of the weight is on the $i^\textrm{th}$ error (though this is frequently not the case).
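A quick numeric check of the identity $e = (I-H)\epsilon$ (a sketch in NumPy; the design matrix, coefficients, and seed here are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # design with intercept
beta = np.array([1.0, 2.0, -0.5])
eps = rng.normal(size=n)
y = X @ beta + eps

H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix H = X (X'X)^{-1} X'
e = y - H @ y                           # residuals e = (I - H) y

# Because H X beta = X beta, the identity e = (I - H) eps holds exactly
# (up to floating point):
print(np.allclose(e, (np.eye(n) - H) @ eps))  # True

# Weight on the i-th error vs. the total off-diagonal 'leakage' in row i:
i = 0
print(1 - H[i, i], np.abs(np.delete(H[i], i)).sum())
```

With only $p = 3$ columns and $n = 50$ rows the diagonal weight dominates here, but with many predictors or high-leverage points it need not.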
Note that $e_i/\sqrt{1-h_{ii}}$ will have the same expectation and variance as $\epsilon_i$, and, if the elements of $H$ are small in the manner just described, will be highly correlated with it -- in fact, if I have done my algebra right, the correlation between $e_i$ and $\epsilon_i$ is actually $\text{corr}(e_i,\epsilon_i) = \sqrt{1-h_{ii}}$.
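That correlation claim can be checked by Monte Carlo (again a sketch; the fixed design, seed, and replication count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # fixed design
H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix

# Simulate many error vectors and the corresponding residual vectors.
reps = 200_000
eps = rng.normal(size=(reps, n))
e = eps @ (np.eye(n) - H).T             # each row: e = (I - H) eps

# Empirical correlation of e_i with eps_i vs. the claimed sqrt(1 - h_ii):
i = 0
r = np.corrcoef(e[:, i], eps[:, i])[0, 1]
print(r, np.sqrt(1 - H[i, i]))          # the two numbers should be close
```

The Monte Carlo estimate agrees with $\sqrt{1-h_{ii}}$ to within simulation error, consistent with $\text{Cov}(e_i,\epsilon_i) = (1-h_{ii})\sigma^2$ and $\text{Var}(e_i) = (1-h_{ii})\sigma^2$.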
The mean of the squared deviations from the mean, whether divided by $n$ or by $n-1$, is called the variance; the only difference is that the $n-1$ version is an unbiased estimator of the population variance. Taking the square root of either gives an estimate of the standard deviation.
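In NumPy the two divisors are selected with the `ddof` parameter (a minimal sketch; the data are arbitrary):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

var_n  = np.var(x)          # divides by n   (ddof=0, the default)
var_n1 = np.var(x, ddof=1)  # divides by n-1 (unbiased estimator)
sd_n1  = np.std(x, ddof=1)  # its square root estimates the standard deviation

print(var_n, var_n1, sd_n1)  # 4.0, 32/7 ≈ 4.571..., ≈ 2.138...
```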
I guess that the names mean squared deviation and root mean squared deviation are used more commonly in the machine learning field, where the mean squared error and its square root are the quantities often reported.
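For concreteness, the machine-learning usage is simply (a sketch; the toy values are arbitrary):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)  # mean squared error
rmse = np.sqrt(mse)                    # root mean squared error

print(mse, rmse)  # 0.875, ≈ 0.9354
```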
I also guess that some people prefer mean squared deviation as a name for variance because it is more descriptive: you instantly know from the name what someone is talking about, whereas to understand what variance is you need to know at least elementary statistics.
Actually, this is mentioned in the Regression section of the Wikipedia article on Mean squared error.
You can also find some information here: Errors and residuals in statistics. It says the expression "mean squared error" may have different meanings in different contexts, which can be tricky.