[Math] How to propagate uncertainties in the dependent variable when doing linear regression

error-propagation, regression, regression-analysis, statistics

Let's say I have an independent variable $\vec x$ and a dependent variable $\vec y$ and measurement errors on my dependent variable that I know to be $\delta y$. For the sake of simplicity, let's say (in R) that I have

x <- 1:10
y <- 1:10
delta_y <- 1

that is, each value $y_i$ is uncertain up to $\delta y = 1$.

But if I then do a simple linear regression, I obtain

summary(lm(y~x))

which reports standard errors at the level of machine precision, around $10^{-16}$.

So how can I correct my standard errors to take into account the uncertainty on the dependent variable?

Best Answer

You have the usual regression model $$ Y_i = bX_i + \varepsilon_i, $$ but you can only observe $\tilde{Y}_i = Y_i+\delta_i$, where $\delta_i$ is the measurement error. The model then becomes $$ \tilde{Y}_i = bX_i + \varepsilon_i+\delta_i, $$ and if $\varepsilon_i$ and $\delta_i$ are independent, the only thing that changes is that the variance of the error term increases by $\mathrm{Var}(\delta)$, so that $$\mathrm{Cov}(\hat{b})=(X^TX)^{-1} \left(\mathrm{Var}(\varepsilon)+\mathrm{Var}(\delta)\right).$$
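As a rough sketch of how this could look for the toy data in the question, the R snippet below plugs a known measurement variance into the covariance formula. It rests on several assumptions that are not stated in the question: `delta_y` is read as the standard deviation of each measurement error, so that $\mathrm{Var}(\delta) = \delta y^2$; the errors are independent with the same variance for every $y_i$; and $\mathrm{Var}(\varepsilon)$ is taken to be zero because the toy data show no scatter at all. The design matrix also includes the intercept column that `lm(y ~ x)` fits, whereas the model written above has only a slope.

```r
# Toy data from the question: y is exactly linear in x.
x <- 1:10
y <- 1:10
delta_y <- 1  # ASSUMPTION: standard deviation of the measurement error on each y_i

fit <- lm(y ~ x)
X <- model.matrix(fit)  # design matrix: intercept column and x

# Cov(b_hat) = (X^T X)^{-1} * (Var(eps) + Var(delta)).
# ASSUMPTION: Var(eps) = 0 for this noise-free toy data, so only
# Var(delta) = delta_y^2 contributes.
var_eps <- 0
cov_b <- solve(crossprod(X)) * (var_eps + delta_y^2)

# Standard errors are the square roots of the diagonal entries.
se_b <- sqrt(diag(cov_b))
print(se_b)
```

Under these assumptions the standard errors come out at about 0.68 for the intercept and 0.11 for the slope, instead of the machine-precision values that `summary(lm(y ~ x))` reports. With real data the residual scatter already contains the measurement error, so the known variance would not be added on top of the residual variance estimated from the same fit.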
