Regression analysis of two data sets

regression

I have two data sets that both resemble $y = x^2$, but suppose I don't know that yet. Each is evenly sampled, but the sampling rates differ and the two sets cover different domains. Both sets may contain data at the same $x$ value. How do I perform a regression over such data?

If I combine them into one large data set, the fit will be weighted towards the set with the higher point density (or perhaps the one with more points in total). However, I want the two data sets to be weighted evenly along the $x$ dimension. That is, even though the red set has a much higher data density around $x = 5$ than the black set, I would want the black point there to carry as much weight as all the red points between, say, $x = 4.5$ and $x = 5.5$ combined. I feel this is a very naive way of putting it, but perhaps someone with a stats background can tidy it up.

Example data set.

Best Answer

You could use weighted regression to achieve your goal. In addition to the data, each observation is given a weight that indicates how much the error for that observation matters in the fit. More information and an example here: How to use weights in function lm in R?
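As a rough illustration of the idea, here is a minimal Python sketch of weighted least squares. It is only one way to choose the weights, not the only one: each point is weighted by its own set's sample spacing, so every set contributes the same total weight per unit of $x$. The data, the spacings (0.1 and 1.0), the domains, and the names `weighted_polyfit`, `solve_linear_system` and `evenly_sampled` are all made up for the example, and the quadratic model is an assumption too.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighted_polyfit(xs, ys, ws, degree):
    """Return [c0, c1, ...] minimising sum(w * (y - p(x)) ** 2)."""
    k = degree + 1
    # Normal equations (X^T W X) beta = X^T W y, with X[i][j] = x_i ** j.
    xtwx = [[sum(w * x ** (i + j) for x, w in zip(xs, ws))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * y * x ** i for x, y, w in zip(xs, ys, ws))
            for i in range(k)]
    return solve_linear_system(xtwx, xtwy)


def evenly_sampled(start, step, count):
    """Evenly spaced x values: start, start + step, ..."""
    return [start + step * i for i in range(count)]


# Hypothetical data: a densely sampled "red" set and a sparse "black" set.
red_step, black_step = 0.1, 1.0
red_x = evenly_sampled(0.0, red_step, 61)     # covers 0 .. 6
black_x = evenly_sampled(4.0, black_step, 7)  # covers 4 .. 10
red_y = [x ** 2 for x in red_x]
black_y = [x ** 2 for x in black_x]

# Weight = sample spacing of the set the point came from, so ten red
# points (10 * 0.1) weigh about as much as one black point (1 * 1.0).
xs = red_x + black_x
ys = red_y + black_y
ws = [red_step] * len(red_x) + [black_step] * len(black_x)

coeffs = weighted_polyfit(xs, ys, ws, degree=2)
print("c0={:.4f}, c1={:.4f}, c2={:.4f}".format(*coeffs))
```

With noise-free data like this the weights do not change the answer; they only matter once the two sets disagree or carry noise. Note also that where the domains overlap, both sets contribute, so that region gets roughly twice the weight per unit of $x$. If that is not wanted, the weights in the overlap could be halved.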
