I have 12 soil water sensors and a few years of physical soil water samples collected near each sensor. We have found that individually regressing each sensor's data against its own soil samples performs better than placing the data from all sensors in one regression. We calculated an RMSE for all of the sensor data using the single pooled regression, but we also pooled each sensor's residuals from the individual sensor regressions into a single combined RMSE. Is this an inappropriate use of RMSE?
Solved – Using a combined RMSE
Related Solutions
You can't sum mean squared errors like that unless your variables are all in the same unit and on the same scale. Otherwise the unit of your RMSE is the square root of a sum of squared, mismatched units, which is meaningless in most practical applications I can think of.
You could center and rescale all your variables first, to get RMSE in terms of number of standard deviations from the mean. Personally, I'm not sure if this is such a great idea. I think it depends on what you're using this "overall" measure of fit for, since there's no absolute "good" and "bad" RMSE. If you're going to be comparing different imputation models, it might not be a bad approach. Then again, if you're comparing imputation models for the purpose of fitting a model, you're better off (in my source-less opinion) just fitting the model with each imputation method and comparing the final model fits.
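As a minimal sketch of that center-and-rescale approach (the variables and data here are hypothetical), pooling squared errors after standardizing each variable by its own mean and standard deviation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true and imputed values for two variables on very
# different scales (e.g., millimeters vs. a 0-1 fraction).
true = {"a": rng.normal(100, 20, 50), "b": rng.normal(0.5, 0.1, 50)}
imputed = {k: v + rng.normal(0, 0.2 * v.std(), v.size) for k, v in true.items()}

# Standardize each variable by its own mean and SD before pooling,
# so every squared error is in units of (standard deviations)^2.
pooled = []
for k in true:
    mu, sd = true[k].mean(), true[k].std()
    pooled.append((((imputed[k] - mu) / sd) - ((true[k] - mu) / sd)) ** 2)

print(np.sqrt(np.concatenate(pooled).mean()))  # overall RMSE in SD units
```

Note that the centering cancels in the difference, so in effect each error is simply divided by that variable's standard deviation.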
The question you linked refers to a "different" overall RMSE. That answer explains how to properly average the RMSEs from a cross-validation procedure, on a single variable ($y$ in the answer's notation).
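For concreteness, with $K$ folds, $n_k$ observations in fold $k$, and per-fold errors $\mathrm{MSE}_k$, weighting by fold size is equivalent to pooling all the squared errors before taking the root:

$$\mathrm{RMSE}_{\text{overall}} = \sqrt{\frac{\sum_{k=1}^{K} n_k \, \mathrm{MSE}_k}{\sum_{k=1}^{K} n_k}}.$$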
I can't think of any reason to take your simulation's data-generating process into account. The point of a simulation study is to see how your model performs on new data. In practice you don't know the underlying data-generating process, so your estimate of model performance should not take into account things you wouldn't plausibly know when fitting your model. I also can't think of how you'd incorporate the missingness stratification if you wanted to, or how to interpret the resulting quantity.
Use the RMSE.
Note that the (R)MSE and the MAPE will be minimized by quite different point forecasts (see my answer at Higher RMSE lower MAPE). You should first decide which functional of the unknown future distribution you want to elicit, then choose the corresponding error measure.
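As an illustration of those differing minimizers (the distribution and grid here are made up for the example), a small simulation on skewed data:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # skewed "future" values

# Evaluate a grid of candidate point forecasts under each error measure.
candidates = np.linspace(0.1, 3.0, 300)
mse = [np.mean((y - c) ** 2) for c in candidates]
mape = [np.mean(np.abs(y - c) / y) for c in candidates]

print("MSE-optimal forecast :", candidates[np.argmin(mse)])   # near E[y] ≈ 1.65
print("MAPE-optimal forecast:", candidates[np.argmin(mape)])  # much lower
```

The squared-error minimizer sits near the mean, while the MAPE minimizer is pulled well below it, because MAPE penalizes over-forecasts of small actuals heavily.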
However, note that an ARIMA model will output a conditional expectation forecast, i.e., the functional that optimizes (R)MSE. It makes little sense to train a model to minimize the (R)MSE, then to assess its forecasts with a different error measure (Kolassa, 2020, IJF). If you truly want to find a MAPE-optimal forecast, you should also use the MAPE to fit your model. I am not aware of any off-the-shelf forecasting software that does this (if you use an ML pipeline, you may be able to specify any fitting criterion and choose the MAPE), and I have major doubts as to the usefulness of a MAPE-minimal forecast.
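If you do want to experiment along those lines, a minimal sketch (the data and the one-parameter "model" are hypothetical) of fitting by minimizing the MAPE directly with a general-purpose optimizer:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
y = rng.lognormal(0.0, 1.0, size=5_000)  # hypothetical positive history

# MAPE of a constant point forecast f; each term is convex in f,
# so a bounded scalar optimizer is enough here.
def mape(f):
    return np.mean(np.abs(y - f) / y)

res = minimize_scalar(mape, bounds=(1e-6, y.max()), method="bounded")
print("MAPE-minimizing constant forecast:", res.x)
```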
Best Answer
I think the basic idea of calculating a grand RMSE for all the sensors, even though you've fit regression models separately for each sensor, is fine. But be aware of three things:
If you compute the grand MSE by simply taking the mean of the 12 sensor MSEs, you won't be accounting for how different sensors may have different amounts of data. If you want to (you probably do), you should weight the MSEs by sample size, or equivalently, put all the squared errors into one vector and take the mean (and then square root) of that.
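A minimal sketch of that pooling (the residual vectors here are simulated placeholders for your 12 sensors' residuals):

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical residuals from 12 per-sensor regressions, with
# different numbers of soil samples per sensor.
residuals = [rng.normal(0, 1, int(n)) for n in rng.integers(10, 40, size=12)]

# One vector of all squared errors; its mean weights each sensor by n_i.
grand_rmse = np.sqrt(np.concatenate([r**2 for r in residuals]).mean())

# Equivalent: a sample-size-weighted average of the per-sensor MSEs.
ns = np.array([r.size for r in residuals])
mses = np.array([np.mean(r**2) for r in residuals])
assert np.isclose(grand_rmse, np.sqrt((ns * mses).sum() / ns.sum()))
print(grand_rmse)
```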
You say that performing a separate regression for each sensor "performs better" than an overall regression. If your measure of performance is just RMSE, with each of your two models (one pooled regression vs. separate regressions) trained and tested on the same data, then it is a given that the more flexible approach of separate regressions will produce an RMSE no greater than that of the overall regression. Your models are nested, so the more flexible one is guaranteed to fit the data at least as well as the less flexible one. This does not imply that, for example, the more flexible model is more correct than the less flexible one, nor that its coefficients are more informative, nor that it will be more accurate in predicting future observations. In short, your more flexible model may be overfitting.
Instead of fitting completely separate regression models for each sensor, you may be better served by using a mixed model, with the effect of each sensor being a random effect.
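A sketch of that mixed-model approach with statsmodels; the column names (`sample`, `reading`, `sensor`) and the simulated data are placeholders for your own:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per soil sample, with the
# co-located sensor reading and a sensor identifier.
rng = np.random.default_rng(4)
n_per = 30
sensors = np.repeat([f"s{i}" for i in range(12)], n_per)
reading = rng.uniform(5, 45, sensors.size)
offsets = np.repeat(rng.normal(2, 1, 12), n_per)  # sensor-specific intercepts
df = pd.DataFrame({
    "sensor": sensors,
    "reading": reading,
    "sample": offsets + 0.8 * reading + rng.normal(0, 1, sensors.size),
})

# Random intercept per sensor; add re_formula="~reading" if you also
# want sensor-specific calibration slopes.
fit = smf.mixedlm("sample ~ reading", df, groups=df["sensor"]).fit()
print(fit.summary())
```

The partial pooling in a mixed model shrinks each sensor's calibration toward the overall one, which can help when some sensors have few samples.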