[Physics] error propagation with different plus and min errors and data fitting

error analysis

I am refreshing my memory on error propagation and data fitting (Levenberg-Marquardt).
You have the absolute (measurement) error, the relative (measurement) error, the population/sample standard deviation, and the population/sample standard error.

These are typically written down as:

$$a \pm b = a (+b, -b)$$

But suppose you have different errors in the plus and minus directions?

$$a (+b, -c)$$

What are the rules here again, and how are the formulae for error propagation modified? Also, how is this taken into account in data fitting? Any good resource on this is welcome.

Best Answer

I think what you could do is treat the quoted value as the most likely value and the two uncertainties as semi-Gaussian probability distributions.

So, for example, in error propagation you would propagate the upper and lower limits separately using the usual Gaussian error-propagation formulae, while keeping track of whether each one propagates to the upper or the lower error estimate of the quantity you are trying to calculate.

e.g. the uncertainties in $a^2$ would be $(+2ab, -2ac)$, but the uncertainties in $a^{-1}$ would be $(+c/a^2, -b/a^2)$; the errors swap sides because $a^{-1}$ is a decreasing function of $a$.
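This bookkeeping can be sketched in a few lines of Python. The helper below is hypothetical (not from any library): it does first-order propagation of asymmetric errors through a function of one variable, using the sign of the derivative to decide which input error feeds the upper versus the lower output error.

```python
def propagate_asymmetric(dfda, a, plus, minus):
    """First-order propagation of asymmetric errors a(+plus, -minus)
    through a function f, given its derivative df/da evaluated at a.

    If df/da >= 0, the upper input error drives the upper output error;
    if df/da < 0, the roles swap.  Returns (upper, lower) output errors.
    """
    d = dfda(a)
    if d >= 0:
        return d * plus, d * minus
    return -d * minus, -d * plus

# f(a) = a^2: df/da = 2a > 0, so the errors propagate as (+2ab, -2ac)
up_sq, lo_sq = propagate_asymmetric(lambda a: 2 * a, 2.0, 0.1, 0.2)

# f(a) = 1/a: df/da = -1/a^2 < 0, so the errors swap sides: (+c/a^2, -b/a^2)
up_inv, lo_inv = propagate_asymmetric(lambda a: -1 / a**2, 2.0, 0.1, 0.2)
```

For $a = 2\,(+0.1, -0.2)$ this reproduces the worked example: $a^2$ gets $(+0.4, -0.8)$ and $a^{-1}$ gets $(+0.05, -0.025)$.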

I am not certain (no pun intended) that this procedure is correct unless you are absolutely sure that each side of the measurement uncertainty distribution is accurately represented by a semi-Gaussian. A more general technique is discussed by Barlow (2004). I have not fully digested this, but the examples appear to be based on Poisson distributions, which are of course inherently non-Gaussian for small numbers of events.

If you knew what the actual probability distribution functions were, then the best way to proceed would surely be to do a Monte Carlo simulation.
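As a minimal sketch of that idea, here is a Monte Carlo propagation assuming (purely for illustration) that each measurement follows a "dimidiated" Gaussian, i.e. two half-Gaussians of different widths glued at the quoted value; the function names are my own, not a standard API.

```python
import random

def sample_two_sided(mu, sigma_plus, sigma_minus):
    """Draw from a 'dimidiated' Gaussian: with probability 1/2 an upward
    half-Gaussian fluctuation of width sigma_plus, otherwise a downward
    one of width sigma_minus.  With this convention the 16th and 84th
    percentiles sit near mu - sigma_minus and mu + sigma_plus."""
    if random.random() < 0.5:
        return mu + abs(random.gauss(0.0, sigma_plus))
    return mu - abs(random.gauss(0.0, sigma_minus))

def mc_propagate(f, mu, sigma_plus, sigma_minus, n=200_000):
    """Propagate mu(+sigma_plus, -sigma_minus) through f by sampling,
    then read the asymmetric errors off the 16/50/84 percentiles."""
    out = sorted(f(sample_two_sided(mu, sigma_plus, sigma_minus))
                 for _ in range(n))
    lo, med, hi = (out[int(q * n)] for q in (0.16, 0.50, 0.84))
    return med, hi - med, med - lo
```

Note that for $f(a)=a^2$ with $a = 10\,(+1,-2)$ the recovered percentile errors land near, but not exactly on, the linearized $(+2ab, -2ac) = (+20, -40)$; how to resolve exactly this kind of ambiguity is what Barlow's paper is about.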

For calculating chi-squared values in data fitting, you use either the positive or the negative error, depending on whether the deviation from the model is negative or positive, respectively. Again, this is only partly satisfactory, because an asymmetric error bar would usually indicate that the probability distribution of the measurement is itself asymmetric and thus inherently non-Gaussian.
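A hypothetical chi-squared term implementing that rule might look like the following (sign convention as in the text: when the model lies above the data point, the positive error applies, and vice versa):

```python
def chi2_asymmetric(y_obs, y_model, sig_plus, sig_minus):
    """Chi-squared with asymmetric error bars: when the model sits above
    the data point (deviation of data from model is negative), divide by
    the positive error; otherwise divide by the negative error."""
    total = 0.0
    for obs, mod, sp, sm in zip(y_obs, y_model, sig_plus, sig_minus):
        sigma = sp if mod >= obs else sm
        total += ((obs - mod) / sigma) ** 2
    return total
```

For a single point $1.0\,(+0.5, -0.25)$ against a model value of $2.0$, the model is above the data, so the positive error is used and the term is $(1/0.5)^2 = 4$; against a model value of $0.0$ the negative error applies instead.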

A better (though still frequentist) route to fitting a model to such data would involve maximising the likelihood function, which is calculated by integrating over the actual probability distribution of the measurements (if you know it, or at least have a better representation of it than two half-Gaussians).
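As an illustration only, here is what the negative log-likelihood looks like if one models each measurement with a two-piece Gaussian, using the same sign convention as the chi-squared rule above; a real analysis would substitute whatever distribution actually describes the data.

```python
import math

def neg_log_likelihood(model, params, xs, ys, sig_plus, sig_minus):
    """-ln L for data with asymmetric errors, assuming each point follows
    a two-piece Gaussian about the model value: width sig_plus when the
    model sits above the data, sig_minus when it sits below.  The shared
    normalization of the two-piece Gaussian is sqrt(pi/2)*(sp + sm)."""
    nll = 0.0
    for x, y, sp, sm in zip(xs, ys, sig_plus, sig_minus):
        mu = model(x, params)
        sigma = sp if mu >= y else sm
        nll += (0.5 * ((y - mu) / sigma) ** 2
                + math.log(math.sqrt(math.pi / 2.0) * (sp + sm)))
    return nll
```

When the two widths are equal this reduces to the usual Gaussian negative log-likelihood, and minimising it over `params` is equivalent to minimising the asymmetric chi-squared plus a constant.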

Monte Carlo integration would again be the method of choice if your life (or career) depended on whether the resultant confidence intervals in the parameters were 68% or 75%.
