Confidence Interval – Combining 95% Confidence Interval Bounds from Multiple Measurements Correctly

confidence-interval · error-propagation · fitting · measurement-error · standard-error

I am looking for someone to confirm / double-check something for me with regard to errors on measurements.

Let's say I am trying to determine the slope of a relationship by varying one quantity and measuring another; I then plot the data and fit a straight line to it by least squares (graph on the left). I repeat this procedure twice more, to obtain the middle and right-most graphs.
[Figure: three scatter plots (left, middle, right), each with a least-squares straight-line fit]

Each fit routine will typically give me back a slope and the corresponding 95% confidence interval, so that I obtain $(m_1\pm\Delta m_1), (m_2\pm\Delta m_2)$ and $(m_3\pm\Delta m_3)$. Now I know that the underlying quantity which determines $m$ is the same in each case, so I should be able to quote a best estimate for the slope as their mean

$$
\bar{m} = \frac{m_1+m_2+m_3}{3}. \tag{1}
$$

My question is about the appropriate way to quote the error. We know that for a function $f(x,y)$ with errors in $x$ and $y$ given by $\Delta x$ and $\Delta y$, respectively, the error on $f$ is given by

$$
\Delta f = \sqrt{ (\Delta x)^2 \bigg(\frac{\partial f}{\partial x}\bigg)^2 + (\Delta y)^2 \bigg(\frac{\partial f}{\partial y}\bigg)^2 } \tag{2}
$$

So I would think I can determine the error in $\bar{m}$ to be

$$
\begin{align}
\Delta \bar{m} &= \sqrt{ (\Delta m_1)^2 \bigg(\frac{\partial \bar{m}}{\partial m_1}\bigg)^2 + (\Delta m_2)^2 \bigg(\frac{\partial \bar{m}}{\partial m_2}\bigg)^2 + (\Delta m_3)^2 \bigg(\frac{\partial \bar{m}}{\partial m_3}\bigg)^2} \tag{3} \\
&= \frac{1}{3} \sqrt{ (\Delta m_1)^2 + (\Delta m_2)^2 + (\Delta m_3)^2 } \tag{4}
\end{align}
$$
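
(For concreteness, here is a minimal numeric sketch of equations (1) and (4) in Python; the slope values and half-widths below are arbitrary placeholders, not from any real fit.)

```python
import numpy as np

# Arbitrary placeholder values: three slope estimates and their CI half-widths
m = np.array([2.1, 1.9, 2.0])
dm = np.array([0.3, 0.2, 0.4])

m_bar = m.mean()                          # equation (1): plain mean of the slopes
dm_bar = np.sqrt(np.sum(dm**2)) / len(m)  # equation (4): propagated uncertainty

print(f"m_bar = {m_bar:.3f} +/- {dm_bar:.3f}")
```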

First question, is this correct?

Second question, is it okay to propagate 95% confidence intervals in this way? Should I now simply quote the result as $\bar{m} \pm \Delta \bar{m}$ and explain that $\Delta \bar{m}$ is the combined 95% confidence interval, or should I first convert the 95% numbers from the fits into standard errors (through the factor of 1.96)?

Thanks in advance,

(I am for now assuming Gaussian errors everywhere.)


EDIT

It was suggested in the comments that I first implement weighting in the averaging step before worrying about the errors. This should help to give more weight to slopes which have tighter confidence intervals (and vice versa).

According to this link, the weighted version of the mean would be given by
$$
\bar{m}_\textrm{w} = \frac{\sum_i w_i m_i}{\sum_iw_i}, \hspace{1cm} \textrm{where} \hspace{0.5cm} w_i = \frac{1}{\sigma_i^2}\tag{5}
$$

and $\sigma_i^2$ is the variance of each slope estimate. Therefore, in my case with the three example slopes, it should be
$$
\bar{m}_\textrm{w} = \frac{m_1/\sigma_1^2 + m_2/\sigma_2^2 + m_3/\sigma_3^2}{1/\sigma_1^2 + 1/\sigma_2^2 + 1/\sigma_3^2}. \tag{6}
$$

The variance on the weighted mean slope is given at the above link again by
$$
\begin{align}
\textrm{Var}(\bar{m}_\textrm{w}) &= \frac{\sum_iw_i^2\sigma_i^2}{\big( \sum_iw_i\big)^2}\tag{7}\\
&= \frac{1/\sigma_1^2 + 1/\sigma_2^2 + 1/\sigma_3^2}{\big(1/\sigma_1^2 + 1/\sigma_2^2 + 1/\sigma_3^2\big)^2}\tag{8}\\
&= \big(1/\sigma_1^2 + 1/\sigma_2^2 + 1/\sigma_3^2\big)^{-1}.\tag{9}
\end{align}
$$
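
(A small sketch of equations (6) and (9), assuming the $\sigma_i$ are already available as standard deviations of the slope estimates; the function name is just illustrative.)

```python
import numpy as np

def inverse_variance_mean(m, sigma):
    """Weighted mean (eq. 6) and its variance (eq. 9), using weights w_i = 1/sigma_i^2."""
    m = np.asarray(m, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    m_w = np.sum(w * m) / np.sum(w)   # eq. (6)
    var_w = 1.0 / np.sum(w)           # eq. (9)
    return m_w, var_w

# e.g. inverse_variance_mean([2.1, 1.9, 2.0], [0.15, 0.10, 0.20])
```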

So my main question remains – these formulas are written in terms of variances, so how should the 95% confidence intervals $\Delta m_i$ returned by a fitting algorithm be converted into variances?

Maybe for a concrete example we could imagine the following values were returned from the fitting routine:
$$
\begin{align}
m_1 &= 5.5\; (4.9, 6.1)\rightarrow \Delta m_1 = 0.6\\
m_2 &= 5.5\; (5.3, 5.7)\rightarrow \Delta m_2 = 0.2\\
m_3 &= 5.2\; (4.5, 5.9)\rightarrow \Delta m_3 = 0.7
\end{align}
$$

where the values in brackets represent the 95% confidence intervals. How should the estimate of the slope be reported, including errors? Let's imagine I only have access to these values (and not the underlying data that was used for fitting to obtain these slopes).
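
(Assuming those intervals are symmetric, Gaussian-based 95% intervals, one way to proceed – a sketch, not a definitive recipe – is to divide the half-widths by 1.96 to recover standard errors and then feed them into equations (6) and (9).)

```python
import numpy as np

m  = np.array([5.5, 5.5, 5.2])
dm = np.array([0.6, 0.2, 0.7])   # 95% half-widths reported by the fits

sigma = dm / 1.96                # standard errors, assuming Gaussian 95% intervals

w = 1.0 / sigma**2
m_w  = np.sum(w * m) / np.sum(w) # weighted mean, eq. (6)
se_w = np.sqrt(1.0 / np.sum(w))  # standard error of the weighted mean, eq. (9)

print(f"m_w = {m_w:.2f}, 95% half-width = {1.96 * se_w:.2f}")
```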

Best Answer

I imagine the 95% confidence intervals come from some assumption of normality in the data; otherwise, please state how you obtained these CIs. This implies you believe each slope estimate (viewed as a random variable) has mean $m_i$ and some variance $\sigma_i^2$. In this case you can average the slopes as you did and obtain the variance of the averaged estimator (assuming independent errors). From that variance you can get a 95% CI (using 1.96 standard deviations).

So, to summarize (the assumption that the $m_i$ are independent is crucial; see the sketch after this list):

  1. Let $m := \frac{ \sum_{i=1}^N m_i \sigma_i^{-2}}{\sum_{i=1}^N \sigma_i^{-2}}$
  2. Let $\sigma^2 := Var(m) = Var\Big(\frac{ \sum_{i=1}^N m_i \sigma_i^{-2}}{\sum_{i=1}^N \sigma_i^{-2}}\Big) = \Big(\frac{1}{\sum_{i=1}^N \sigma_i^{-2}}\Big)^2 \sum_{i=1}^N \sigma_i^{-4}Var(m_i) = \frac{1}{\sum_{i=1}^N \sigma_i^{-2}}$.
  3. Note that (up to a factor of $1/N$) this is the harmonic mean of the variances! Finally you see it in the wild after learning that inequality in your first calculus class!!
  4. A 95% CI for the true value of the slope is $[m - 1.96\sigma,\, m + 1.96\sigma]$ (1.96 being the usual two-sided 95% quantile of the standard normal).
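
(A compact sketch of these steps, again assuming the reported intervals are symmetric normal-theory 95% intervals so that $\sigma_i = \Delta m_i / 1.96$; the function name and interface are just illustrative.)

```python
import numpy as np

def combine_slopes(m, half_width_95):
    """Combine independent slope estimates (with 95% CI half-widths) into one estimate and 95% CI."""
    m = np.asarray(m, dtype=float)
    sigma = np.asarray(half_width_95, dtype=float) / 1.96   # CI half-width -> std. deviation
    w = 1.0 / sigma**2
    m_hat = np.sum(w * m) / np.sum(w)                        # step 1: weighted mean
    sd = np.sqrt(1.0 / np.sum(w))                            # step 2: its standard deviation
    return m_hat, (m_hat - 1.96 * sd, m_hat + 1.96 * sd)     # step 4: 95% CI

print(combine_slopes([5.5, 5.5, 5.2], [0.6, 0.2, 0.7]))
```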