The Weighted Average to find the mean

least squares, probability, statistics, variance

I am reading a book called The Statistical Analysis of Experimental Data by John Mandel. Opinions aside, they give an example in which the mean of a population, measured with three different techniques, is estimated using the weighted average,

$$\mu=\frac {w_A\bar x_A+w_B\bar x_B+w_C\bar x_C}{w_A+w_B+w_C}$$
This much is clear to me and something I have done for ages when determining aggregate statistics for a set of data, e.g. price, yield, or maturity for a group of bonds.
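
As a sanity check of how the formula is applied, here is a minimal sketch in Python; the group means and weights are made-up numbers, not taken from the book:

```python
import numpy as np

# Made-up group means for the three measurement techniques A, B, C
xbar = np.array([10.2, 9.8, 10.5])

# Made-up weights; in the book's example these would be N_i / sigma_i^2
w = np.array([4.0, 9.0, 2.0])

# Weighted average: sum(w_i * xbar_i) / sum(w_i)
mu_hat = np.sum(w * xbar) / np.sum(w)
print(mu_hat)
```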

But then they state that the "weights" are the reciprocals of the variances of $\bar x_A$, $\bar x_B$, and $\bar x_C$, i.e.

$$w_A=\frac{N_A}{\sigma^2_A}$$ and similarly for the other two. They explain this in a later chapter, Method of Least Squares, and I read the section on the weighted average. There they try to find the circumference of a disc in three different ways. The true value of the circumference $P$ is equal to the expected value of $\hat p$; this makes sense, as it is another way of stating the average in layman's terms. To find the best value we need the coefficients that yield the highest precision for $\hat p$, i.e. that minimize the variance of $\hat p$. Now, consider a linear combination (Eqn 1) of the three measurements, $$\hat p=a_1p_1+a_2p_2+a_3p_3$$
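
Filling in one intermediate step in my own shorthand (not the book's exact notation): if each measurement is unbiased, so that $E[p_i]=P$, then requiring $E[\hat p]=P$ gives
$$E[\hat p]=a_1E[p_1]+a_2E[p_2]+a_3E[p_3]=(a_1+a_2+a_3)P=P\implies a_1+a_2+a_3=1,$$
and, for independent measurements, the quantity to be minimized is
$$\operatorname{Var}(\hat p)=a_1^2\sigma^2_1+a_2^2\sigma^2_2+a_3^2\sigma^2_3.$$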

To select the coefficients, they go through a series of steps, which I understand but won't re-write here: they use a system of equations to express one of the coefficients in terms of the other two, because they determine that the coefficients sum to 1. The next step is to plug this into the variance equation, take the derivative with respect to one of the coefficients, and set it equal to zero. Lastly, through some algebra, we get $$a_1=\frac{1/\sigma^2_1}{1/\sigma^2_1+1/\sigma^2_2+1/\sigma^2_3}$$ and likewise for the other two. The algebra I understand. Then we let $w_1=1/\sigma^2_1$, etc., to get our standard notion of weights, $$a_1=\frac{w_1}{w_1+w_2+w_3}$$ and likewise for the rest.
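
To convince myself that this closed form really is the minimizer, here is a small numeric check in Python; the standard deviations below are hypothetical, not from the book:

```python
import numpy as np

# Hypothetical standard deviations of the three measurement methods
sigma = np.array([1.0, 2.0, 0.5])

# Analytic inverse-variance weights: a_i = (1/sigma_i^2) / sum_j(1/sigma_j^2)
w = 1.0 / sigma**2
a = w / w.sum()

def var_of_combination(a, sigma):
    """Variance of a1*p1 + a2*p2 + a3*p3 for independent measurements."""
    return np.sum(a**2 * sigma**2)

equal = np.full(3, 1.0 / 3.0)            # naive equal weighting for comparison
print(a, a.sum())                         # the weights sum to 1
print(var_of_combination(a, sigma))       # variance with inverse-variance weights
print(var_of_combination(equal, sigma))   # strictly larger with equal weights
```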

My question concerns the actual use of the weighted average involving standard deviations. I have never encountered two sets of data with different measurement techniques, hence my unfamiliarity with taking the weighted average of sample means to estimate the population mean. I can see why we don't weight each measurement technique equally: we weight it by the error associated with it. Yet in building a simple model, I have always either let $\Sigma w_i=1$ and used the linear combination, Eqn 1, to solve for my value, or used a sumproduct divided by the sum to find something like an aggregate price when the quantities differ (a small sketch comparing the two follows below). Those methods are different from this one, so can someone provide insight into this approach and its uses in the method of least squares?
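
For completeness, here is a minimal sketch showing that the two conventions I mention (normalizing the weights to sum to 1 versus taking a sumproduct and dividing by the sum of the weights) produce the same number; the prices and quantities are made up:

```python
import numpy as np

# Made-up bond prices, with quantities as the raw (unnormalized) weights
x = np.array([101.5, 99.2, 100.7])
w = np.array([3.0, 5.0, 2.0])

# Sumproduct divided by the sum of the weights
agg_1 = np.sum(w * x) / np.sum(w)

# Normalize the weights to sum to 1 first, then take the linear combination
a = w / np.sum(w)
agg_2 = np.sum(a * x)

print(np.isclose(agg_1, agg_2))  # True: the two conventions agree
```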

Best Answer

Intuitively, consider three scientists’ equally confident interval estimates of an unknown quantity $x$, each based on sound research. Abdul estimated $x$ to be $x_A\pm e_A$, Bing estimated $x$ to be $x_B\pm e_B$, and Charlene estimated $x$ to be $x_C\pm e_C$. Without additional information, a reasonable naive assumption is that the difference in interval widths is due to different effective sample sizes, and that each $e_i$ is twice the standard error: $e_A=\frac{2\sigma}{\sqrt{N_A}}$, $e_B=\frac{2\sigma}{\sqrt{N_B}}$, and $e_C=\frac{2\sigma}{\sqrt{N_C}}$, where $\sigma$ is the “true” standard deviation of $x$. If Bing’s interval estimate is half as wide as Abdul’s, her sample size $N_B$ must have been $4$ times as large as $N_A$, hence her point estimate of $x$ should be weighted $4$ times as heavily as his, because it represents $4$ times as many subjects.
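
To make this concrete, here is a small numerical illustration with simulated data (assuming a common $\sigma$, as in the argument above): weighting each sub-sample mean by $N_i/\sigma^2$ reproduces the mean of the pooled observations exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_true = 1.0  # common "true" standard deviation assumed by the argument above

# Three simulated samples of different sizes from the same population
samples = [rng.normal(5.0, sigma_true, n) for n in (25, 100, 50)]

means = np.array([s.mean() for s in samples])
ns = np.array([len(s) for s in samples])

# The variance of each sample mean is sigma^2 / N_i, so its inverse-variance
# weight is N_i / sigma^2, i.e. proportional to the sample size
w = ns / sigma_true**2
weighted_mean = np.sum(w * means) / np.sum(w)

# With a common sigma this is exactly the mean of all observations pooled together
pooled_mean = np.concatenate(samples).mean()
print(weighted_mean, pooled_mean, np.isclose(weighted_mean, pooled_mean))
```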
