No. The minimum is always smaller than or equal to the arithmetic mean, and is strictly smaller with positive probability (i.e., when not all the $X_i$ have the same value). Hence its expected value is strictly smaller than that of the mean.
First some notation. Each example is drawn from some unknown distribution $Y$ with $E[Y] = \mu$ and $\textrm{Var}[Y] = \sigma^2$. Suppose the weighted mean consists of $n$ independent draws $X_i\sim Y$, and that $\{w_i\}_1^n$ lies in the standard simplex (i.e., $w_i \ge 0$ and $\sum_i w_i = 1$). Finally define the r.v. $X = \sum_i w_i X_i$. Note that $E[X] = \sum_i w_i E[X_i] = \mu$ and $\textrm{Var}[X] = \sum_i w_i^2 \textrm{Var}[X_i] = \sigma^2\sum_i w_i^2$.
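As a quick numerical sanity check of those two identities, here is a minimal sketch (assuming numpy; the normal distribution, the weights, and the sample count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([0.5, 0.3, 0.2])   # weights on the standard simplex
mu, sigma = 2.0, 3.0            # illustrative values for E[Y] and sqrt(Var[Y])

# many independent n-samples from Y (taken to be normal here), combined into X = sum_i w_i X_i
samples = rng.normal(mu, sigma, size=(1_000_000, len(w)))
X = samples @ w

print(X.mean(), mu)                      # both close to 2.0
print(X.var(), sigma**2 * np.sum(w**2))  # both close to 9 * 0.38 = 3.42
```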
Generalizing the standard definition of sample mean, take
$$
\hat \mu(\{x_i\}_1^n) := \sum_i w_i x_i.
$$
Note that $E[\hat \mu(\{x_i\}_1^n)] = \sum_i w_i E[x_i] = \mu = E[X]$, so $\hat \mu$ is an unbiased estimator.
Similarly, generalize the sample variance as
$$
\hat \sigma^2_b(\{x_i\}_1^n) := \sum_i w_i (x_i - \hat \mu(\{x_i\}_1^n))^2,
$$
where the subscript foreshadows that this will need a correction to be unbiased. Anyway,
$$
E[\hat \sigma^2_b] = \sum_i w_i E[(x_i - \hat \mu)^2] = \sum_i w_i E\left[\left(\sum_j w_j (x_i - x_j)\right)^2\right].
$$
(The second equality uses $x_i - \hat\mu = \sum_j w_j (x_i - x_j)$, which holds since $\sum_j w_j = 1$.) The term inside the expectation can be expanded as
$$
\sum_{j,k} w_j(x_i - x_j)w_k(x_i - x_k) = \sum_jw_j^2(x_i - x_j)^2 + \sum_{j\neq k} w_j w_k(x_i - x_j)(x_i - x_k).
$$
Passing the expectation inside, the expectation in the first sum (for $j \neq i$; the $j = i$ term vanishes) is
$$
E[(x_i-x_j)^2] = 2E[x_i^2] - 2\mu^2 = 2\sigma^2,
$$
whereas the expectation in the second sum (for $j, k \neq i$; terms with $j = i$ or $k = i$ vanish) is
$$
E[x_i^2 - x_ix_j - x_ix_k + x_jx_k] = E[x_i^2] - \mu^2 = \sigma^2.
$$
Combining everything,
$$
E[\hat \sigma_b^2] = \sum_i w_i \left(2\sigma^2\sum_{j\neq i}w_j^2 + \sigma^2\sum_{\substack{j\neq k\\ j,k\neq i}} w_j w_k\right)
= \sigma^2\sum_i w_i\left(1 - 2w_i + \sum_j w_j^2\right)
= \sigma^2\Big( 1 - \sum_j w_j^2\Big).
$$
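This intermediate result can also be checked by simulation; a minimal sketch (again assuming numpy, with an arbitrary illustrative distribution and weights):

```python
import numpy as np

rng = np.random.default_rng(1)
w = np.array([0.5, 0.3, 0.2])
mu, sigma = 2.0, 3.0

# one row per simulated n-sample
samples = rng.normal(mu, sigma, size=(1_000_000, len(w)))
mu_hat = samples @ w                            # weighted sample mean of each n-sample
sig2_b = (samples - mu_hat[:, None])**2 @ w     # biased weighted sample variance

print(sig2_b.mean())                  # ~ sigma^2 * (1 - sum w^2) = 9 * 0.62 = 5.58
print(sigma**2 * (1 - np.sum(w**2)))  # 5.58
```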
Therefore $E[\hat \sigma_b^2] - \sigma^2 = -\sigma^2\sum_j w_j^2$, i.e. this is a biased estimator. To make it an unbiased estimator of $\sigma^2 = \textrm{Var}[Y]$, divide by the factor $1 - \sum_j w_j^2$ derived above:
$$
\hat \sigma_u^2(\{x_i\}_1^n)
:= \frac {\hat \sigma_b^2(\{x_i\}_1^n)}{1- \sum_j w_j^2}
= \frac {\sum_i w_i(x_i - \hat \mu)^2}{1- \sum_j w_j^2 }
$$
This matches the definition you gave (and as a sanity check, $w_i = 1/n$ recovers the usual unbiased estimate $\frac{1}{n-1}\sum_i (x_i - \hat\mu)^2$).
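Here is that sanity check in code (a minimal sketch assuming numpy; the data values and the helper name `weighted_var_unbiased` are just illustrative):

```python
import numpy as np

def weighted_var_unbiased(x, w):
    """Unbiased weighted sample variance, assuming the weights sum to 1."""
    mu_hat = np.dot(w, x)
    return np.dot(w, (x - mu_hat)**2) / (1.0 - np.sum(w**2))

x = np.array([1.0, 4.0, 2.0, 8.0, 5.0])
n = len(x)
print(weighted_var_unbiased(x, np.full(n, 1.0 / n)))  # 7.5, same as ...
print(np.var(x, ddof=1))                              # ... the usual unbiased estimate: 7.5
```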
Now, if one instead were to seek an unbiased estimator of $\textrm{Var}[X]$, where $X=\sum_i w_i X_i$ as above, the formula would instead be $\hat \sigma_b^2(\{x_i\}_1^n)\left(\sum_j w_j^2\right) / \left( 1 - \sum_j w_j^2\right)$.
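The same kind of simulation as above (again just a sketch assuming numpy and arbitrary illustrative parameters) confirms that this rescaled estimator targets $\textrm{Var}[X] = \sigma^2\sum_j w_j^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
w = np.array([0.5, 0.3, 0.2])
mu, sigma = 2.0, 3.0
s2 = np.sum(w**2)

samples = rng.normal(mu, sigma, size=(1_000_000, len(w)))
mu_hat = samples @ w
sig2_b = (samples - mu_hat[:, None])**2 @ w   # biased weighted sample variance per n-sample

print((sig2_b * s2 / (1.0 - s2)).mean())  # ~ sigma^2 * sum w^2 = 3.42
print(sigma**2 * s2)                      # 3.42 = Var[X]
```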
I find it very odd that the documents you refer to construct estimators of the variance of $Y$ and not of $X$; I don't see the justification for such an estimator. Also it is not clear how to extend it to samples that don't have length $n$, whereas for the estimator of $\textrm{Var}[X]$ you simply have some number $m$ of $n$-samples, and averaging everything above makes things work out. Also, I haven't checked, but I suspect that the weighted estimator for $\textrm{Var}[Y]$ has higher variance than the usual one; if so, why use this weighted estimator at all? Building an estimator for $\textrm{Var}[X]$ would seem to have been the intent.
Best Answer
Hi. Rather long after your question, but it can be done directly in the same way Matus did it, or you can simply use the following:
Matus assumed weights $W_i$ which sum to $1$. Suppose you have weights $U_i$, and write $V_1 = \sum U_i$ and $V_2 = \sum U_i^2$, consistent with the Wikipedia entry for weighted sample variance. Then we can put $\displaystyle W_i = \frac{U_i}{V_1}$.
Now, look at the factor $\displaystyle \frac{1}{1 - \sum W_i^2}$, replace the $W_i$ with $\displaystyle\frac{U_i}{V_1}$, multiply top and bottom by $V_1^2$, and (voila!) you get $\displaystyle \frac{V_1^2}{V_1^2 - V_2}$.
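A quick numerical illustration of that algebra (a sketch assuming numpy; the unnormalized weights are arbitrary):

```python
import numpy as np

U = np.array([2.0, 5.0, 3.0])      # unnormalized weights
V1, V2 = U.sum(), np.sum(U**2)
W = U / V1                         # normalized weights, as in Matus's answer

print(1.0 / (1.0 - np.sum(W**2)))  # correction factor with normalized weights: ~1.6129
print(V1**2 / (V1**2 - V2))        # same factor via V1 and V2: 100/62 ~ 1.6129
```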
However, like Matus, I'm wondering when you would ever use such a "weighted sample variance"; see my question as a response to the original post.
I suspect there is much confusion over the different reasons for weighting.
Kathy