If you wanted to show only that the sample mean has a smaller variance than every other weighted average of the observations, then this would be an exercise in Lagrange multipliers. But if you want to include all unbiased estimators of $\mu$ based on $X_1,\ldots,X_n$ (for example, the sample median is one such estimator, and is not a weighted average of the observations), then the argument comes down to the one-to-one nature of the two-sided Laplace transform.
Observe that the conditional distribution of $(X_1,\ldots,X_n)$ given $\bar X = (X_1+\cdots+X_n)/n$ does not depend on $\mu$. (I could add the details of how to find the conditional distribution if necessary.) In other words, the sample mean $\bar X$ is a sufficient statistic for $\mu$. Therefore, the Rao–Blackwell theorem tells us that any minimum-variance unbiased estimator is to be found only among functions of $\bar X$.
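In case a sketch of why that holds is useful: assuming (as the density of $\bar X$ below indicates) that the $X_i$ are i.i.d. $N(\mu,1)$, the joint density factors as
$$
\prod_{i=1}^n \frac{1}{\sqrt{2\pi}}\exp\left(\frac{-(x_i-\mu)^2}{2}\right)
= \underbrace{(2\pi)^{-n/2}\exp\left(\frac{-1}{2}\sum_{i=1}^n x_i^2\right)}_{\text{free of }\mu}\cdot\underbrace{\exp\left(n\mu\bar x - \frac{n\mu^2}{2}\right)}_{\text{depends on the data only through }\bar x},
$$
so by the Fisher–Neyman factorization criterion $\bar X$ is sufficient, and conditioning on it removes all dependence on $\mu$.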
Therefore it is enough to show that the only function $g(\bar X)$ of $\bar X$ that is an unbiased estimator of $\mu$ is $\bar X$ itself (where, of course, the function $g$ is not allowed to depend on $\mu$; i.e. $g(\bar X)$ is genuinely a statistic).
The density function of $\bar X$ is
$$
x\mapsto \text{constant}\cdot \exp\left(\frac{-1}{2}\cdot\left(\frac{x-\mu}{1/\sqrt{n}}\right)^2\right).
$$
In order for $g(\bar X)$ to be an unbiased estimator of $\mu$, we need $g(\bar X)-\bar X$ to be an unbiased estimator of $0$. Let $h(x) = g(x)-x$; then we must have
$$
\int_{-\infty}^\infty (\text{same constant})\cdot h(x) \exp\left(\frac{-1}{2}\cdot\left(\frac{x-\mu}{1/\sqrt{n}}\right)^2\right) \, dx = 0
$$
for all values of $\mu$. Expanding the exponent as $-\frac{n}{2}(x-\mu)^2 = -\frac{n}{2}x^2 + nx\mu - \frac{n}{2}\mu^2$, this becomes
$$
\text{same constant}\cdot \exp\left(\frac{-n\mu^2}{2}\right) \cdot \int_{-\infty}^\infty \left(h(x) \exp\left(\frac{-n}{2} x^2\right)\right) \exp\left(nx\mu\right) \, dx = 0
$$
regardless of the value of $\mu$. Thus the two-sided Laplace transform of the function
$$
x\mapsto h(x)\exp\left( \frac{-nx^2}{2} \right)
$$
is $0$ at every point $n\mu$ (the nonzero factor in front of the integral can be divided out), i.e. it vanishes identically.
Since the two-sided Laplace transform is one-to-one, the only function it maps to the identically zero function is the zero function itself. Hence $h(x)=0$ (almost everywhere), so $g(x)=x$ and $g(\bar X)=\bar X$, as claimed.
"What am I doing wrong?" Nothing here. Your approach is correct: $\hat{\theta}_1$ is an unbiased estimator, for the reason you give (basically, it has to do with linearity of expectation). When $X_1,\dots,X_n$ are identically distributed and have finite expectation $\mu$,
$$
\mathbb{E}\!\left[\frac{1}{n}\sum_{k=1}^n X_k\right] = \frac{1}{n}\sum_{k=1}^n \mathbb{E}[X_k] = \frac{1}{n}\sum_{k=1}^n \mu = \mu
$$
For the same reason, $\hat{\theta}_2$ is unbiased as well:
$$
\mathbb{E}\!\left[\hat{\theta}_2\right] = \frac{2\mathbb{E}[X_1] - \mathbb{E}[X_6]+\mathbb{E}[X_4]}{2} = \frac{2\mu - \mu+\mu}{2} = \mu\ .
$$
Note that while both are unbiased estimators, the first one is better, as it has smaller variance: assuming the $X_i$ are independent, $\operatorname{Var}(\hat{\theta}_1)=\sigma^2/n$, whereas $\operatorname{Var}(\hat{\theta}_2)=\frac{4\sigma^2+\sigma^2+\sigma^2}{4}=\frac{3\sigma^2}{2}$. That is, while both expectations are the desired value $\mu$, the second estimator will be much less accurate (more subject to random fluctuations).
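To see the difference concretely, here is a minimal simulation sketch (the choice of distribution, the parameter values, and the sample size are purely illustrative assumptions, not part of the original problem):

```python
import numpy as np

# Illustrative assumptions: X_i i.i.d. Normal(mu=5, sigma=2), n=10 observations per trial.
rng = np.random.default_rng(0)
mu, sigma, n, trials = 5.0, 2.0, 10, 100_000

X = rng.normal(mu, sigma, size=(trials, n))

theta1 = X.mean(axis=1)                         # sample mean of all n observations
theta2 = (2 * X[:, 0] - X[:, 5] + X[:, 3]) / 2  # (2*X_1 - X_6 + X_4) / 2

# Both empirical means are close to mu (unbiasedness), but the variances differ a lot:
# Var(theta1) = sigma^2 / n = 0.4, while Var(theta2) = (4 + 1 + 1) * sigma^2 / 4 = 6.
print(theta1.mean(), theta1.var())
print(theta2.mean(), theta2.var())
```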
-- Attempt at addressing the larger question: suppose the goal is to estimate a quantity $\theta$ of the distribution, assuming you have access to identically distributed (and, most of the time, assumed to be independent) r.v.s/samples from it, $X_1,X_2,\dots$. You basically want to come up with a nice function $f$ or sequence of functions $(f_n)$ such that either
- $f(X_1,\dots, X_k)$ is with high probability close to $\theta$, or
- $f_n(X_1,\dots,X_n)$ converges (when $n$ grows) towards $\theta$, in some specific sense.
Here, requiring an unbiased estimator, you also want $\mathbb{E}[f_n(X_1,\dots, X_n)] = \theta$ (for all $n$), where the expectation is over the realizations of $X_1,X_2,\dots$. Basically, your estimator is itself a random variable, which depends on the $X_i$'s (being a function of them); as such, you can compute its expectation, variance, etc.
Now, for the case of estimating the expected value, things can seem a bit circular, since your estimator is basically a linear function of the $X_i$'s, and computing its expected value amounts to computing their expected value, which is exactly what you are trying to estimate. But it does make sense nonetheless (looking too simple to be true does not make it untrue); for instance, one could come up with other estimators of the expected value, say
$$
Y = \ln \frac{e^{X_1}+e^{X_2}+\dots+e^{X_n}}{n}.
$$
Computing the expected value of this new random variable is not as straightforward; yet it will eventually also boil down to using the fact that the expectation of each $X_i$ is $\mu$ (along with, possibly, other assumptions such as independence of the $X_i$'s).
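As a rough illustration of that point, here is a hedged Monte Carlo sketch (the choice of standard normal samples, so that $\mu = 0$, is purely an assumption for the example); it estimates $\mathbb{E}[Y]$ empirically rather than deriving it:

```python
import numpy as np

# Illustrative assumption: X_i i.i.d. standard normal, so the quantity to estimate is mu = 0.
rng = np.random.default_rng(1)
n, trials = 20, 200_000

X = rng.standard_normal(size=(trials, n))
Y = np.log(np.exp(X).mean(axis=1))  # Y = ln( (e^{X_1} + ... + e^{X_n}) / n )

# The empirical mean of Y comes out well above 0 (for large n it approaches ln E[e^{X_1}] = 1/2),
# so in this particular setup the alternative estimator is not unbiased for the mean.
print(Y.mean())
```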
Best Answer
It is unbiased if $\mathsf{E}(\hat{\sigma}^2)=\sigma^2$.
$$
\begin{align}
\mathsf{E}(\hat{\sigma}^2)=\mathsf{E}\left((X_1-X_2)^2\right)&=\mathsf{E}(X_1^2)+\mathsf{E}(X_2^2)-2\mathsf{E}(X_1X_2)\\
&=2(\sigma^2+\mu^2)-2\mu^2\\
&=2\sigma^2\neq\sigma^2,
\end{align}
$$
using $\mathsf{E}(X_i^2)=\operatorname{Var}(X_i)+\mathsf{E}(X_i)^2=\sigma^2+\mu^2$ and, by independence, $\mathsf{E}(X_1X_2)=\mathsf{E}(X_1)\mathsf{E}(X_2)=\mu^2$. So $\hat{\sigma}^2$ is not unbiased.
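A quick numerical sanity check of that computation (a sketch under the illustrative assumption of normal samples with $\mu = 3$ and $\sigma = 2$, so $\sigma^2 = 4$):

```python
import numpy as np

# Illustrative assumptions: X_1, X_2 i.i.d. with mu = 3, sigma = 2 (so sigma^2 = 4).
rng = np.random.default_rng(2)
mu, sigma, trials = 3.0, 2.0, 500_000

X1 = rng.normal(mu, sigma, trials)
X2 = rng.normal(mu, sigma, trials)

# The empirical mean of (X_1 - X_2)^2 comes out near 2 * sigma^2 = 8, not sigma^2 = 4,
# matching the calculation above: the estimator overshoots by a factor of 2.
print(((X1 - X2) ** 2).mean())
```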