We need to take some care with the notation because the models differ.
Let the first (correct) model be
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon\tag{1}$$
where the $\varepsilon_i$ have a common variance and zero means; and write the second model (which governs the very same variables $Y$, so no need to change their name) as
$$Y = \alpha_0 + \alpha_1 X_1 + \delta.\tag{2}$$
As an aside, we cannot impose any additional assumptions on $\delta$, because these random variables are completely determined by equating the two right-hand sides (which, after all, equal the same thing):
$$\delta = (\beta_0 - \alpha_0) + (\beta_1 - \alpha_1)X_1 + \beta_2 X_2 + \varepsilon.$$
(From now on I will drop the generic discussion of models to focus on a dataset with explanatory values $x_{1i}$ and $x_{2i},$ responses $y_i,$ and associated errors $\varepsilon_i$ and $\delta_i.$)
We can infer, however, that the $\delta_i$ all have the same variance as the $\varepsilon_i$ and that their means are
$$E[\delta_i] = (\beta_0 - \alpha_0) + (\beta_1 - \alpha_1)x_{1i} + \beta_2 x_{2i},$$
which may vary among observations.
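This is easy to check numerically. The following sketch simulates model $(1)$, forms $\delta = Y - (\alpha_0 + \alpha_1 X_1)$ for some arbitrary candidate coefficients, and confirms that the empirical means of the $\delta_i$ match the formula above while their variances match $\operatorname{Var}(\varepsilon)$. All parameter values and the design are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values and design (assumed only for illustration).
beta0, beta1, beta2 = 1.0, 2.0, -0.5   # true model (1)
alpha0, alpha1 = 0.8, 1.5              # arbitrary coefficients for model (2)
x1 = rng.normal(size=5)
x2 = rng.normal(size=5)

n_rep = 200_000
eps = rng.normal(scale=1.0, size=(n_rep, 5))   # zero mean, common variance 1
y = beta0 + beta1 * x1 + beta2 * x2 + eps      # responses under model (1)
delta = y - (alpha0 + alpha1 * x1)             # errors implied by model (2)

# Empirical means of the delta_i match the displayed formula;
# their variances match Var(eps) = 1.
mean_formula = (beta0 - alpha0) + (beta1 - alpha1) * x1 + beta2 * x2
assert np.allclose(delta.mean(axis=0), mean_formula, atol=0.02)
assert np.allclose(delta.var(axis=0), 1.0, atol=0.02)
```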
Let's return to the analysis.
Fitting the second model gives the slope estimate
$$\hat\alpha_1 = \frac{\sum_{i} (y_i - \bar y)(x_{1i} - \bar{x}_1)}{\sum_{i} (x_{1i} - \bar{x}_1)^2}.\tag{*}$$
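As a quick sanity check, formula $(*)$ can be evaluated directly and compared against a standard least-squares routine; the data here are simulated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=50)                    # illustrative design
y = 3.0 + 2.0 * x1 + rng.normal(size=50)    # illustrative responses

# Slope estimate from (*): centered cross products over centered sum of squares.
alpha1_hat = ((y - y.mean()) * (x1 - x1.mean())).sum() / ((x1 - x1.mean())**2).sum()

# Agrees with a library least-squares fit of model (2).
slope, intercept = np.polyfit(x1, y, 1)
assert np.isclose(alpha1_hat, slope)
```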
This is a linear combination of the $y_i-\bar y,$ so use the zero-mean assumption about the $\varepsilon_i$ to compute
$$E[y_i - \bar y] = (\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}) -(\beta_0 + \beta_1 \bar{x}_1 + \beta_2 \bar{x}_2) = \beta_1(x_{1i}-\bar{x}_1) + \beta_2(x_{2i} - \bar{x}_2)$$
and apply linearity of expectation in $(*)$ to compute
$$E[\hat\alpha_1] = \beta_1 + \beta_2\frac{\sum_{i} (x_{2i}-\bar{x}_2)(x_{1i} - \bar{x}_1)}{\sum_{i} (x_{1i} - \bar{x}_1)^2}.$$
Comparing this with $\beta_1$ to assess the bias in using $\hat\alpha_1$ to estimate $\beta_1,$ we find it is unbiased if and only if the second term vanishes. This can happen in two ways:
- If $\beta_2 = 0.$ (This just means the second model is correct.)
- If $\sum_{i} (x_{2i}-\bar{x}_2)(x_{1i} - \bar{x}_1)=0.$ This means the sample covariance of the $x_1$ data and the $x_2$ data is zero: that is, the centered design vectors are orthogonal.
If neither of these is the case, the bias is nonzero. That agrees exactly with your intuition.
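A small Monte Carlo sketch makes the bias formula concrete: with a fixed design in which $x_2$ is correlated with $x_1$, the average of the short-regression slope over many error draws lands on $\beta_1 + \beta_2\,\sum c_{2i}c_{1i}/\sum c_{1i}^2$ rather than on $\beta_1$. All numbers are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Fixed design with correlated x1 and x2 (values assumed for illustration).
n = 40
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(scale=0.5, size=n)
beta0, beta1, beta2 = 1.0, 2.0, 1.5

# Predicted E[alpha1_hat] from the displayed formula.
c1 = x1 - x1.mean()
c2 = x2 - x2.mean()
predicted_mean = beta1 + beta2 * (c2 @ c1) / (c1 @ c1)

# Monte Carlo average of the short-regression slope (*) over many error draws.
n_rep = 100_000
eps = rng.normal(size=(n_rep, n))
y = beta0 + beta1 * x1 + beta2 * x2 + eps
alpha1_hat = ((y - y.mean(axis=1, keepdims=True)) * c1).sum(axis=1) / (c1 @ c1)
assert np.isclose(alpha1_hat.mean(), predicted_mean, atol=0.01)
```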
One caveat: zero covariance does not imply independence except in the multivariate normal case, and since you have not specified any distributions, we cannot assume multivariate normality. So technically, as stated, we cannot conclude that $\hat\beta_2$ is unbiased. However, if there is a way to show that $x_2$ and $\varepsilon$ are independent, I think you're in business. From the causal-diagram perspective, you would likely model independence as the absence of any backdoor path from $x_2$ to $y,$ which means that your model would produce an unbiased $\hat\beta_2.$ Incidentally, a nonzero covariance does not necessarily mean you have a problem with $x_1$ either. If $x_1$ causally influences $\varepsilon,$ then $\varepsilon$ is merely a mediator and there is no confounding. However, if $\varepsilon$ influences $x_1,$ then you have the backdoor path $x_1\leftarrow\varepsilon\rightarrow y,$ and finding the causal effect of $x_1$ on $y$ becomes more difficult, though I realize you're actually asking about $x_2.$
The Gauss-Markov theorem states that the covariance matrix of any linear unbiased estimator $\tilde{\beta} \ne \hat{\beta}_{OLS}$ exceeds that of $\hat{\beta}_{OLS}$ by a positive semidefinite matrix. Let's label the OLS covariance matrix $\Omega$ and the positive semidefinite matrix $D$. The variance of the sum of the OLS estimates can be written as $\iota' \Omega \iota$, where $\iota$ is a vector of ones of the appropriate length. For a non-OLS estimator, the variance of the sum is
$$\sigma^2_\Sigma = \iota' (\Omega + D) \iota = \iota' \Omega \iota + \iota' D \iota \geq \iota' \Omega \iota,$$
because $D$ is positive semidefinite. Therefore, the sum of the OLS parameter estimates is the minimum-variance linear unbiased estimator of the true sum of the parameters.
In fact, this applies to any weighted sum of the parameter estimates, not just the unweighted sum.
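The inequality is purely a property of quadratic forms, so it can be illustrated with any positive definite $\Omega$ and positive semidefinite $D$; the matrices below are generated at random only to make the point concrete:

```python
import numpy as np

rng = np.random.default_rng(3)
k = 4

# An arbitrary covariance matrix Omega and a PSD "excess" matrix D
# (both assumed for illustration; any PSD D gives the same conclusion).
A = rng.normal(size=(k, k))
Omega = A @ A.T + k * np.eye(k)   # positive definite
B = rng.normal(size=(k, 2))
D = B @ B.T                       # positive semidefinite by construction

# Variance of the sum under the competing estimator never falls below OLS.
iota = np.ones(k)
assert iota @ (Omega + D) @ iota >= iota @ Omega @ iota

# The same holds for any weight vector w, since w' D w >= 0.
w = rng.normal(size=k)
assert w @ (Omega + D) @ w >= w @ Omega @ w
```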