Your discrete stochastic process is defined as:
\begin{equation}
x_t = B_1 + B_2t + w_t~~~~~~~, ~~ w_t \sim WN(0,\sigma^2)
\end{equation}
Clearly it is not stationary since:
$$E[x_t] = B_1+B_2t$$
Now we consider the differentiated process of $x_t$, using the lag operator ($LY_t=Y_{t-1}$):
$$ \Delta Y_t = (1-L)Y_t = Y_t - Y_{t-1} $$
$$ = B_1 + B_2t + w_t - (B_1 + B_2(t-1) + w_{t-1})$$
$$ = B_1 + B_2t + w_t - B_1 - B_2t + B_2 - w_{t-1} $$
$$ \Delta Y_t = B_2 + w_t - w_{t-1} $$
Now it is clearly stationary since we have:
$$ E[\Delta Y_t] = B_2~~,~~VAR[\Delta Y_t]=2\sigma^2 $$
and the covariance depends on time lag only.
You are making the problem much harder than needed. A necessary condition that the process ${X_t}$ be stationary is that $X_t$ must have the same distribution as $X_{t-1}$. In particular, $X_t$ must have the same mean and variance as $X_{t-1}$. Now, in order for $E[X_t] = E[X_{t-1}]$ to hold, it must be that $$E[X_t] = aE[X_{t-1}] + E[Z_t] = aE[X_{t-1}] \implies
E[X_t] = 0 ~\forall t.$$
Similarly, in order for
$\operatorname{var}(X_t) = \operatorname{var}(X_{t-1})$ to hold, it must be that
$$\operatorname{var}(X_t) = a^2 \operatorname{var}(X_{t-1})
+ \operatorname{var}(Z_t) = a^2 \operatorname{var}(X_t) + \sigma^2
\implies \operatorname{var}(X_t) = \frac{\sigma^2}{1-a^2}.$$
Now, the distribution of $X_t$ is the convolution of the distributions of $aX_{t-1}$ and $Z_t$, and if $X_{t-1}$ and $Z_t$ are (independent) normal random variables, so is $X_t$ a normal random variable.
With regard to strict stationarity when the $X$'s and $Z$'s are normal random variables, see the last two paragraphs of this answer of mine over on dsp.SE.
Best Answer
What you have called the "covariance of the process" is often called the autocovariance function $C_X(i,j) = \operatorname{cov}(X[i],X[j])$ (to distinguish it from the crosscovariance function $C_{X,Z}(i,j) = \operatorname{cov}(X[i],Z[j])$ which considers the covariance of variables drawn from two different processes, which you will also need in this problem). In this instance, use of the bilinearity property of the covariance operator allows us to expand out $C_X(n,n+m)$, which is just
$$ \operatorname{cov}\left(\sum_{i=0}^1a_iZ[n-i]+ b_iX[n-i-1], \sum_{i=0}^1a_iZ[n+m-i]+ b_iX[n+m-i+1]\right),$$
into a sum of $16$ autocovariance and crosscovariance functions. Is the sum a function of just $m$, the difference between the arguments? You will see that it is not sufficient to assume that $X$ and $Z$ are WSS processes, but that it is also necessary to consider whether $X$ and $Z$ are jointly WSS processes.