The O-U process with a delta initial condition is not stationary in this sense. But that's the wrong initial condition.
The O-U process is a Markov process which admits a stationary distribution, so if you want a stationary process, you should start it in the stationary distribution. Here it's a Gaussian distribution, and it isn't hard to work out what the mean and variance of that distribution ought to be.
This corresponds to a time-independent solution of the Fokker-Planck equation, which you can easily verify is of the form
$p(v) \propto e^{-v^2/(2\sigma^2)}$, and you can work out the right value for $\sigma$ in terms of your parameters.
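As a rough numerical illustration of the point about initial conditions, here is a small sketch. It assumes the unit-noise normalization $dX_t = -\alpha X_t \, dt + dB_t$ (your parameters may differ), for which the stationary variance works out to $\sigma^2 = 1/(2\alpha)$. The function name `simulate_ou_variance` and all default values are made up for this example. Paths are started in $N(0, \sigma^2)$ and advanced with the exact Gaussian one-step transition of the O-U process, and the sample variance is compared at the start and at the end.

```python
import math
import random


def simulate_ou_variance(alpha=1.5, dt=0.1, n_steps=20, n_paths=20000, seed=0):
    """Sample variance at time 0 and at time n_steps*dt for O-U paths
    started in the (assumed) stationary law N(0, 1/(2*alpha))."""
    rng = random.Random(seed)
    stat_var = 1.0 / (2.0 * alpha)          # stationary variance for unit noise
    decay = math.exp(-alpha * dt)           # one-step mean contraction factor
    # Exact one-step noise std: sqrt((1 - e^{-2 alpha dt}) / (2 alpha))
    step_std = math.sqrt(stat_var * (1.0 - decay ** 2))

    start = [rng.gauss(0.0, math.sqrt(stat_var)) for _ in range(n_paths)]
    end = []
    for x in start:
        for _ in range(n_steps):
            x = decay * x + step_std * rng.gauss(0.0, 1.0)
        end.append(x)

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((v - m) ** 2 for v in xs) / (len(xs) - 1)

    return variance(start), variance(end), stat_var


if __name__ == "__main__":
    v0, vT, target = simulate_ou_variance()
    print("variance at t=0:", v0)
    print("variance at t=T:", vT)
    print("1/(2*alpha):    ", target)
```

Both sample variances should stay close to $1/(2\alpha)$, whereas a delta initial condition would start at variance zero and only approach that value as $t \to \infty$.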
Somewhat embarrassingly I have never familiarized myself with the "forward/backward equation" nomenclature so I won't comment on that directly here. Instead, observe that if $(X_{t})_{t \geq 0}$ satisfies
\begin{equation*}
dX_{t} = - \alpha X_{t} dt + dB_{t},
\end{equation*}
then, given a nice function $f : \mathbb{R} \to \mathbb{R}$, the process $(U_{t})_{t \geq 0} = (f(X_{t}))_{t \geq 0}$ satisfies
\begin{equation*}
dU_{t} = f'(X_{t}) dB_{t} + \left(\frac{1}{2} f''(X_{t}) - \alpha X_{t} f'(X_{t})\right) dt.
\end{equation*}
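As a quick sanity check of Itô's formula here, take $f(x) = x^{2}$, so that $f'(x) = 2x$ and $f''(x) = 2$. This choice of $f$ is only an illustration. Then
\begin{equation*}
d(X_{t}^{2}) = 2 X_{t} \, dB_{t} + \left(1 - 2 \alpha X_{t}^{2}\right) dt.
\end{equation*}
Taking expectations, and assuming the stochastic integral is a true martingale so that it drops out, gives
\begin{equation*}
\frac{d}{dt} \mathbb{E}(X_{t}^{2}) = 1 - 2 \alpha \, \mathbb{E}(X_{t}^{2}),
\end{equation*}
whose fixed point is $\mathbb{E}(X_{t}^{2}) = \frac{1}{2\alpha}$ when $\alpha > 0$.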
Thus, taking expectations (the $dB_{t}$ term has mean zero), we deduce that the function $u(x,t) = \mathbb{E}^{x}(f(X_{t}))$ satisfies
\begin{equation*}
\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^{2}u}{\partial x^{2}} - \alpha x \frac{\partial u}{\partial x}.
\end{equation*}
Note that the adjoint of the generator $L = \frac{1}{2} \frac{\partial^{2}}{\partial x^{2}} - \alpha x \frac{\partial}{\partial x}$ (acting on $L^{2}(\mathbb{R},dx)$) is given by $L^{*}v = \frac{1}{2} \frac{\partial^{2}v}{\partial x^{2}} + \frac{\partial}{\partial x}(\alpha x v)$, as can be verified using integration by parts.
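For completeness, here is a sketch of that integration by parts, assuming $u$ and $v$ are smooth and decay fast enough at infinity that all boundary terms vanish:
\begin{align*}
\int_{\mathbb{R}} (Lu) \, v \, dx &= \int_{\mathbb{R}} \left(\frac{1}{2} \frac{\partial^{2} u}{\partial x^{2}} - \alpha x \frac{\partial u}{\partial x}\right) v \, dx \\
&= \int_{\mathbb{R}} u \, \frac{1}{2} \frac{\partial^{2} v}{\partial x^{2}} \, dx + \int_{\mathbb{R}} u \, \frac{\partial}{\partial x}(\alpha x v) \, dx \\
&= \int_{\mathbb{R}} u \, (L^{*}v) \, dx.
\end{align*}
The second-order term is integrated by parts twice (no sign change), and the first-order term once (one sign change).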
Now if $p$ is the transition kernel, then we know that $u(x,t) = \int_{\mathbb{R}} f(y) p(t,x,y) \, dy$. Therefore, one way to find $p$ is to solve for $u$ using the Fourier transform, which will give an integral for $u$ in terms of $f$. Also note that the equation solved by $u$ and the integral representation formula together imply, since $f$ is arbitrary, that $\frac{\partial}{\partial t} p(t,x,y) = \frac{1}{2} \frac{\partial^{2}}{\partial x^{2}} p(t,x,y) - \alpha x \frac{\partial}{\partial x} p(t,x,y)$.
Alternatively, suppose $\varphi$ is a nice function and $v$ solves the adjoint equation
\begin{equation*}
\left\{ \begin{array}{r l}
- \frac{\partial v}{\partial t} = L^{*}v & \text{in} \, \, \mathbb{R} \times (0,T) \\
v(x,T) = \varphi(x) & \text{for} \, \, x \in \mathbb{R}
\end{array} \right.
\end{equation*}
Observe that
\begin{align*}
\frac{d}{dt} \left\{ \int_{\mathbb{R}} u(x,t) v(x,t) \, dx \right\} &= \int_{\mathbb{R}} \left(\frac{\partial u}{\partial t} v + u \frac{\partial v}{\partial t} \right) \, dx \\
&= \int_{\mathbb{R}} \left( (Lu) \, v - u \, (L^{*}v) \right) \, dx \\
&= 0,
\end{align*}
and, thus,
\begin{equation*}
\int_{\mathbb{R}} u(x,T) \varphi(x) \, dx = \int_{\mathbb{R}} f(x) v(x,0) \, dx.
\end{equation*}
This implies
\begin{equation*}
\int_{\mathbb{R}}f(x) v(x,0) \, dx = \int_{\mathbb{R}} \int_{\mathbb{R}} f(y) p(T,x,y) \varphi(x) \, dx dy.
\end{equation*}
Since $f$ was arbitrary, we deduce $v(y,0) = \int_{\mathbb{R}} \varphi(x) p(T,x,y) \, dx$. Since there's nothing special about $T$, we see that $v(y,s) = \int_{\mathbb{R}} \varphi(x) p(T - s, x,y) \, dx$. Differentiating in $s$ turns $-\frac{\partial v}{\partial s}$ into $+\frac{\partial p}{\partial t}$ evaluated at time $T - s$. Since this is true independently of the choice of $\varphi$, we see that $p(t,x,y)$ solves
\begin{equation*}
\frac{\partial}{\partial t} p(t,x,y) = \frac{1}{2} \frac{\partial^{2}}{\partial y^{2}} p(t,x,y) + \frac{\partial}{\partial y}\left\{\alpha y p(t,x,y)\right\}.
\end{equation*}
Long story short, the equation you were solving for $p$ was off by a minus sign. If you wanted, you could have used the easier-to-remember equation with $L$ instead of the equation with $L^{*}$.
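If you want to check signs numerically, here is a minimal sketch. It assumes the standard closed form of the O-U transition kernel for $dX_t = -\alpha X_t \, dt + dB_t$, namely a Gaussian in $y$ with mean $x e^{-\alpha t}$ and variance $(1 - e^{-2\alpha t})/(2\alpha)$. Using central finite differences, it evaluates the residuals of $\partial_t p = L_x p$ (in the starting point $x$) and of $\partial_t p = L^{*}_y p$ (in the endpoint $y$). The names `kernel` and `pde_residuals`, the step size `h`, and the test point are all choices made for this example.

```python
import math


def kernel(t, x, y, alpha):
    """Assumed O-U transition density p(t, x, y) for unit noise."""
    mean = x * math.exp(-alpha * t)
    var = (1.0 - math.exp(-2.0 * alpha * t)) / (2.0 * alpha)
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def pde_residuals(t, x, y, alpha, h=1e-3):
    """Return (backward residual, forward residual, dp/dt) at one point,
    all computed with central finite differences of step h."""
    def p(t_, x_, y_):
        return kernel(t_, x_, y_, alpha)

    p0 = p(t, x, y)
    dt_p = (p(t + h, x, y) - p(t - h, x, y)) / (2.0 * h)

    # Backward equation in x: dp/dt = (1/2) p_xx - alpha * x * p_x
    dx_p = (p(t, x + h, y) - p(t, x - h, y)) / (2.0 * h)
    dxx_p = (p(t, x + h, y) - 2.0 * p0 + p(t, x - h, y)) / h ** 2
    backward = dt_p - (0.5 * dxx_p - alpha * x * dx_p)

    # Forward equation in y: dp/dt = (1/2) p_yy + d/dy (alpha * y * p)
    dyy_p = (p(t, x, y + h) - 2.0 * p0 + p(t, x, y - h)) / h ** 2
    dy_flux = alpha * ((y + h) * p(t, x, y + h) - (y - h) * p(t, x, y - h)) / (2.0 * h)
    forward = dt_p - (0.5 * dyy_p + dy_flux)

    return backward, forward, dt_p


if __name__ == "__main__":
    backward, forward, dt_p = pde_residuals(t=0.7, x=0.3, y=-0.4, alpha=1.2)
    print("backward residual:", backward)
    print("forward residual: ", forward)
    print("dp/dt:            ", dt_p)
```

Both residuals should be small (limited by the finite-difference error), while $\partial_t p$ itself is not small at this point, so flipping the sign of the time derivative in either equation would leave a residual of about $2 \, \partial_t p$.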
Best Answer
This link solves the first part of the question http://www.math.ku.dk/~susanne/StatDiff/Overheads1b.pdf