Note that by the Law of Total Probability,
$f_r(n)=P(T_r=n\mid X_0=0)=\sum_{k=0}^{\infty} P(T_r=n\mid T_1=k, X_0=0)\,P(T_1=k\mid X_0=0)$
Now, since this is a Markov process, we can say that
$P(T_r=n|T_1=k, X_0=0) = P(T_r=n|T_1=k)$
because the conditional distribution of $T_r$ depends only on the most recent information available (as you quoted in your definition of the Markov property). And since $k \geq 0$, conditioning on $T_1 = k$ always "takes precedence" over conditioning on $X_0$.
Plugging this in and making substitutions, you get the result.
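For instance, if the walk is also spatially homogeneous — an extra assumption beyond the Markov property, but true for the usual simple random walk — then $P(T_r=n\mid T_1=k)=f_{r-1}(n-k)$, and the sum collapses to the convolution
$$f_r(n)=\sum_{k=0}^{n} f_1(k)\, f_{r-1}(n-k).$$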
EDIT: To show that the processes $T_r$ are Markov, I will make use of the concept of a filtration. The Markov property can alternatively be stated as:
Let $X_t$ be a stochastic process with the strong Markov property. Then for all $s \leq t$,
$P(X_t|\mathcal{F}_s) = P(X_t|X_s)$
where $(\mathcal{F}_s)_{s \geq 0}$ is a filtration.
The formal definition of a filtration is an increasing family of sigma-algebras on the probability space in question, but what $\mathcal{F}_s$ really represents is all of the information known up to and including time $s$. This means that conditioning on a set of previous events is the same as conditioning on the most recent event in that set.
Now, why is it that $X_n$ being strong Markov implies that $T_r$ is also strong Markov? The intuitive explanation is that if you know $T_r=n$ for some state $r$ and time $n$, then you also know $X_n$. So knowing $X_m$, or anything else about the process at some $m<n$, does not add any new information.
We can prove this formally using the reflection principle. For this argument, I will assume that we are considering first arrivals at points greater than the starting point. The reflection principle in this context states that
$P(T_r=n \mid \mathcal{F}_t) = P\big( \sup_{s \leq n-1}X_{s} < r \wedge X_n = r \,\big|\, \mathcal{F}_t\big) = P( X_{n} < r-2 \mid \mathcal{F}_t)$
Now, since the last expression here is simply a statement about $X_n$, and the process $X_n$ is Markov, we know that $P( X_n < r-2\mid\mathcal{F}_t)=P( X_n < r-2\mid X_t = r^*)$, and thus $P(T_r=n \mid\mathcal{F}_t)=P(T_r=n \mid X_t=r^*)$.
In other words, conditioning $T_r$ on a set of previous information is the same as conditioning on the most recent position of the process. Note that if $T_r$ is already known (i.e. if the event $\{T_r = n\}$ is in $\mathcal{F}_t$), this argument is trivially true. Finally, we can say that
$P(T_r=n \mid T_{r^*}=t) = P(T_r=n \mid X_t=r^* \wedge X_s<r^* \ \forall\, s<t)=P(T_r=n \mid X_t=r^*)$. This is just a special case of what I just proved, which in English says that the knowledge that $X_t$ was a first arrival is irrelevant.
Altogether, we have
$P(T_r=n \mid\mathcal{F}_t)=P(T_r=n \mid T_{r^*}=t)$ for all $r^*$, i.e. the process is Markov. Or, more precisely, the set of processes is jointly Markov.
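If you want to see this renewal structure numerically, here is a small simulation sketch (my own code, not part of the argument above; for speed it uses an upward-biased walk with $p = 0.7$ so that the hitting times have finite mean). It checks that $T_2 - T_1$ has the same distribution as $T_1$, which is exactly what the strong Markov property gives:

```python
import random
from collections import Counter

def passage_times(p=0.7, n_walks=50_000, seed=0):
    """Simulate walks from 0 with up-probability p; record T_1 and T_2 - T_1."""
    rng = random.Random(seed)
    t1, gap = [], []
    for _ in range(n_walks):
        pos, t, hit1 = 0, 0, None
        while pos != 2:
            pos += 1 if rng.random() < p else -1
            t += 1
            if hit1 is None and pos == 1:
                hit1 = t          # first passage to level 1
        t1.append(hit1)
        gap.append(t - hit1)      # additional steps from level 1 to level 2
    return t1, gap

t1, gap = passage_times()
# T_1 and T_2 - T_1 should have (approximately) the same empirical distribution.
print(Counter(t1).most_common(5))
print(Counter(gap).most_common(5))
```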
(A) Find the Stationary Distribution
First, let's make some observations about this DTMC. If we draw it out, we can immediately see that it has period 2, and therefore it has no limiting distribution. A stationary distribution still exists (the chain is finite and irreducible), but because the powers of $P$ do not converge we can't use numerical tricks such as computing $P^{10000}$ to find it, so we will need to find it the old-fashioned way.
Following the method whuber suggests, let's start with small $k$ and look for patterns. Starting with $k = 1$, the stationary distribution is trivial:
$$\pi = \left[ \frac{1}{2} \quad \frac{1}{2}\right]$$
Moving on to $k = 2$, we can solve $\pi P = \pi$ by row-reducing the augmented system below: the first two rows are balance equations from $(P^{\mathsf T} - I)\,\pi^{\mathsf T} = 0$ (the third such equation is redundant), and the last row encodes the constraint $\sum_{i=0}^{k} \pi_i = 1$,
$$\left[\begin{array}{ccc|c}
-1 & \frac{1}{2} & 0 & 0\\
1 & -1 & 1 & 0\\
1 & 1 & 1 & 1
\end{array}\right] \implies \pi = \left[\frac{1}{4} \quad \frac{1}{2} \quad \frac{1}{4}\right]$$
Moving on to $k = 3$, solving a similar matrix leads to
$$\pi = \left[\frac{1}{6} \quad \frac{1}{3} \quad \frac{1}{3} \quad \frac{1}{6}\right]$$
and $k = 4$ gives
$$\pi = \left[\frac{1}{8} \quad \frac{1}{4} \quad \frac{1}{4} \quad \frac{1}{4} \quad \frac{1}{8} \right]$$
At this point, a pattern has clearly formed. The stationary distribution can be summarized by
$$\pi_i = \begin{cases}
\frac{1}{2k}, & \text{for } i = 0, k\\
\frac{1}{k}, & \text{for } 0 < i < k
\end{cases}$$
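As a numerical sanity check of this pattern, here is a short sketch (my own code; it assumes the chain is the reflecting simple random walk on $\{0,\dots,k\}$ with $p_{0,1}=p_{k,k-1}=1$ and $p_{i,i\pm1}=\frac{1}{2}$ otherwise, which is the chain the small cases above came from):

```python
import numpy as np

def stationary(k):
    """Stationary distribution of the reflecting simple random walk on {0, ..., k}."""
    P = np.zeros((k + 1, k + 1))
    P[0, 1] = P[k, k - 1] = 1.0            # reflecting boundaries
    for i in range(1, k):
        P[i, i - 1] = P[i, i + 1] = 0.5    # interior steps: +/-1 with probability 1/2
    # Solve pi P = pi together with sum(pi) = 1 as one least-squares system.
    A = np.vstack([P.T - np.eye(k + 1), np.ones(k + 1)])
    b = np.append(np.zeros(k + 1), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def claimed(k):
    """The pattern above: 1/(2k) at the two endpoints, 1/k everywhere in between."""
    return np.array([1 / (2 * k)] + [1 / k] * (k - 1) + [1 / (2 * k)])

for k in (1, 2, 3, 4, 7):
    assert np.allclose(stationary(k), claimed(k))
    print(k, np.round(stationary(k), 4))
```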
This can be tested with the detailed balance (time-reversibility) equations, where we must show
$$\pi_i p_{i,j} = \pi_j p_{j,i} \quad \forall \quad i,j$$
The equations for the interior states are immediate (both sides equal $\frac{1}{k}\cdot\frac{1}{2}$), so we will show specifically that $\pi_0 p_{0,1} = \pi_1 p_{1,0}$; the other boundary equation, $\pi_{k-1} p_{k-1,k} = \pi_k p_{k,k-1}$, follows identically by symmetry.
\begin{align*}
\pi_0 p_{0,1} &= \frac{1}{2k} \cdot 1\\
&= \frac{1}{2} \cdot \frac{1}{k}\\
&= \frac{1}{k} \cdot \frac{1}{2}\\
&= \pi_1 p_{1,0}
\end{align*}
and since this solution satisfies the detailed balance equations, it is indeed the stationary distribution (detailed balance implies $\pi P = \pi$).
(B) Expected Number of Steps to Return to $P_0$
If you haven't seen this before: visits to a state occur with long-run frequency $\pi_0$, so the expected time between visits is $1/\pi_0$ (the same logic as the mean of a geometric distribution), and the problem reduces simply to $$E[\text{Return to }P_0] = \dfrac{1}{\pi_0} = \dfrac{1}{\frac{1}{2000}} = 2000$$
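If you want to check the $1/\pi_0$ formula empirically, here is a quick simulation sketch (my own code; it uses a small chain, $k = 10$, purely so it runs fast, and for that chain the same formula predicts an expected return time of $2k = 20$):

```python
import random

def return_time_to_zero(k, rng):
    """Steps until the reflecting walk on {0, ..., k}, started at 0, revisits 0."""
    pos, steps = 0, 0
    while True:
        if pos == 0:
            pos = 1                        # p_{0,1} = 1
        elif pos == k:
            pos = k - 1                    # p_{k,k-1} = 1
        else:
            pos += rng.choice((-1, 1))     # interior: +/-1 with probability 1/2
        steps += 1
        if pos == 0:
            return steps

rng = random.Random(0)
k, n = 10, 50_000
print(sum(return_time_to_zero(k, rng) for _ in range(n)) / n)   # close to 2k = 20
```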
Let me know if that made sense or if you have any questions.
Best Answer
Yes, you can simulate it using a big transition matrix: iterate it and observe the cumulative probability of jumping back to the corner, which is treated as an absorbing state. This is a Markov walk on a graph with 64 nodes and many edges, where each edge represents an allowed knight move on the board.
The average is surprisingly high: 168. On average, that is how many steps the walk started in the corner takes to return there — the expected length of one "cycle" from the corner.
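As a rough sketch of that idea (my own indexing and loop bound, not necessarily the posted solution below): build the $64\times 64$ transition matrix of the random knight walk, take one step out of the corner, then repeatedly push the surviving probability mass forward while treating the corner as absorbing, accumulating step number times newly absorbed mass.

```python
import numpy as np

# Knight-move offsets; squares are indexed 0..63 as 8*row + col.
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_neighbors(sq):
    r, c = divmod(sq, 8)
    return [8 * (r + dr) + (c + dc) for dr, dc in MOVES
            if 0 <= r + dr < 8 and 0 <= c + dc < 8]

# Transition matrix of the random walk: each legal move is equally likely.
P = np.zeros((64, 64))
for sq in range(64):
    nbrs = knight_neighbors(sq)
    for nb in nbrs:
        P[sq, nb] = 1 / len(nbrs)

corner = 0
dist = P[corner].copy()                # distribution after the first step out of the corner
expected = 0.0
for step in range(1, 20_000):          # remaining mass beyond this horizon is negligible
    expected += step * dist[corner]    # probability of first return at exactly this step
    dist[corner] = 0.0                 # absorb it
    dist = dist @ P                    # advance the surviving mass one more step
print(round(expected, 2))              # about 168
```

The exact value also falls straight out of the stationary distribution of a random walk on a graph: the knight graph on an $8\times 8$ board has 168 edges and the corner has degree 2, so $\pi_{\text{corner}} = 2/336$ and the expected return time is $336/2 = 168$.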
Here's the solution in Python: