The "word" description is in fact the guide to a proof. The only problem is showing that the process you describe is in fact optimal. This is done by induction.
we want to prove the formulas you give are the optimal solution.
The case $n=0$ is immediate.
Now: to prove the $k+1$ case assuming the $k$th case (so our induction hypothesis is that the formula you give is optimal for $R_k$ and for $Q_k$), you take a stack of $k+1$ disks. To move them all forward (to perform $Q_{k+1}$) you must get the top $k$ out of the way; that is done, by the induction assumption, with $R_k$ as optimal. Then you move the $k+1$ disk to the target, taking one move; then you move the disks from where they are to the target. Again, by the induction assumption, the best possible way of doing this is $R_k$. So you get that $Q_{k+1}=R_k+1+R_k=2R_k+1$, on the assumption that $Q_k$ and $R_k$ are the optimal moves.
The same argument works for $R_{k+1}$; the only thing that your "just words" is missing is to be clear about what the induction hypothesis is, and to apply it.
Added: Why is the given strategy actually the best? In order to move all the disks from the original rod to the target rod, we must move the largest disk at least once; note that the position of the largest disk does not affect the validity or lack thereof of any move, so the position of the largest disk is immaterial for any other moves (as opposed to any other disk). So the optimal solution will only require us to move the largest disk exactly once, from the original rod to the target rod. In order to do this, we must first remove all disks from above the largest disk, and have no disks on the target rod so we can perform the move; that is $R_k$, because the only configuration that allows us to move the largest disk from the original rod to the target rod is to have all other disks in the non-target rod. Then we move the largest disk, which is the $+1$. And now, we just want to move the remaining disks to their final position in the quickest possible way, which again is $R_k$. A similar argument focusing on placing the largest disk in its final position, works for the computation of $R_{k+1}$.
Using Henry’s notation, the system of recurrences is
$$\left\{\begin{align*}
&m_{AC}(n+1)=2m_{AC}(n)+m_{CA}(n)+2\\
&m_{CA}(n+1)=m_{CB}(n)+m_{BA}(n)+1\\
&m_{AB}(n+1)=m_{AC}(n)+m_{CB}(n)+1\\
&m_{BA}(n+1)=m_{BC}(n)+m_{CA}(n)+1\\
&m_{BC}(n+1)=m_{BA}(n)+m_{AC}(n)+1\\
&m_{CB}(n+1)=m_{CA}(n)+m_{AB}(n)+1\;.
\end{align*}\right.$$
Then $$\begin{align*}
m_{CA}(n+1)&=m_{CB}(n)+m_{BA}(n)+1\\
&=2m_{CA}(n-1)+m_{AB}(n-1)+m_{BC}(n-1)+3\\
&=2m_{CA}(n-1)+2m_{AC}(n-2)+m_{CB}(n-2)+m_{BA}(n-2)+5\\
&=2m_{CA}(n-1)+2m_{AC}(n-2)+m_{CA}(n-1)+4\\
&=3m_{CA}(n-1)+2m_{AC}(n-2)+4\;.
\end{align*}$$
On the other hand, $m_{CA}(n)=m_{AC}(n+1)-2m_{AC}(n)-2$, so $$\begin{align*}
m_{CA}(n+1)&=m_{AC}(n+2)-2m_{AC}(n+1)-2\;,\\
m_{CA}(n-1)&=m_{AC}(n)-2m_{AC}(n-1)-2\;,
\end{align*}$$
and hence $$\begin{align*}
m_{AC}(n+2)-2m_{AC}(n+1)-2&=m_{CA}(n+1)\\
&=3m_{CA}(n-1)+2m_{AC}(n-2)+4\\
&=3m_{AC}(n)-6m_{AC}(n-1)+2m_{AC}(n-2)-2\;,
\end{align*}$$
or $$m_{AC}(n+2)=2m_{AC}(n+1)+3m_{AC}(n)-6m_{AC}(n-1)+2m_{AC}(n-2)\;.$$
Finally, shifting indices gives us $$m_{AC}(n)=2m_{AC}(n-1)+3m_{AC}(n-2)-6m_{AC}(n-3)+2m_{AC}(n-4)\;.$$
By direct computation the initial values are $m_{AC}(0)=0$, $m_{AC}(1)=2$, $m_{AC}(2)=7$, and $m_{AC}(3)=19$. The sequence continues $47, 113, 267, 629$ and is not in OEIS.
The auxiliary equation is $x^4-2x^3-3x^2+6x-2=0$, which reduces to $$(x-1)(x^3-x^2-4x+2)=0\;.$$ The cubic factor has three real roots; I didn’t feel like writing down the exact expressions for them, but this cubic equation calculator says that they are approximately $$\begin{align*}&2.34292308277717\;,\\
-&1.81360650264833\;,\text{ and}\\
&0.47068341987116\;.
\end{align*}$$
In the long run the first will dominate, and the sequence will grow by approximately that factor with each increase of one in the number of disks. For the record, $629/267\approx 2.3558$.
Best Answer
"Every algorithm" is quite a large set of algorithms. If you draw, even for three disks, the graph of possible states and the moves between them, you get a very complicated picture:
Here, I'm representing a state as a string of the location of the first, second, and third disk, so we are trying to get from
aaa
toccc
.An algorithm would just be any method of doing that. If you wanted the algorithm to be memoryless, in the sense that it always does the same thing from any state, then you could get the number of moves to $T_n = 3^n-1$, by taking the most indirect route possible, and visiting every single other state before we get from
aa...a
tocc...c
. If not, then you could make the algorithm take arbitrarily long, for example by following the rule "Move the first disk around $2^{2^n}$ times, then use the ordinary algorithm."In particular, it's not true that every algorithm obeys a recurrence of the form $T_n = 2T_{n-1} + c$. A general algorithm: