This is a computational exercise, so think recursively. The current state of coin flipping is determined by the ordered pair $(N,M)$ with $N\ge M\ge 0$. Let the expected number of flips to reach $N$ consecutive heads be $e(N,M)$:
(1) There is a 50% chance the next flip will be heads, taking you to the state $(N,M+1)$, and a 50% chance the next flip will be tails, taking you to the state $(N,0)$. This costs one flip. Therefore the expectation (recursively) is given by
$$e(N,M) = \frac{1}{2} e(N,M+1) + \frac{1}{2} e(N,0) + 1.$$
(2) Base conditions: you have already stipulated that
$$e(N,0) = 2^{N+1}-2$$
and obviously
$$e(N,N) = 0$$
(no more flips are needed).
Here's the corresponding Mathematica program (including caching of intermediate results to speed up the recursion, which effectively makes it a dynamic programming solution):
e[n_, m_] /; n > m > 0 := e[n, m] = 1 + (e[n, m + 1] + e[n, 0])/2 // Simplify
e[n_, 0] := 2^(n + 1) - 2
e[n_, n_] := 0
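For comparison (not part of the original answer), the same memoized recursion in Python, with `functools.lru_cache` standing in for the `e[n, m] = ...` caching idiom:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def e(n, m):
    """Expected number of flips to reach n consecutive heads,
    starting from a current run of m consecutive heads."""
    if m == n:
        return 0                     # target reached: no more flips needed
    if m == 0:
        return 2**(n + 1) - 2        # stipulated base condition e(N,0)
    # one flip, then heads (state m+1) or tails (back to state 0)
    return 1 + (e(n, m + 1) + e(n, 0)) / 2

print(e(3, 1))  # 12.0, matching the closed form 2^4 - 2^2 = 12
```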
The program would look similar in other programming languages that support recursion. Mathematically, we can verify that $e(N,M) = 2^{N+1} - 2^{M+1}$ simply by checking the recursion, because it obviously holds for the base cases:
$$2^{N+1} - 2^{M+1} = 1 + (2^{N+1} - 2^{M+2} + 2^{N+1} - 2)/2,$$
which is true for any $M$ and $N$, QED.
More generally, the same approach will establish that $e(N,M) = \frac{p^{-N} - p^{-M}}{1-p}$ when the coin has probability $p$ of heads. The hard part is working out the base condition $e(N,0)$. That is done by chasing the recursion out $N$ steps until finally $e(N,0)$ is expressed in terms of itself and solving:
$$\eqalign{
e(N,0) &= 1 + p e(N,1) + (1-p) e(N,0) \\
&= 1 + p\left(1 + p e(N,2) + (1-p) e(N,0)\right) + (1-p) e(N,0) \\
\cdots \\
&= 1 + p + p^2 + \cdots + p^{N-1} + (1-p)[1 + p + \cdots + p^{N-1}]e(N,0);\\
e(N,0) &= \frac{1-p^N}{1-p} + (1-p^N)e(N,0); \\
e(N,0) &= \frac{p^{-N}-1}{1-p}.
}$$
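As a quick sanity check (a Python snippet of mine, not from the original answer), exact rational arithmetic confirms that this closed form satisfies the self-referential equation above, and that it reduces to $2^{N+1}-2$ when $p = 1/2$:

```python
from fractions import Fraction

def e0(N, p):
    """Closed form e(N,0) = (p^-N - 1) / (1 - p)."""
    return (p**-N - 1) / (1 - p)

for N in range(1, 8):
    for p in (Fraction(1, 2), Fraction(1, 3), Fraction(3, 5)):
        # the equation e(N,0) = (1 - p^N)/(1-p) + (1 - p^N) e(N,0)
        assert e0(N, p) == (1 - p**N) / (1 - p) + (1 - p**N) * e0(N, p)
    # for a fair coin this is the stipulated 2^(N+1) - 2
    assert e0(N, Fraction(1, 2)) == 2**(N + 1) - 2
```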
Add up the probabilities of the coin coming up heads for the first time on toss 1, 3, 5...
$p_o = 1/2 + 1/2^3 + 1/2^5 + ...$
The $1/2$ term is pretty obvious: it's the probability of the first toss being heads.
The $1/2^3$ term is the probability of getting heads for the first time on the third toss, or the sequence TTH. That sequence has a probability of $1/2 * 1/2 * 1/2$.
The $1/2^5$ term is the probability of getting heads for the first time on the fifth toss, or the sequence TTTTH. That sequence has a probability of $1/2 * 1/2 * 1/2 * 1/2 * 1/2$.
Now we can rewrite the series above as
$p_o = 1/2 + 1/8 + 1/32 + ...$
This is a geometric series that sums to $2/3$. The easiest way to show this is with a visual example. Start with the series
$p = 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + ...$
This is a geometric series that sums to $1$.
If we sum just the even-numbered terms of that series, we can see that they sum to $1/3$:
$1/4 + 1/16 + 1/64 + 1/256 + ... = 1/3$
If you remove these even-numbered terms from the full series, you're left with just the odd-numbered terms, which must therefore add up to $2/3$.
$p_o = 1/2 + 1/8 + 1/32 + ... = 2/3$
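A partial-sum check (a hypothetical Python snippet, not part of the original answer) confirms the value: the series has first term $1/2$ and ratio $1/4$, so it sums to $(1/2)/(1 - 1/4) = 2/3$.

```python
# Partial sum of p_o = 1/2 + 1/8 + 1/32 + ... = sum of 1/2^(2k+1)
p_o = sum(1 / 2**(2 * k + 1) for k in range(40))
print(p_o)  # close to 2/3
```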
This can be answered using the geometric distribution as follows:
The probability of k - 1 failures (tails) before the first success (heads), where each toss has probability of success p, is given by:
$$p(X=k)=(1-p)^{k-1}p$$
with k being the total number of tosses including the first 'heads' that terminates the experiment.
And the expected value of X is $1/p$, which for a fair coin ($p = 1/2$) equals $2$.
The derivation of the expected value is the standard one for the geometric distribution. The last steps, usually left implicit, are as follows:
$\frac{d}{dr} \frac{1}{1-r} = \frac{1}{(1-r)^2}$ to be plugged into the expression:
$E(X) = \frac{p}{1-p} \sum\limits_{x = 1}^{\infty}x\ r^x = \frac{p}{1-p}\ r\ (\frac{d}{dr} \frac{1}{1-r})= \frac{p}{1-p}\ r\ \frac{1}{(1-r)^2}$. With $r = 1 - p$, it simplifies to
$E(X) = \frac{1}{p}$, justifying its use above.
Alternatively, we could use the negative binomial distribution, interpreted as the number of failures before the first success. Its probability mass function gives the probability of $n$ failures before attaining $r$ successes, given a probability $p$ of success in each Bernoulli trial:
$$p(n;r,p) ={n+r-1\choose r-1} p^r (1-p)^n$$
The expected total number of trials, $n + r$, is given by the general formula:
$$E(n+r) = \frac{r}{p}$$
Given our known parameters, $r = 1$ and $p = 0.5$:
$$E(n + r) = \frac{r}{p} = \frac{1}{0.5} = 2$$
Hence we can expect to make two tosses to get the first head, with the expected number of tails being $E(n+r) - r = 1$.
We can run a Monte Carlo simulation to prove it:
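The simulation code itself wasn't included above; a minimal Python sketch (the seed and trial count are arbitrary choices of mine) might look like:

```python
import random

def tosses_until_first_heads(p, rng):
    """Number of tosses up to and including the first heads."""
    count = 1
    while rng.random() >= p:  # tails: keep tossing
        count += 1
    return count

rng = random.Random(42)
trials = 200_000
mean = sum(tosses_until_first_heads(0.5, rng) for _ in range(trials)) / trials
print(round(mean, 2))  # should be close to 2.0
```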