Assume for simplicity that your model is defined by only one parameter $\theta$. The power is the function $\theta \mapsto \Pr(\text{reject } H_0 \mid \theta)$, which depends on the sample size $n$.
In retrospective power analysis, you simply plug in your estimate $\hat\theta$: you look at the value of the power function at $\theta=\hat\theta$, namely $\Pr(\text{reject } H_0 \mid \hat\theta)$, with the same sample size $n$. It answers the question: "what would be the probability of obtaining significant results if $\theta$ were $\hat\theta$?" As said in your text, this question is rather useless because there is a one-to-one correspondence between the $p$-value and the retrospective power $\Pr(\text{reject } H_0 \mid \hat\theta)$.
For instance, consider a binomial experiment with proportion parameter $\theta \in [0,1]$ and the hypothesis $H_0\colon\{\theta=0\}$. Obviously the power increases when $\theta$ increases, and obviously the $p$-value decreases when $\hat\theta$ increases. Consequently, the lower the $p$-value, the higher the RP (retrospective power). A couple of years ago I wrote some R code for the case of Fisher tests in classical Gaussian linear models. It is here. There's a script using simulations for the one-way ANOVA example and a script for the general model providing an exact calculation of RP as a function of the $p$-value and the design parameters. I called my function PAP()
because "Puissance a posteriori" is the French translation of RP and PAP is also an acronym for "Power Approach Paradox". The cause of the decreasing correspondence between $p$ and RP for Gaussian linear models is intuitively the same as for the binomial experiment: if $\theta$ is "far from $H_0$" then the power at $\theta$ is high, and if $\hat\theta$ is "far from $H_0$" then the $p$-value is small. Theoretically this is a consequence of the fact that the noncentral Fisher distributions are stochastically increasing in the noncentrality parameter (see this discussion about noncentral $F$ distributions in Gaussian linear models). In fact here the noncentrality parameter plays the role of $\theta$ (is it the so-called effect size ? I don't know).
I claimed "RS is rather useless because of the correspondence with $p$" because this decreasing correspondence with $p$ means that having a high RP is equivalent to having a small $p$, and vice-versa. But the more serious problem is the misinterpretation of RP; for instance, I have found such claims in the literature:
- $H_0$ is not rejected and RP is high, so the decision of the test is significant.
- $H_0$ is not rejected; this is not surprising because RP is low.
- $H_0$ is rejected (so the decision is significant) and RP is high, so the decision is even more significant.
Respectively replace "RP is high" and "RP is low" with "$p$ is low" and "$p$ is high" in the three claims above, and you will see that they are either useless, wrong, or puzzling.
From a more "philosophical" perspective, RP is useless because why would we mind about the probability that rejection of $H_0$ occurs once the experiment is done ?
See also here: a funny but clever online retrospective power calculator ;-)
The paragraph "A Posteriori Power Analysis" says nothing about the choice of $\theta$, but it emphasizes the main difference from retrospective power: here the goal is to use the information from your first experiment to evaluate the power of a future experiment, focusing on the sample size. A sensible approach is to consider your estimate $\hat\theta$ as a "guess" of the true $\theta$ while also accounting for the uncertainty about this estimate. There is a natural way to do so in Bayesian statistics, namely the predictive power, which consists in averaging the values of $\Pr(\text{reject } H_0 \mid \theta)$ over various values of $\theta$, according to some distribution (the posterior distribution in Bayesian terms) representing the knowledge and the uncertainty about $\theta$ resulting from your first experiment. In the frequentist framework you could instead evaluate the power at the bounds of your confidence interval for $\theta$.
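As a rough illustration of predictive power, here is a minimal R sketch for a binomial experiment; all the settings (the observed data, the flat Beta(1, 1) prior, the future one-sided test of $H_0\colon\{\theta \le 0.5\}$) are hypothetical choices made for the example.

```r
# Minimal sketch of predictive power for a binomial experiment (all
# settings hypothetical): first experiment gave x successes out of n;
# we plan a future exact test of H0: theta <= 0.5 at level alpha.
set.seed(1)
x <- 32; n <- 50                       # data from the first experiment
n_future <- 100; alpha <- 0.05
k_crit <- qbinom(1 - alpha, n_future, 0.5) + 1  # rejection threshold
power_at <- function(theta) 1 - pbinom(k_crit - 1, n_future, theta)
theta_draws <- rbeta(1e4, 1 + x, 1 + n - x)     # posterior draws of theta
mean(power_at(theta_draws))            # predictive power (posterior average)
power_at(x / n)                        # compare: plug-in power at theta-hat
power_at(binom.test(x, n)$conf.int)    # frequentist check: power at CI bounds
```

The last two lines contrast the posterior average with the plug-in value and with the power evaluated at the bounds of an exact confidence interval, mirroring the frequentist suggestion above.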
Because there will be many more error degrees of freedom, you should see an increase in the $A$ vs $B$ rejections as well as in the $A$ or $B$ vs $C_i$ rejections, because an observed difference of a given number of standard errors is much less likely to be due to noise in estimating the standard deviation.
For example, imagine that the common error variance is $\sigma^2=1$.
Then the distribution of the estimate of $\sigma^2$ is quite skewed (and spread out) when there are just the $A$ and $B$ groups, but as you add more $C$ groups you get a much stronger idea of the variance, and this will on average improve your ability to tell $A$ and $B$ apart:
[Figure: densities of the estimate of $\sigma^2$ as the number of groups increases. This assumes half the groups have 2 observations and half have 3 observations.]
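A minimal R sketch of that comparison (a reconstruction, not the original figure): when $\sigma^2 = 1$, the pooled variance estimate with $\nu$ error degrees of freedom is distributed as $\chi^2_\nu/\nu$, so the two extreme cases can be drawn directly.

```r
# Sampling density of the pooled variance estimate when sigma^2 = 1:
# with nu error df, sigma-hat^2 ~ chi^2_nu / nu.
curve(3 * dchisq(3 * x, df = 3), from = 0, to = 3, col = "darkgreen",
      xlab = expression(hat(sigma)^2), ylab = "density")  # A and B only: 3 error df
curve(150 * dchisq(150 * x, df = 150), add = TRUE, col = "blue")  # 98 C groups: 150 error df
legend("topright", legend = c("3 error df", "150 error df"),
       col = c("darkgreen", "blue"), lty = 1)
```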
That bulge in the left tail of the green density below 1 means you quite often get large $F$'s when $H_0$ is true (because you're often dividing by a small estimate of $\sigma^2$). As a result, you need a big $F$ to be confident that it's not just random variation.
That's why the 5% critical value for an F(2,3) (i.e. the A vs B alone comparison) is 9.55, while that for an F(2,150) (i.e. only considering A vs B with 98 "C" groups helping to determine $\sigma^2$) is 3.06.
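These critical values are easy to check in R:

```r
qf(0.95, df1 = 2, df2 = 3)    # 9.552 -- A vs B alone
qf(0.95, df1 = 2, df2 = 150)  # 3.056 -- with 98 C groups pinning down sigma^2
```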
That effect is part of why you don't need many observations per group.
You should further note that if the $C$ groups have a population mean intermediate between those of the $A$ and $B$ groups, then you will sometimes reject the null because of $B$-$C$ and $A$-$C$ differences. You seem to think that shouldn't happen, but that's simply untrue: it ought to happen (though much less often for any particular $A$-$C_i$ or $B$-$C_i$ comparison than for $A$-$B$).
Simulation is a useful tool for seeing which rejections occur more often as you add groups; a sketch follows below.
I imagine that with many groups and only a few observations per group, $A$ vs $B$ rejections will eventually become a relatively small proportion of the total rejections, but only the $C_j$ vs $C_k$ rejections are incorrect decisions.
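Here is a rough simulation sketch along those lines (all settings hypothetical: $A$ and $B$ differ by one standard deviation, every $C_j$ has the intermediate mean, and each group has 3 observations), counting which Tukey pairwise rejections occur:

```r
# Count which pairwise Tukey rejections occur as C groups are added
# (hypothetical settings: A and B differ by delta, all C_j intermediate).
set.seed(42)
sim_once <- function(n_c, delta = 1, n_per = 3) {
  means <- c(A = delta, B = 0,
             setNames(rep(delta / 2, n_c), paste0("C", seq_len(n_c))))
  g <- factor(rep(names(means), each = n_per))
  y <- rnorm(length(g), mean = means[as.character(g)])
  tk <- TukeyHSD(aov(y ~ g))$g
  rej <- rownames(tk)[tk[, "p adj"] < 0.05]
  c(A_vs_B   = "B-A" %in% rej,
    A_B_vs_C = any(grepl("^C[0-9]+-(A|B)$", rej)),
    C_vs_C   = any(grepl("^C[0-9]+-C[0-9]+$", rej)))
}
res <- replicate(200, sim_once(n_c = 10))
rowMeans(res)  # proportion of simulations showing each rejection type
```

Since all the $C_j$ share the same mean here, any $C_j$ vs $C_k$ rejection is a false positive, so the last proportion estimates the rate of incorrect decisions.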
Post hoc power analyses are at best useless and often misleading; read "The Abuse of Power" (Hoenig and Heisey, The American Statistician, vol. 55, issue 1, 2001) for more details.
What might be more useful is confidence intervals on your measures: they can tell you whether your original ideas could still be meaningful and whether the plausible differences are large enough to care about. These will be much more meaningful than a transformation of the $p$-value (which is all that the post hoc power is) that is usually interpreted wrongly.
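For instance, a minimal sketch in R (with simulated, hypothetical data):

```r
# Report the estimated difference with a confidence interval rather than
# a post hoc power figure (simulated data, names hypothetical).
set.seed(1)
d <- data.frame(group   = rep(c("control", "treatment"), each = 20),
                outcome = rnorm(40, mean = rep(c(0, 0.4), each = 20)))
fit <- lm(outcome ~ group, data = d)
confint(fit)["grouptreatment", ]  # plausible range for the treatment effect
```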