Yes, this is possible and even fairly easy, but additional information is required. Specifically, we have to make an assumption about the correlation between the observations within each pair.
The effect size expressed as a difference in standard deviation units is usually referred to as $d$. We can apply a correction factor to $d$ that incorporates the assumed correlation, and then use our standard power formulae with this corrected $d$ (minding as well the change in degrees of freedom that comes with moving to the paired design) to compute power. The corrected $d$ is
$$
d_o = \frac{d}{\sqrt{1-r}},
$$
where $r$ is the correlation. I have called this $d_o$ because this is sometimes referred to as the "operative effect size."
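For example, with $d = 2$ and an assumed correlation of $r = .5$, the operative effect size is $d_o = 2/\sqrt{1-.5} \approx 2.83$.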
Here is a little R routine that computes a table of the minimum number of PAIRS as a function of the assumed correlation and the desired power level, with $d=2$ assumed.
library(pwr) # package for pwr.t.test() function
# may need to install first with install.packages()
# define a function to get the minimum number of pairs
# for a given correlation and desired power level
getN <- function(r, p){
  unlist(mapply(pwr.t.test, d = 2/sqrt(1-r), power = p,
                MoreArgs = list(n = NULL, sig.level = .05, type = "paired"))["n",])
}
# apply this function to all combinations of the parameters below
tab <- outer(seq(0,.95,.05), c(.7,.8,.9,.95,.99,.999), "getN")
dimnames(tab) <- list("Correlation" = seq(0,.95,.05),
                      "DesiredPower" = c(.7,.8,.9,.95,.99,.999))
tab
This returns the following:
DesiredPower
Correlation 0.7 0.8 0.9 0.95 0.99 0.999
0 3.767546 4.220731 4.912411 5.544223 6.888820 8.656788
0.05 3.691858 4.126240 4.787326 5.389850 6.669683 8.350091
0.1 3.615930 4.031562 4.662220 5.235637 6.451021 8.044096
0.15 3.539645 3.936653 4.537050 5.081483 6.232774 7.738792
0.2 3.462940 3.841433 4.411750 4.927417 6.014903 7.434270
0.25 3.385708 3.745774 4.286234 4.773338 5.797404 7.130529
0.3 3.307922 3.649640 4.160447 4.619143 5.580267 6.827580
0.35 3.229382 3.552889 4.034209 4.464751 5.363362 6.525430
0.4 3.149970 3.455310 3.907393 4.309986 5.146613 6.224026
0.45 3.069435 3.356743 3.779777 4.154653 4.929824 5.923282
0.5 2.987581 3.256903 3.651065 3.998456 4.712773 5.623032
0.55 2.904079 3.155423 3.520841 3.841020 4.495111 5.323066
0.6 2.818472 3.051834 3.388672 3.681805 4.276260 5.022875
0.65 2.730145 2.945449 3.253781 3.520048 4.055501 4.721751
0.7 2.638237 2.835369 3.115118 3.354639 3.831565 4.418560
0.75 2.541442 2.720152 2.971074 3.183823 3.602697 4.111397
0.8 2.437713 2.597460 2.819127 3.004879 3.365682 3.796879
0.85 2.323340 2.463226 2.654597 2.812710 3.114890 3.468745
0.9 2.190677 2.309002 2.467897 2.596901 2.838233 3.113596
0.95 2.018024 2.110699 2.231866 2.327720 2.501567 2.692358
Note that $d=2$ is considered quite a large effect size in many fields, so the resulting minimum numbers of pairs are all quite low.
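If you expect a more modest effect, the same machinery applies. Here is a one-off sketch with a hypothetical $d = .5$ and an assumed correlation of $r = .5$ (both numbers are placeholders, not recommendations):
# one-off power calculation with a smaller, hypothetical effect size
pwr.t.test(d = .5/sqrt(1-.5), power = .8, sig.level = .05, type = "paired")
Under these made-up inputs, the required sample comes out to roughly 18 pairs.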
To answer my own question (community wiki, feel free to adjust), after giving it some more thought and doing a bit of research:
Q: Is there a difference in the way the test statistics are calculated?
A: No. I suppose @Glen_b was hinting in that direction. The t-statistic remains the same, as do the degrees of freedom.
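A quick check with made-up numbers confirms the equivalence; t.test() reports an identical statistic and df whether you call the paired form or run a one-sample test on the differences:
# made-up paired data, only to show the two calls agree
set.seed(1)
x <- rnorm(10, mean = 5)
z <- x + rnorm(10, mean = .5)
t.test(x, z, paired = TRUE)  # paired form
t.test(x - z)                # one-sample form on the differences
# both calls report the same t-statistic and df = 9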
Q: Is the latter approach wrong?
A: No, the former is wrong. As explained here (http://www.biostathandbook.com/pairedttest.html), the paired t-test assumes that the differences between pairs are normally distributed. Performing the test without checking the normality of the differences is wrong, regardless of the normality of X or Z (the constituents of the difference).
Q: In the proposed situation, should I always favour option (1) over option (2)?
A: Either option works, but normality of the differences needs to be verified in either case. You might as well do the one-sample t-test, because you need to calculate the differences anyhow.
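Continuing with the made-up x and z from the snippet above, a minimal sketch of that normality check could look like this:
# check normality of the differences, not of x or z themselves
d.xz <- x - z
qqnorm(d.xz); qqline(d.xz)  # visual check of the differences
shapiro.test(d.xz)          # formal test (keep its small-sample limits in mind)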
Best Answer
It would be bad -- very bad. If you ignore the pairing and use a two-sample t-test where you should have used the paired test, chances are you will not be able to detect the effect of interest. In that case, between-subject variation is included in the estimate of variance used to measure the effect of interest. The variance is inflated, and only a substantial effect size will be deemed significant. Example: students are given a math test, followed by an instruction module, and then another test. If you ignore the pairing (before and after for each student), differences in ability are added to the effect of the module, making the effect harder to detect.
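A small simulation with invented parameters illustrates the point: both tests see the same data, but only the paired test removes the stable between-subject differences.
# simulate correlated before/after scores; all parameters are made up
set.seed(42)
nsim <- 2000; n <- 20; gain <- .5
p.paired <- p.twosample <- numeric(nsim)
for (i in 1:nsim) {
  ability <- rnorm(n)                    # stable between-subject differences
  before  <- ability + rnorm(n, sd = .5)
  after   <- ability + gain + rnorm(n, sd = .5)
  p.paired[i]    <- t.test(before, after, paired = TRUE)$p.value
  p.twosample[i] <- t.test(before, after)$p.value
}
mean(p.paired < .05)     # power of the (correct) paired test
mean(p.twosample < .05)  # power of the (wrong) two-sample test
With these inputs, the paired test detects the gain far more often than the two-sample test does.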
On the other hand, if you foist pairing onto an unpaired sample, you reduce the degrees of freedom of the test. You are adding variables to the model (the pairing effect) that are essentially random. In effect, you are adding a parameter for each pair, but these parameters mean nothing; you would just be adding noise.
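To see the degrees-of-freedom cost concretely, here is a toy comparison with genuinely independent, made-up samples:
# pairing independent data here cuts the degrees of freedom in half
set.seed(7)
a <- rnorm(15); b <- rnorm(15)
t.test(a, b, var.equal = TRUE)$parameter  # pooled two-sample: df = 28
t.test(a, b, paired = TRUE)$parameter     # paired: df = 14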