There can be no single state-of-the-art for goodness of fit (for example no UMP test across general alternatives will exist, and really nothing even comes close -- even highly regarded omnibus tests have terrible power in some situations).
In general when selecting a test statistic you choose the kinds of deviation that it's most important to detect and use a test statistic that is good at that job. Some tests do very well at a wide variety of interesting alternatives, making them decent default choices, but that doesn't make them "state of the art".
The Anderson-Darling test is still very popular, and with good reason. The Cramer-von Mises test is much less used these days -- to my surprise, because it's usually better than the Kolmogorov-Smirnov but simpler than the Anderson-Darling, and often has better power than the Anderson-Darling against differences "in the middle" of the distribution.
All of these tests suffer from bias against some kinds of alternatives, and it's easy to find cases where the Anderson-Darling does much worse (terribly, really) than the other tests. (As I suggest, it's more 'horses for courses' than one test to rule them all). There's often little consideration given to this issue (what's best at picking up the deviations that matter the most to me?), unfortunately.
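As a rough illustration of how power depends on the alternative, here is a small Python simulation using scipy (the 0.3 location shift, n = 50, and 500 replicates are my own arbitrary choices, and I compare only KS and Cramer-von Mises since those have fully-specified-null implementations in scipy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def rejection_rate(sampler, n=50, reps=500, alpha=0.05):
    """Fraction of simulated samples on which each test rejects
    the fully specified N(0,1) null at level alpha."""
    ks = cvm = 0
    for _ in range(reps):
        x = sampler(n)
        ks += stats.kstest(x, "norm").pvalue < alpha
        cvm += stats.cramervonmises(x, "norm").pvalue < alpha
    return ks / reps, cvm / reps

# A small location shift -- a change "in the middle" of the distribution.
ks_rate, cvm_rate = rejection_rate(lambda n: rng.normal(0.3, 1.0, n))
```

Swapping in a heavy-tailed or skewed sampler instead of the shift is an easy way to see the rankings change -- which is the "horses for courses" point.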
You may find some value in some of these posts:
Is Shapiro–Wilk the best normality test? Why might it be better than other tests like Anderson-Darling?
2 Sample Kolmogorov-Smirnov vs. Anderson-Darling vs Cramer-von-Mises (about two-sample tests, but many of the statements carry over)
Motivation for Kolmogorov distance between distributions (more theoretical discussion but there are several important points about practical implications)
I don't think you'll be able to form a confidence interval for the cdf from the Cramer-von Mises and Anderson-Darling statistics, because those criteria are based on all of the deviations rather than just the largest one.
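The Kolmogorov-Smirnov statistic, by contrast, does invert into a simultaneous confidence band for the cdf, because it bounds the single largest deviation. A minimal Python sketch using the Dvoretzky-Kiefer-Wolfowitz inequality (the function name and the 95% default are my own choices):

```python
import numpy as np

def dkw_band(x, alpha=0.05):
    """Simultaneous 1 - alpha confidence band for the true cdf, from the
    Dvoretzky-Kiefer-Wolfowitz inequality: sup_t |F_n(t) - F(t)| <= eps
    with probability >= 1 - alpha, where eps = sqrt(log(2/alpha) / (2 n))."""
    x = np.sort(x)
    n = x.size
    ecdf = np.arange(1, n + 1) / n
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    return x, np.clip(ecdf - eps, 0.0, 1.0), np.clip(ecdf + eps, 0.0, 1.0)

xs, lower, upper = dkw_band(np.random.default_rng(0).normal(size=200))
```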
In Mathematica this works:
GPD = ParetoPickandsDistribution[2, 3, .07];
data = RandomVariate[GPD, 10^4];
FindDistributionParameters[data, ParetoPickandsDistribution[mu, sigma, eta]]
(* {mu -> 2.00036, sigma -> 2.96883, eta -> 0.07022} *)
where mu is the location parameter, sigma the scale parameter, and eta the shape parameter.
FindDistributionParameters can use 5 different methods (see the documentation), but I believe the default is maximum likelihood estimation (MLE). Mathematica has all the tools (Likelihood, LogLikelihood, FindMaximum, Maximize, and ParetoPickandsDistribution for the PDF) to do MLE from scratch, if you're so inclined. There is a good explanation of MLE on Wikipedia.
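For comparison, the same "MLE from scratch" idea can be sketched in Python with scipy. This assumes scipy's genpareto(c, loc, scale) corresponds to ParetoPickandsDistribution[mu, sigma, eta] with c = eta, loc = mu, scale = sigma (my mapping; check both documentation pages before relying on it):

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(2)

# Simulate generalized Pareto data with the same parameters as above.
c_true, mu, sigma = 0.07, 2.0, 3.0
data = stats.genpareto.rvs(c_true, loc=mu, scale=sigma, size=10_000,
                           random_state=rng)

def neg_loglik(params, x, loc):
    # Parameterize the scale on the log axis so it stays positive.
    c, log_scale = params
    return -stats.genpareto.logpdf(x, c, loc=loc,
                                   scale=np.exp(log_scale)).sum()

# The GPD location MLE is degenerate (the likelihood keeps increasing as
# the location approaches the sample minimum), so fix it in advance.
loc = data.min() - 1e-9
res = optimize.minimize(neg_loglik, x0=[0.1, 0.0], args=(data, loc),
                        method="Nelder-Mead")
c_hat, scale_hat = res.x[0], float(np.exp(res.x[1]))
```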
Best Answer
A Cramer-von Mises test is for a fully specified distribution, not one whose parameters you estimated from the same data.
When you fit the parameters, the test statistic is nearly always smaller than it would be for a prespecified set of parameters. The fitted model will be too close to the data, so your actual significance level will be far smaller than you intend (and consequently power will also be low).
You can deal with it if you adjust the test for the fitting*, but it's no longer distribution-free.
*(e.g. by simulating the distribution of the test statistic under estimation and using that simulated null distribution rather than the tabulated distribution)
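That simulation approach can be sketched in Python with scipy (assumptions of this sketch, not of the answer above: a normal null family, moment-based parameter estimates, and 500 bootstrap replicates):

```python
import numpy as np
from scipy import stats

def cvm_pvalue_with_estimation(data, n_boot=500, seed=0):
    """Cramer-von Mises test of normality with parameters estimated from
    the data, calibrated by a parametric bootstrap."""
    rng = np.random.default_rng(seed)
    mu, sd = data.mean(), data.std(ddof=1)
    obs = stats.cramervonmises(data, stats.norm(mu, sd).cdf).statistic
    # Simulate from the *fitted* model, re-estimate on each simulated
    # sample, and recompute the statistic -- this reproduces the effect
    # of estimation on the null distribution of the statistic.
    null_stats = np.empty(n_boot)
    for b in range(n_boot):
        sim = rng.normal(mu, sd, size=data.size)
        m, s = sim.mean(), sim.std(ddof=1)
        null_stats[b] = stats.cramervonmises(sim, stats.norm(m, s).cdf).statistic
    # Add-one correction so the simulated p-value is never exactly zero.
    return (1 + np.sum(null_stats >= obs)) / (1 + n_boot)

p = cvm_pvalue_with_estimation(np.random.default_rng(3).normal(5.0, 2.0, 300))
```

Using scipy's tabulated p-value directly here would reproduce exactly the problem described above: the fitted cdf hugs the data, the statistic is too small, and the test almost never rejects.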