You could try also evd or Renext packages.
As you noted, all packages functions returned quite similar parameters, so all
fitted models embed the same quantile function and the problems
come from differing time scales.
For POT models, I believe useful to think of an event rate, say
$\lambda$, which is estimated as well as the model for the levels $X_i$
or for excesses $Y_i = X_i-u$, where $u$ is the threshold. The estimated
rate $\lambda$ is the number of levels $X_i$ with $X_i>u$ per unit of
duration. If you want to use years to compute return levels (RL), you
should consider that $\lambda$ expresses in event by year (or inverse
year). The return level for period $T$ is then given by
$$
x(T) = q_X(1 - 1/\lambda/T) = u + q_Y(1 - 1/\lambda/T)
$$
where $q_X$, $q_Y$ are the quantile functions. This formula
can be used for $T > 1/\lambda$.
The rate $\lambda$ can be estimated by $\widehat{\lambda} = n_Y /w$
where $w$ is the duration of the fit period and $n_Y$ is the number of
exceedances, here $48$. Curiously enough, most R packages do not
require $w$ or even allow the user to give it. They rather use a
convention which must be well understood by carefully reading the
documentation and the examples. For extremes::fevd
, the default
values of the formals will correspond to $w = 96/365$ years, so
$\lambda =182.5$ evt/year in your example. With ismev::gpd.fit
you can specify $w$
using a value $n_X / w$ for the npy
formal. Similarly evd::fpot
has a npp
argument for that.
Note that the parameters for the GP excesses do not depend on (``are
orthogonal to'') $\lambda$, and their estimation is not impacted by the
assumption on $\lambda$ which will however be crucial for RLs. Yet fitted
model have an estimation of $\lambda$ used to compute the RLs.
Except for extremes
, you computed the return levels yourself, not
relying on a prediction function and assuming that $\lambda =1$. This
may not be appropriate, because it does not seem that the duration of
your fitting period is $48$ years.
## none of this fits uses the good duration?
library(Renext)
## matches (nearly) the computed quantiles labelled 'ismev'
fRen1 <- Renouv(x = potvalues, threshold = 50, effDuration = 96,
distname.y = "gpd")
predict(fRen1, newdata = c(2, 5, 10, 20, 50, 100))
## matches (nearly) the computed quantiles labelled 'evir' and 'pot'
fRen2 <- Renouv(x = potvalues, threshold = 50, effDuration = 48,
distname.y = "gpd")
predict(fRen2, newdata = c(2, 5, 10, 20, 50, 100))
## matches (nearly) the computed quantiles labelled 'extremes'
fRen3 <- Renouv(x = potvalues, threshold = 50, effDuration = 96/365,
distname.y = "gpd")
predict(fRen3, newdata = c(2, 5, 10, 20, 50, 100))
A Cramer von Mises test is for a fully specified distribution, not one where you fitted parameters.
When you fit the parameters, the test statistic is nearly always smaller than the one for a prespecified set of parameters. The fitted model will be too close, and your significance level will be far smaller than you intend (and consequently power will also be low).
You can deal with it if you adjust the test for the fitting*, but it's no longer distribution free.
*(e.g. by simulating the distribution of the test statistic under estimation and using that simulated null distribution rather than the tabulated distribution)
Best Answer
The location parameter is the
threshold
parameter that you provide to thefitgpd
function. For exampleSee equations (2.3) and (2.4) in the package vignette for details on the model.