Solved – loc parameter in GPD distribution in POT package for R

extreme valuer

I fitted the Generalized Pareto distribution (GPD) using the POT package in R.

The fitted object provides shape and scale parameters, but not the location parameter. I need it to evaluate the quantile function with qgpd(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, lambda = 0).

I am sure this is very basic question (maybe it is mean value, but from which values – all, or only that which are bigger than threshold?) but I am begginer in this area and help is highly appreciated.

Best Answer

The location parameter is the threshold parameter that you provide to the fitgpd function. For example

library(POT)
set.seed(345)
x   <- rgpd(2000, 1, 2, 0.25)
fit <- fitgpd(x, threshold = 1)
fit$fitted.values
#    scale     shape 
#2.0501190 0.2471565

See equations (2.3) and (2.4) in the package vignette for details on the model.

Related Solutions

Solved – Different quantiles of a fitted GPD in different R packages

You could try also evd or Renext packages.

As you noted, all packages functions returned quite similar parameters, so all fitted models embed the same quantile function and the problems come from differing time scales.

For POT models, I believe useful to think of an event rate, say $\lambda$, which is estimated as well as the model for the levels $X_i$ or for excesses $Y_i = X_i-u$, where $u$ is the threshold. The estimated rate $\lambda$ is the number of levels $X_i$ with $X_i>u$ per unit of duration. If you want to use years to compute return levels (RL), you should consider that $\lambda$ expresses in event by year (or inverse year). The return level for period $T$ is then given by $$ x(T) = q_X(1 - 1/\lambda/T) = u + q_Y(1 - 1/\lambda/T) $$ where $q_X$, $q_Y$ are the quantile functions. This formula can be used for $T > 1/\lambda$.

The rate $\lambda$ can be estimated by $\widehat{\lambda} = n_Y /w$ where $w$ is the duration of the fit period and $n_Y$ is the number of exceedances, here $48$. Curiously enough, most R packages do not require $w$ or even allow the user to give it. They rather use a convention which must be well understood by carefully reading the documentation and the examples. For extremes::fevd, the default values of the formals will correspond to $w = 96/365$ years, so $\lambda =182.5$ evt/year in your example. With ismev::gpd.fit you can specify $w$ using a value $n_X / w$ for the npy formal. Similarly evd::fpot has a npp argument for that.

Note that the parameters for the GP excesses do not depend on (``are orthogonal to'') $\lambda$, and their estimation is not impacted by the assumption on $\lambda$ which will however be crucial for RLs. Yet fitted model have an estimation of $\lambda$ used to compute the RLs.

Except for extremes, you computed the return levels yourself, not relying on a prediction function and assuming that $\lambda =1$. This may not be appropriate, because it does not seem that the duration of your fitting period is $48$ years.

## none of this fits uses the good duration?
library(Renext)
## matches (nearly) the computed quantiles labelled 'ismev'
fRen1 <- Renouv(x = potvalues, threshold = 50, effDuration = 96, 
                distname.y = "gpd")
predict(fRen1, newdata = c(2, 5, 10, 20, 50, 100))

##  matches (nearly) the computed quantiles labelled 'evir' and 'pot'
fRen2 <- Renouv(x = potvalues, threshold = 50, effDuration = 48, 
                distname.y = "gpd")
predict(fRen2, newdata = c(2, 5, 10, 20, 50, 100))

## matches (nearly) the computed quantiles labelled 'extremes'
fRen3 <- Renouv(x = potvalues, threshold = 50, effDuration = 96/365,
                distname.y = "gpd")
predict(fRen3, newdata = c(2, 5, 10, 20, 50, 100))

Solved – Cramer Von Mises – How to use this test correctly

A Cramer von Mises test is for a fully specified distribution, not one where you fitted parameters.

When you fit the parameters, the test statistic is nearly always smaller than the one for a prespecified set of parameters. The fitted model will be too close, and your significance level will be far smaller than you intend (and consequently power will also be low).

You can deal with it if you adjust the test for the fitting*, but it's no longer distribution free.

*(e.g. by simulating the distribution of the test statistic under estimation and using that simulated null distribution rather than the tabulated distribution)

Best Answer

Related Solutions

Solved – Different quantiles of a fitted GPD in different R packages

Solved – Cramer Von Mises – How to use this test correctly

Related Question