Solved – Generating values from copula using copula package in R

Tags: copula, fitcopula, mvdc, r, rcopula

I have several questions about the copula package in R. My overall aim is to generate synthetic values using copulas. I am analyzing hydrological data: annual peak discharge [m³/s] and the corresponding volume [m³].

I managed to apply tests for serial independence and for dependence. Furthermore, I identified and excluded ties and created pseudo-observations (the data transformed to ranks and scaled to [0,1]). Since I don’t know which copula is best, I fitted the copula parameter first:

fg <- fitCopula(copula=gumbelCopula(), data=u) #  u is my data
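
The two steps described above can be sketched as follows. This is only an illustration on simulated stand-in data: the column names `peak` and `volume`, the gamma margins, and the "true" Gumbel parameter are all assumptions made up for the example, not the real discharge series.

```r
library(copula)

set.seed(1)
# Stand-in data (hypothetical): a dependent sample drawn from a Gumbel copula
# and pushed through invented gamma margins.
true_cop <- gumbelCopula(2)
w <- rCopula(200, true_cop)
dat <- data.frame(peak   = qgamma(w[, 1], shape = 3, rate = 0.01),
                  volume = qgamma(w[, 2], shape = 2, rate = 1e-6))

# Pseudo-observations: scaled ranks of each column, strictly inside (0, 1)
u <- pobs(as.matrix(dat))

# Maximum pseudo-likelihood fit of the Gumbel parameter
fg <- fitCopula(copula = gumbelCopula(), data = u, method = "mpl")
theta_hat <- coef(fg)
```

Because `pobs()` works on ranks, the result does not depend on the units or the marginal distributions of the raw columns.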

I will do this for all available copulas in R. Afterwards I test the goodness of fit with the following function:

gofCopula(copula=gumbelCopula(fg@estimate), x=data[,2:3], N=1000, method="Sn", 
          estim.method="mpl", simulation="mult")
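
One possible way to organise the comparison across families is sketched below. It is a rough screening step that could sit next to the formal `gofCopula` test, not a replacement for it. The pseudo-observations are simulated stand-ins, and the list of candidate families is just an assumption about which ones are of interest.

```r
library(copula)

set.seed(2)
# Stand-in pseudo-observations (hypothetical; replace with your own u)
u <- pobs(rCopula(300, gumbelCopula(2)))

# Candidate families, all bivariate, parameter left to be estimated
candidates <- list(gumbel  = gumbelCopula(),
                   clayton = claytonCopula(),
                   frank   = frankCopula(),
                   normal  = normalCopula())

# Fit each family by maximum pseudo-likelihood
fits <- lapply(candidates,
               function(cop) fitCopula(cop, data = u, method = "mpl"))

# Maximised log-likelihoods; every family here has one parameter,
# so the values can be compared directly
loglik <- sapply(fits, function(f) as.numeric(logLik(f)))
best <- names(which.max(loglik))
```

A higher log-likelihood only says that one family fits better than another; whether the best one fits adequately is what the goodness-of-fit test is for.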

Using the “best” copula, I then want to create synthetic values. I found a function that creates random samples, but I am not sure whether it does what I need.

random_samples <- rCopula(copula=gumbelCopula(fg@estimate), n=10000)

It seems to me that this function only creates random values, but does it take the dependence structure of my data set into account? The copula package also has another function, mvdc, for constructing multivariate distributions from copulas. What is the actual difference between mvdc and rCopula? Both generate synthetic values, don’t they?

One last question: once I am able to generate my synthetic values, how can I transform them back to their real units? From reading the documentation I understood that I have to multiply the values for (u,v) with the inverse of their particular cdf. Is this true?
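
A quick check of whether `rCopula` keeps the dependence is to compare the Kendall's tau of the simulated pairs with the tau implied by the copula parameter. The sketch below assumes a Gumbel parameter of 2 as a stand-in for `fg@estimate`.

```r
library(copula)

set.seed(3)
cop <- gumbelCopula(2)          # hypothetical parameter, stands in for fg@estimate
u_sim <- rCopula(5000, cop)     # dependent uniforms on [0, 1]^2

# Each margin on its own looks uniform ...
margin_means <- colMeans(u_sim)

# ... but the pairs carry the copula's dependence
tau_theory <- tau(cop)          # 1 - 1/theta for the Gumbel family
tau_sample <- cor(u_sim[, 1], u_sim[, 2], method = "kendall")
```

If the two tau values are close, the simulated pairs are random in each margin but dependent jointly, which is what a copula sample is supposed to be.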
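
For reference, the usual back-transformation evaluates the inverse cdf (quantile function) of each margin at the simulated value, rather than multiplying by it: x = F⁻¹(u). A minimal sketch, assuming made-up gamma and log-normal margins (in practice the margins would first be fitted to the observed series):

```r
library(copula)

set.seed(4)
u_sim <- rCopula(1000, gumbelCopula(2))   # hypothetical fitted copula

# Hypothetical margins: gamma for peak discharge, log-normal for volume
peak_sim   <- qgamma(u_sim[, 1], shape = 3, rate = 0.01)
volume_sim <- qlnorm(u_sim[, 2], meanlog = 14, sdlog = 0.5)

synthetic <- data.frame(peak = peak_sim, volume = volume_sim)

# Quantile functions are increasing, so rank dependence is unchanged
tau_u <- cor(u_sim[, 1], u_sim[, 2], method = "kendall")
tau_x <- cor(peak_sim, volume_sim, method = "kendall")
```

The values in `synthetic` are in the units of the assumed margins, while the dependence between the two columns is still the one given by the copula.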

One question has not been answered yet, so I want to state it more precisely. It concerns the function mvdc. According to the copula manual (p. 107), it is used to "construct multivariate distributions from copulas".

For this function, a copula family as well as the distributions of the margins have to be specified, for instance:

mv.NE <- mvdc(copula=gumbelCopula(fg@estimate), margins=c("norm","norm"),
              paramMargins=list(list(mean=0, sd=1), list(mean=0, sd=1)))

(Here I chose a gumbelCopula and estimated the parameter with fitCopula. I assume that both of my marginals are normally distributed.)

What do I need this function for? I am slightly confused because:

When I create random values from my copula using rCopula, I eventually obtain values on the scale of the margins once I pass u and v to the quantile function (inverse cdf) of their respective distributions (in this case both are normal).
So why is there a second option for creating multivariate distributions from a copula?

I just don't get the difference…
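
The relation between the two routes can be checked directly. The sketch below assumes a Gumbel parameter of 2 in place of `fg@estimate` and standard normal margins as in the question. With the same seed, sampling the copula and applying `qnorm` by hand is expected to give the same numbers as `rMvdc`. What the mvdc object adds is that the joint distribution is stored as one object, so its density and distribution function can be evaluated on the original scale (and, as far as I know, it can also be handed to `fitMvdc` to estimate margins and copula together).

```r
library(copula)

cop <- gumbelCopula(2)   # hypothetical parameter, stands in for fg@estimate
mv <- mvdc(copula = cop, margins = c("norm", "norm"),
           paramMargins = list(list(mean = 0, sd = 1),
                               list(mean = 0, sd = 1)))

# Route 1: sample the copula, then apply the marginal quantile functions
set.seed(5)
u_sim <- rCopula(1000, cop)
x_manual <- cbind(qnorm(u_sim[, 1], mean = 0, sd = 1),
                  qnorm(u_sim[, 2], mean = 0, sd = 1))

# Route 2: let the mvdc object do both steps
set.seed(5)
x_mvdc <- rMvdc(1000, mv)

# Largest difference between the two routes
max_diff <- max(abs(x_manual - x_mvdc))

# Joint density and joint cdf on the original scale at one point
pt   <- matrix(c(0.5, 0.5), nrow = 1)
dens <- dMvdc(pt, mv)
prob <- pMvdc(pt, mv)
```

So for pure simulation the two options are interchangeable; mvdc becomes useful once the joint density or cdf in real units is needed.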

Best Answer

Thanks for your answers. Indeed, when I plot the different copulas, it seems (based on a visual comparison) that the dependence structure is captured by the copula.

[Figure: Comparison of different results]

Then my next step will be to compare the different copula families by their goodness of fit.

@Ben: OK, so if one of my marginals follows the generalized extreme value (GEV) distribution, my u or v will be passed to the "x" of this function, since the other parameters of this distribution have been fitted beforehand. [Image: GEV formula]
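
For a GEV margin, the function that takes u (or v) as its argument is the quantile function, i.e. the cdf solved for x. A hand-written sketch is given below; the helper names `qgev_manual` and `pgev_manual` and the parameter values are invented for illustration, and in practice the location, scale and shape would come from the fitted margin.

```r
# Quantile function of the GEV distribution (hypothetical helper).
# p: probabilities in (0, 1); loc, scale (> 0), shape: GEV parameters.
qgev_manual <- function(p, loc, scale, shape) {
  if (abs(shape) < 1e-8) {
    loc - scale * log(-log(p))                        # Gumbel limit, shape -> 0
  } else {
    loc + scale * ((-log(p))^(-shape) - 1) / shape
  }
}

# Matching cdf, used here only to check the round trip
pgev_manual <- function(x, loc, scale, shape) {
  if (abs(shape) < 1e-8) {
    exp(-exp(-(x - loc) / scale))
  } else {
    exp(-pmax(1 + shape * (x - loc) / scale, 0)^(-1 / shape))
  }
}

u <- c(0.1, 0.5, 0.9)                                      # copula-scale values
x <- qgev_manual(u, loc = 100, scale = 30, shape = 0.1)    # invented parameters
u_back <- pgev_manual(x, loc = 100, scale = 30, shape = 0.1)
```

Applying the cdf to the result recovers the original u, which is a simple way to confirm that the quantile function and the fitted parameters are consistent.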
