Solved – Difference between multivariate standard normal distribution and Gaussian copula

copula, normal distribution

I wonder what the difference between multivariate standard normal distribution and Gaussian copula is since when I look at the density function they seem the same to me.

My question is why the Gaussian copula is introduced at all: what benefit or advantage does it offer if it is nothing but a multivariate standard normal distribution function itself?

Also, what is the idea behind the probability integral transformation in copulas? We know that a copula is a function of uniform variables. Why do they have to be uniform? Why not use the actual data, as with the multivariate normal distribution, and estimate the correlation matrix? (Normally we plot the two asset returns to examine their relationship, but with a copula we plot the $U$s, which are probabilities, instead.)

Another question: can the correlation matrix from an MVN be estimated non-parametrically or semi-parametrically, as copula parameters can (for a copula the parameter can be based on Kendall's tau, etc.)?

I would be very thankful for your help since I'm new in this area. (but I have read a lot of papers and these are the only things that I don't understand)

Best Answer

One general rule about technical papers--especially those found on the Web--is that the reliability of any statistical or mathematical definition offered in them varies inversely with the number of unrelated non-statistical subjects mentioned in the paper's title. The page title in the first reference offered (in a comment to the question) is "From Finance to Cosmology: The Copula of Large-Scale Structure." With both "finance" and "cosmology" appearing prominently, we can be pretty sure that this is not a good source of information about copulas!

Let's instead turn to a standard and very accessible textbook, Roger Nelsen's An introduction to copulas (Second Edition, 2006), for the key definitions.

... every copula is a joint distribution function with margins that are uniform on [the closed unit interval $[0,1]$].

[At p. 23, bottom.]
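To see why uniform margins are the natural common scale, a minimal base-R sketch of the probability integral transform may help: applying a continuous variable's own distribution function to it yields a Uniform(0,1) variable, whatever the original distribution was. The Gamma(2) choice, the seed and the sample size are illustrative assumptions, not part of Nelsen's text.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1e5
y <- rgamma(n, shape = 2)        # a skewed, clearly non-uniform variable
u <- pgamma(y, shape = 2)        # probability integral transform: U = G(Y)

# U should behave like Uniform(0, 1): mean near 1/2, variance near 1/12
c(mean = mean(u), var = var(u))

# Going back: the quantile function undoes the transform
y_back <- qgamma(u, shape = 2)
max(abs(y - y_back))             # numerically negligible
```

Because every continuous margin can be mapped to the same uniform scale this way, whatever is left after the mapping describes only the dependence between the variables, and that is what a copula records.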

For some insight into copulae, turn to the first theorem in the book, Sklar's Theorem:

Let $H$ be a joint distribution function with margins $F$ and $G$. Then there exists a copula $C$ such that for all $x,y$ in [the extended real numbers], $$H(x,y) = C(F(x),G(y)).$$

[Stated on pp. 18 and 21.]

Although Nelsen does not call it by that name, he does define the Gaussian copula in an example:

... if $\Phi$ denotes the standard (univariate) normal distribution function and $N_\rho$ denotes the standard bivariate normal distribution function (with Pearson's product-moment correlation coefficient $\rho$), then ... $$C(u,v) = \frac{1}{2\pi\sqrt{1-\rho^2}}\int_{-\infty}^{\Phi^{-1}(u)}\int_{-\infty}^{\Phi^{-1}(v)}\exp\left[\frac{-\left(s^2-2\rho s t + t^2\right)}{2\left(1-\rho^2\right)}\right]dsdt$$

[at p. 23, equation 2.3.6]. From the notation it is immediate that this $C$ is indeed the joint distribution function of $(u,v)$ when $(\Phi^{-1}(u), \Phi^{-1}(v))$ is bivariate Normal. We may now turn around and construct a new bivariate distribution having any desired (continuous) marginal distributions $F$ and $G$ for which this $C$ is the copula, merely by setting $u = F(x)$ and $v = G(y)$: take this particular $C$ in the statement of Sklar's Theorem above.
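Read in the sampling direction, the definition says: draw a standard bivariate normal pair with correlation $\rho$ and apply $\Phi$ to each coordinate. Here is a minimal base-R sketch of that; the value $\rho = 0.5$, the seed and the sample size are assumptions chosen only for illustration.

```r
set.seed(2)
n   <- 1e5
rho <- 0.5                                   # illustrative value only

# Correlated standard normals via a Cholesky factor of the correlation matrix
R <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(R)

# Applying Phi to each coordinate gives a sample whose joint CDF is C(u, v)
u <- pnorm(z)

colMeans(u)          # both margins should be near 1/2 (uniform)
cor(z)[1, 2]         # near rho on the normal scale
cor(u)[1, 2]         # near (6 / pi) * asin(rho / 2), Spearman's rho of this copula
```

The sample `u` has uniform margins but is not normal in any sense; only the dependence between its coordinates is inherited from the bivariate normal.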

So yes, this looks remarkably like the formulas for a bivariate normal distribution, because it is bivariate normal for the transformed variables $(\Phi^{-1}(F(x)),\Phi^{-1}(G(y)))$. Because these transformations will be nonlinear whenever $F$ and $G$ are not already (univariate) Normal CDFs themselves, the resulting distribution is not (in these cases) bivariate normal.

Example

Let $F$ be the distribution function for a Beta$(4,2)$ variable $X$ and $G$ the distribution function for a Gamma$(2)$ variable $Y$. By using the preceding construction we can form the joint distribution $H$ with a Gaussian copula and marginals $F$ and $G$. To depict this distribution, here is a partial plot of its bivariate density on $x$ and $y$ axes:

[Figure: partial plot of the bivariate density of $H$ on $x$ and $y$ axes]

The dark areas have low probability density; the light regions have the highest density. All the probability has been squeezed into the region where $0\le x \le 1$ (the support of the Beta distribution) and $0 \le y$ (the support of the Gamma distribution).
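A sample from such an $H$ can be sketched in base R by pushing Gaussian-copula uniforms through the two quantile functions. Two assumptions are made here: $\rho = 0.5$ (the text does not state the value used for the plot) and Gamma$(2)$ read as shape 2 with rate 1.

```r
set.seed(3)
n   <- 1e5
rho <- 0.5                                   # assumed; not stated in the text

R <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(R)   # bivariate normal
u <- pnorm(z)                                     # Gaussian-copula uniforms

x <- qbeta(u[, 1], shape1 = 4, shape2 = 2)        # Beta(4,2) margin, F^{-1}(U)
y <- qgamma(u[, 2], shape = 2)                    # Gamma(2) margin,  G^{-1}(V)

range(x)                                          # stays inside [0, 1]
min(y)                                            # stays nonnegative

# Mapping back to normal scores recovers the bivariate normal structure
zx <- qnorm(pbeta(x, 4, 2))
zy <- qnorm(pgamma(y, shape = 2))
cor(zx, zy)                                       # near rho
```

A scatterplot of `x` against `y` should show the same asymmetric shape described above, while a scatterplot of `zx` against `zy` should look like an ordinary elliptical normal cloud.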

The lack of symmetry makes it obviously non-normal (and without normal margins), but it nevertheless has a Gaussian copula by construction. FWIW it has a formula and it's ugly, also obviously not bivariate Normal:

$$\frac{1}{\sqrt{3}}2 \left(20 (1-x) x^3\right) \left(e^{-y} y\right) \exp \left(w(x,y)\right)$$

where $w(x,y)$ is given by $$\left[\text{erfc}^{-1}\left(2 Q(2,0,y)\right)\right]^2-\frac{2}{3} \left(\sqrt{2}\, \text{erfc}^{-1}\left(2 Q(2,0,y)\right)-\frac{\text{erfc}^{-1}\left(2 I_x(4,2)\right)}{\sqrt{2}}\right)^2.$$

($Q$ is a regularized Gamma function and $I_x$ is a regularized Beta function.)
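
For readers wondering where this expression comes from, here is a sketch of the derivation. It assumes $\rho = 1/2$, which is not stated in the text but is the value implied by the leading factor $2/\sqrt{3} = 1/\sqrt{1-\rho^2}$. Differentiating $H(x,y) = C(F(x),G(y))$ gives the density as the copula density times the marginal densities:

$$h(x,y) = c_\rho\left(F(x),G(y)\right) f(x)\, g(y), \qquad c_\rho(u,v) = \frac{1}{\sqrt{1-\rho^2}}\exp\left[-\frac{\rho^2\left(a^2+b^2\right)-2\rho a b}{2\left(1-\rho^2\right)}\right],$$

where $a = \Phi^{-1}(u)$ and $b = \Phi^{-1}(v)$. Here $f(x) = 20(1-x)x^3$ is the Beta$(4,2)$ density and $g(y) = e^{-y}y$ is the Gamma$(2)$ density. With $\rho = 1/2$, $a = \Phi^{-1}(F(x))$ and $b = \Phi^{-1}(G(y))$, the exponent reduces to

$$w(x,y) = \frac{4ab - a^2 - b^2}{6}.$$

The inverse complementary error functions appear because $\Phi^{-1}(p) = -\sqrt{2}\,\text{erfc}^{-1}(2p)$, with $F(x) = I_x(4,2)$ and $G(y) = Q(2,0,y)$.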

Related Solutions

Solved – Simulate a Gaussian Copula with t margins

There's an R package called "copula" that will let you do exactly this.

The process goes:

1. Specify a copula.
2. Specify the population distribution, including whatever marginals you want. From the documentation: "A user-defined distribution, for example, fancy, can be used as margin provided that dfancy, pfancy, and qfancy are available."
3. Generate samples from that multivariate distribution.

For you, you would specify a Gaussian copula in step 1 and then say that you want t-distributed marginals in step 2.

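library(copula)

# Step 1: Gaussian copula with correlation parameter 0.8
my_copula <- normalCopula(0.8)

# Step 2: t margins, each with 3 degrees of freedom
# (paramMargins takes one named list per margin)
my_population <- mvdc(my_copula, margins = c("t", "t"),
                      paramMargins = list(list(df = 3), list(df = 3)))

# Step 3: draw 1000 observations
my_sample <- rMvdc(1000, my_population)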

Caveat: I don't have access to this package right now, so I can't swear that this will compile, though it gives the gist of what to do.
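
If the package is not available, the same three steps can be sketched in base R. This is an equivalent construction under the same settings (correlation parameter 0.8, 3 degrees of freedom), not the package's own implementation; the name `my_sample_base` is made up for this illustration.

```r
set.seed(4)
n   <- 1000
rho <- 0.8                                   # same value as normalCopula(0.8)

R <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(R)   # correlated standard normals
u <- pnorm(z)                                     # Gaussian copula sample on [0,1]^2

# t(3) margins via the quantile function, one column at a time
my_sample_base <- cbind(qt(u[, 1], df = 3), qt(u[, 2], df = 3))

dim(my_sample_base)                               # n rows, 2 columns
cor(my_sample_base, method = "kendall")[1, 2]     # near (2 / pi) * asin(rho)
```

Kendall's tau is used for the check because it depends only on the copula, so heavy-tailed t margins do not distort it the way they can distort a Pearson correlation.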

Solved – The importance of the Gaussian copula

Now my question is, what exactly makes the Gaussian copula so important among all the possible choices of copulas?

Is it? What makes you say it's especially important?

suppose that their joint distribution is not a multivariate normal. Then my understanding is that since marginals are decoupled from the copula, that their joint distribution (being non-normal) cannot have a Gaussian copula.

Correct; with normal margins, if it did have a Gaussian copula it would be multivariate normal.

But will they still be in some sense 'well approximated' by a Gaussian copula?

It depends on the copula they do have, but in general, no.

Furthermore, what if our marginals are not normal, now it seems even less justifiable to use a Gaussian copula,

It depends on why it's being used. It may be a reasonable approximation or it may not.

rather than just choosing the copula that fits the data the best in $[0,1]^n$ space.

If you have no particular reason to choose a Gaussian copula, it may be a convenient, but often not ideal, choice.

Could you also speak to the family of meta-Gaussian distributions?

There's no distinction between "distributions with a Gaussian copula" and "meta Gaussian distributions". So we've been discussing the dependence structure of the family of meta-Gaussian distributions all along.

In some areas people are tempted to use the Gaussian copula in multivariate situations because more generally copulas are more work once you move beyond the bivariate case and in some ways the Gaussian case is easy to work with (if you transform the margins to normal you can just fit a multivariate Gaussian). However, there are vine copulas, for example.
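
The "transform the margins to normal, then fit a multivariate Gaussian" shortcut can be sketched in base R as follows. The data are simulated purely for illustration, the margins are estimated by rescaled ranks (one common semi-parametric choice, assumed here), and the Kendall's tau inversion $\rho = \sin(\pi\tau/2)$ is shown as an alternative estimate.

```r
set.seed(5)
n   <- 2000
rho <- 0.6                                   # true parameter of the simulated data

# Simulated data with a Gaussian copula and deliberately non-normal margins
R <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(R)
x <- cbind(exp(z[, 1]), qgamma(pnorm(z[, 2]), shape = 2))

# Step 1: pseudo-observations, i.e. ranks rescaled into (0, 1)
u_hat <- apply(x, 2, rank) / (n + 1)

# Step 2: normal scores, then an ordinary correlation matrix
z_hat <- qnorm(u_hat)
rho_scores <- cor(z_hat)[1, 2]

# Alternative: invert Kendall's tau, which ignores the margins entirely
tau     <- cor(x[, 1], x[, 2], method = "kendall")
rho_tau <- sin(pi * tau / 2)

c(normal_scores = rho_scores, kendall = rho_tau)   # both should be near rho
```

Neither estimate requires a parametric model for the margins, which is the sense in which the copula parameter can be estimated semi-parametrically.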

The Gaussian copula is frequently inadequate -- it can't model tail dependence, for example, making it unsuitable for the many situations where tail dependence exists. This stuff is pretty well documented in basic books and papers on copulas though. Indeed, misuse of the Gaussian copula to model dependence among debt defaults was credited with making the global financial crisis worse (precisely because as you condition on being in the upper tail, the Gaussian copula does essentially the exact opposite of what's needed for describing the dependence).
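
The tail-dependence point can be illustrated with a small base-R simulation comparing a Gaussian copula with a t copula that shares the same correlation parameter. The parameter values ($\rho = 0.5$, 3 degrees of freedom, threshold 0.99) are arbitrary assumptions for the sketch, and the t copula is just one example of a copula with tail dependence.

```r
set.seed(6)
n   <- 2e5
rho <- 0.5
nu  <- 3                                     # illustrative degrees of freedom
q   <- 0.99                                  # tail threshold on the uniform scale

R <- matrix(c(1, rho, rho, 1), nrow = 2)
z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(R)

u_gauss <- pnorm(z)                          # Gaussian copula sample

# t copula: divide both normals in each row by a shared chi-square factor,
# then apply the t CDF to each coordinate
w        <- sqrt(rchisq(n, nu) / nu)
t_scores <- z / w
u_t      <- cbind(pt(t_scores[, 1], nu), pt(t_scores[, 2], nu))

# Estimated P(V > q | U > q) for each copula
tail_prob <- function(u) mean(u[, 1] > q & u[, 2] > q) / mean(u[, 1] > q)

c(gaussian = tail_prob(u_gauss), t_copula = tail_prob(u_t))
```

The t copula's conditional probability should come out noticeably larger. For the Gaussian copula with $|\rho| < 1$ this probability tends to zero as the threshold approaches 1, which is what "no tail dependence" means.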

The Gaussian copula is most popular when dealing with elliptical distributions, for which there's at least some argument for considering it, since the correlation coefficients still have a relatively direct interpretation. Otherwise, they're just parameters of the dependence structure, and it would usually be better to consider the actual characteristics of the dependence you have (or at least the most essential characteristics of it).
