Solved – High kurtosis and bad skewness

data transformation, garch, kurtosis, skewness

Is it necessary to have data that looks normal if you want to apply a dynamic conditional correlation (DCC) model?

Should the explanatory variables in the DCC method also be normalized?

My dataset has a high kurtosis and bad skewness. I used the log10 to try to make it more normal, but remarkably the kurtosis is much higher after taking the log… Very strange.

What other options do I have to make my data more normal and how do I apply that in Stata?

Best Answer

I would imagine DCC suffers from the same limitations as ordinary correlation with non-normal data. That is, there isn't an assumption of normality, but non-normal data can produce odd findings; see Anscombe's quartet, for example.
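As a quick illustration of the Anscombe point, here is a minimal sketch using the `anscombe` data frame that ships with base R (in the datasets package). The four x/y pairs have nearly identical Pearson correlations even though their scatterplots look nothing alike. The variable name `r` is only for illustration.

```r
# Pearson correlation for each of the four Anscombe pairs (x1, y1) ... (x4, y4).
r <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
names(r) <- paste0("set", 1:4)
print(round(r, 3))   # all four are close to 0.816
```

The correlation alone cannot tell these four datasets apart, which is why looking at the data matters.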

As for kurtosis, taking the log can certainly make it worse. Take this example of the uniform distribution:

library(moments)     # provides skewness() and kurtosis()
set.seed(2810101)
x <- runif(100)      # uniform on (0, 1)
logx <- log(x)
kurtosis(x)          # Pearson kurtosis, so a normal gives 3
kurtosis(logx)

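Here the log pushes kurtosis up sharply: in theory a uniform variable has a kurtosis of 1.8, while its log is a reflected exponential with a kurtosis of 9. For reference, a normally distributed variable has a kurtosis of 3.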
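The sample of 100 above is noisy, so here is a hedged sketch that repeats the comparison with a much larger sample and a base-R helper, so it runs without the moments package. The helper name `pearson_kurtosis`, the seed, and the sample size are my own choices for illustration. The helper computes the non-excess kurtosis (fourth central moment over squared variance), the same scale on which a normal gives 3.

```r
# Hypothetical helper: Pearson (non-excess) kurtosis, m4 / m2^2.
pearson_kurtosis <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2
}

set.seed(1)                              # arbitrary seed, for reproducibility only
u <- runif(1e6)                          # uniform on (0, 1); theoretical kurtosis 1.8
k_unif <- pearson_kurtosis(u)
k_log  <- pearson_kurtosis(log(u))       # -log(u) is Exponential(1); theoretical kurtosis 9
k_norm <- pearson_kurtosis(rnorm(1e6))   # theoretical kurtosis 3

print(c(uniform = k_unif, log_uniform = k_log, normal = k_norm))
```

The uniform starts below the normal benchmark and the log overshoots it, so a log transform is not a general cure for kurtosis.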

On the other hand, in this example, taking the log has almost no effect on kurtosis:

set.seed(2829101)
z <- c(rnorm(1000, 10, 1), rnorm(1000, 10, .01))
kurtosis(z)
kurtosis(log(z))

However, you mention skewed data with high kurtosis. Were your data right-skewed or left-skewed? Since the former is more common, I'll assume that.

set.seed(1919110)
x <- c(rnorm(1000, 10, 1), rnorm(300, 30, 2), runif(10, 500, 600))
skewness(x)
kurtosis(x)
skewness(log(x))
kurtosis(log(x))

Here, taking the log improves both kurtosis and skewness.

As always, try plotting the data to see what is going on in your correlation.
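One way to do that plotting, as a minimal sketch in base R: a histogram and a normal Q-Q plot of a variable before and after the log. The function name `plot_before_after` and the simulated `x` (the same kind of right-skewed mixture as above, with a different seed) are illustrative assumptions, not part of the original analysis; substitute your own positive-valued series.

```r
# Hypothetical helper: histogram and normal Q-Q plot, raw vs. log scale.
plot_before_after <- function(x) {
  stopifnot(is.numeric(x), all(x > 0))   # log() needs strictly positive values
  op <- par(mfrow = c(2, 2))
  on.exit(par(op))                       # restore the plotting layout afterwards
  hist(x, main = "Raw", xlab = "x")
  qqnorm(x, main = "Raw: normal Q-Q"); qqline(x)
  hist(log(x), main = "Log", xlab = "log(x)")
  qqnorm(log(x), main = "Log: normal Q-Q"); qqline(log(x))
  invisible(NULL)
}

set.seed(42)  # arbitrary seed
x <- c(rnorm(1000, 10, 1), rnorm(300, 30, 2), runif(10, 500, 600))
plot_before_after(x)
```

Points that bend away from the reference line in the Q-Q panels show where the tails depart from normality, which is often more informative than a single kurtosis number.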
