[Math] Kernel density estimation for heavy-tailed distributions using the Champernowne transformation

probability, probability distributions, probability theory, statistics, transformation

I am trying to follow this paper to estimate the density of a heavy-tailed distribution using the Champernowne transformation.

However, I do not understand the final step to transform the kernel density estimate of the transformed data back to the untransformed data set.

An outline of the procedure is below:

Firstly, the data, X, is transformed:

$$Y_i = T_{\hat{\alpha}, \hat{M}, \hat{c}} \left( X_i \right), \quad i = 1, \dots, n$$

where $T$ is a modified Champernowne CDF. The parameters $\alpha$, $M$ and $c$ have already been estimated.
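For concreteness, the transformation step might look like the sketch below. It assumes the usual closed form of the modified Champernowne CDF, $T(x) = \frac{(x+c)^{\alpha} - c^{\alpha}}{(x+c)^{\alpha} + (M+c)^{\alpha} - 2c^{\alpha}}$ for $x \ge 0$, which should be checked against the paper. The function name, the parameter values and the sample data are made up for illustration.

```python
def champernowne_cdf(x, alpha, M, c):
    """Modified Champernowne CDF T(x) for x >= 0 (assumed closed form)."""
    u = (x + c) ** alpha
    a = c ** alpha
    b = (M + c) ** alpha
    # T(0) = 0, T(M) = 1/2 (M is the median), and T(x) -> 1 as x -> infinity.
    return (u - a) / (u + b - 2.0 * a)


# Hypothetical parameter estimates and data, for illustration only.
alpha, M, c = 1.5, 2.0, 0.5
xs = [0.3, 1.1, 2.0, 7.5, 40.0]

# Transformed data: every value lies strictly between 0 and 1.
ys = [champernowne_cdf(x, alpha, M, c) for x in xs]
print(ys)
```

Because $T$ is a CDF, the heavy right tail of the original data is compressed into the neighbourhood of 1 on the transformed scale.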

Then a kernel density estimate with a Gaussian kernel is computed on the transformed data. However, the transformed data lie in the interval $(0,1)$, so we keep only that part of the estimated density and divide it by its integral over $(0,1)$, so that the result integrates to one.

enter image description here

enter image description here
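A minimal sketch of this step as described above: an ordinary Gaussian kernel estimate, cut off outside $(0,1)$ and divided by the mass it puts on $(0,1)$. The paper's exact boundary correction may differ in detail. The bandwidth `b`, the function name and the sample values are assumptions made for illustration.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides pdf() and cdf()


def truncated_gaussian_kde(ys, b):
    """Gaussian KDE of ys, restricted to (0, 1) and renormalised there."""
    n = len(ys)
    # Mass that the unrestricted KDE places on (0, 1), in closed form:
    # each kernel contributes Phi((1 - y_i) / b) - Phi((0 - y_i) / b).
    mass = sum(_STD.cdf((1.0 - yi) / b) - _STD.cdf((0.0 - yi) / b)
               for yi in ys) / n

    def f_trans(y):
        if not 0.0 < y < 1.0:
            return 0.0
        raw = sum(_STD.pdf((y - yi) / b) for yi in ys) / (n * b)
        return raw / mass

    return f_trans


# Hypothetical transformed data in (0, 1) and a hand-picked bandwidth.
ys = [0.05, 0.2, 0.35, 0.5, 0.8, 0.97]
f_trans = truncated_gaussian_kde(ys, b=0.1)
print(f_trans(0.5))
```

Dividing by `mass` is what makes the restricted estimate a proper density on $(0,1)$.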

The final step, which I don't understand is the formula below. What does the denominator mean?

I understand that the numerator is the estimate of the transformed data set.

I can also see the transformed data, $T(\cdot)$, in the denominator, but what is $T'$?

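$$\hat{f} \left( x \right) = \frac{\hat{f}_{\text{trans}} \left( T_{\hat{\alpha}, \hat{M}, \hat{c}} \left( x \right) \right)}{\left| \left( T^{- 1}_{\hat{\alpha}, \hat{M}, \hat{c}} \right)' \left( T_{\hat{\alpha}, \hat{M}, \hat{c}} \left( x \right) \right) \right|}$$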

The authors of the paper then write the following expression for the density estimator of the untransformed dataset:

enter image description here

Best Answer

The fourth formula, the one you are trying to understand, is consistent with the last formula; the notation just hides it. It reads $$\hat{f} \left( x \right) = \frac{\hat{f}_{\text{trans}} \left( T_{\hat{\alpha}, \hat{M}, \hat{c}} \left( x \right) \right)}{\left| \left( T^{- 1}_{\hat{\alpha}, \hat{M}, \hat{c}} \right)' \left( T_{\hat{\alpha}, \hat{M}, \hat{c}} \left( x \right) \right) \right|}$$ and its denominator is the derivative of the inverse transformation $T^{- 1}$, evaluated at the point $T \left( x \right)$. By the inverse function rule this derivative equals $1 / T' \left( x \right)$, so dividing by it is the same as multiplying by $T' \left( x \right)$.

The notation in these formulas is clumsy and not very intuitive, so I will explain how the formula is derived and where $T'$ comes from.

The relation between the two random variables $Y$ and $X$ is given by $Y = T \left( X \right)$ (I will denote by $T$ the function $T_{\hat{\alpha}, \hat{M}, \hat{c}}$ to simplify notation). The transformation $T$ is the cumulative distribution function of an absolutely continuous random variable with positive density on $(0,\infty)$, and thus is strictly monotonically increasing with a unique inverse $T^{- 1}$. Let $t \left( x \right) = T' \left( x \right) = \frac{\partial T \left( x \right)}{\partial x}$ be the density corresponding to $T$; this is the $T'$ you are asking about. Denote by $f_X$ and $f_Y$ the densities of $X$ and $Y$. The relation between the two densities is $$ f_X \left( x \right) = f_Y \left( T \left( x \right) \right) t \left( x \right) $$ $$ f_Y \left( y \right) = f_X \left( T^{- 1} \left( y \right) \right) \frac{1}{t \left( T^{- 1} \left( y \right) \right)} $$ This is the usual change-of-variables formula: the Jacobian of the transformation $y = T \left( x \right)$ is $t \left( x \right)$.

To connect this with the fourth formula, differentiate the identity $T^{- 1} \left( T \left( x \right) \right) = x$ with respect to $x$: $$ \left( T^{- 1} \right)' \left( T \left( x \right) \right) T' \left( x \right) = 1 \quad \Longrightarrow \quad \left( T^{- 1} \right)' \left( T \left( x \right) \right) = \frac{1}{t \left( x \right)} $$ Hence $$ \frac{f_Y \left( T \left( x \right) \right)}{\left| \left( T^{- 1} \right)' \left( T \left( x \right) \right) \right|} = f_Y \left( T \left( x \right) \right) t \left( x \right) = f_X \left( x \right) $$ (Notice that the derivative of the inverse function is the reciprocal of the derivative of the original function, evaluated at the corresponding point. There is no need for the absolute values here because densities are nonnegative. The argument of $\left( T^{- 1} \right)'$ has to be $T \left( x \right)$ and not $x$: since $T:(0,\infty)\rightarrow (0,1)$ and $t:(0,\infty)\rightarrow (0,\infty)$, the inverse $T^{- 1}$ is only defined on $(0,1)$.) Replacing $f_Y$ by its estimate $\hat{f}_{\text{trans}}$ gives the last formula, whose final factor is $t \left( x \right) = T' \left( x \right)$.
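The relation $f_X \left( x \right) = f_Y \left( T \left( x \right) \right) t \left( x \right)$ can be tried out numerically. The sketch below is only an illustration under assumptions: it uses the usual closed form of the modified Champernowne CDF and its derivative (to be checked against the paper), a Gaussian kernel estimate renormalised on $(0,1)$, and made-up data, parameters and bandwidth. All function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides pdf() and cdf()


def T(x, alpha, M, c):
    """Modified Champernowne CDF (assumed closed form), x >= 0."""
    u, a, b = (x + c) ** alpha, c ** alpha, (M + c) ** alpha
    return (u - a) / (u + b - 2.0 * a)


def t(x, alpha, M, c):
    """Density t(x) = T'(x), obtained by differentiating T above."""
    u, a, b = (x + c) ** alpha, c ** alpha, (M + c) ** alpha
    return alpha * (x + c) ** (alpha - 1.0) * (b - a) / (u + b - 2.0 * a) ** 2


def make_f_trans(ys, bw):
    """Gaussian KDE of ys, restricted to (0, 1) and renormalised there."""
    n = len(ys)
    mass = sum(_STD.cdf((1.0 - yi) / bw) - _STD.cdf(-yi / bw) for yi in ys) / n

    def f_trans(y):
        if not 0.0 < y < 1.0:
            return 0.0
        return sum(_STD.pdf((y - yi) / bw) for yi in ys) / (n * bw * mass)

    return f_trans


def make_f_hat(xs, alpha, M, c, bw):
    """Density estimate on the original scale: f_trans(T(x)) * t(x)."""
    f_trans = make_f_trans([T(x, alpha, M, c) for x in xs], bw)

    def f_hat(x):
        return f_trans(T(x, alpha, M, c)) * t(x, alpha, M, c)

    return f_hat


def midpoint(f, lo, hi, steps):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


# Hypothetical heavy-tailed sample, parameter estimates and bandwidth.
xs = [0.2, 0.5, 0.9, 1.4, 2.0, 3.1, 5.0, 12.0, 60.0]
alpha, M, c = 1.5, 2.0, 0.5
f_hat = make_f_hat(xs, alpha, M, c, bw=0.1)

# Mass of the back-transformed estimate on [0, 50] in the original scale.
print(midpoint(f_hat, 0.0, 50.0, 20000))
```

The mass that `f_hat` puts on $[0, 50]$ should agree with the mass that the transformed-scale estimate puts on $[0, T(50)]$, which is what the change of variables says.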
