Solved – Kernel density estimation on bounded support

I am looking for a way to deal with the boundary bias of a kernel density estimator (KDE) on the unit interval. One option is Chen's estimator (also called the beta kernel estimator; see, for example, http://stats-www.open.ac.uk/TechnicalReports/mcjdah.pdf, p. 4). Instead of the usual kernel density estimator $$ \hat{f}(x) =\frac{1}{n} \sum_{i=1}^{n}K(x,X_{i};h) $$ we obtain:
$$
\hat{f}_{C1}(x)=\frac{1}{nB\left(\frac{x}{h^2}+1,\frac{1-x}{h^2}+1\right)}\sum_{i=1}^{n}X_{i}^{x / h^2}(1-X_{i})^{(1-x) / h^2},
$$
where $B(\cdot,\cdot)$ is the beta function.

The difficulty I ran into is underflow when evaluating the beta function for large parameter values. For example, in R:

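data <- runif(10000)

# Chen (beta kernel) estimate at a single point x.
# Note: the argument h here plays the role of h^2 in the formula above.
Chen_kde <- function(x, input, h = 1 / length(input)^(0.9)) {

   p <- x / h + 1
   q <- (1 - x) / h + 1

   # beta(p, q) underflows to 0 when p and q are large
   output <- mean(input^(p - 1) * (1 - input)^(q - 1) / beta(p, q))
   return(output)
}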

Chen_kde(0.1,data)

Warning message:
In beta(p, q) : underflow occurred in 'beta'

I found that one way to tackle this problem is to approximate the beta density by a normal density with the same mean and standard deviation. However, each term of the sum above only resembles a beta density, since x appears in the exponent rather than in the base. My question is whether each term in this example can also be approximated so as to avoid the underflow, or whether there are other effective methods for correcting the boundary bias of a KDE on the unit interval.
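One way to sidestep the underflow without any approximation may be to work on the log scale. Viewed as a function of $X_i$, each term $X_i^{p-1}(1-X_i)^{q-1}/B(p,q)$ with $p = x/h^2+1$ and $q = (1-x)/h^2+1$ is a Beta$(p,q)$ density evaluated at $X_i$, so its logarithm can be computed with `lbeta()` and exponentiated only at the end. The following is a minimal sketch, not a definitive implementation: the function name `chen_kde_log` and the argument `b` (standing in for $h^2$, with the same default as the code in the question) are assumptions made for illustration.

```r
# Sketch: Chen's beta kernel estimate at a single point x, computed in log space.
# 'b' stands in for h^2 in the formula (hypothetical name; default copied from the question).
chen_kde_log <- function(x, input, b = 1 / length(input)^0.9) {
  p <- x / b + 1
  q <- (1 - x) / b + 1

  # log of each term: (p-1)*log(X_i) + (q-1)*log(1 - X_i) - log B(p, q)
  log_terms <- (p - 1) * log(input) + (q - 1) * log1p(-input) - lbeta(p, q)

  # log-sum-exp: factor out the largest term before exponentiating
  m <- max(log_terms)
  if (!is.finite(m)) return(0)  # every term is numerically zero
  exp(m + log(mean(exp(log_terms - m))))
}

set.seed(1)
data <- runif(10000)
b <- 1 / length(data)^0.9

chen_kde_log(0.1, data)

# The same quantity via the built-in beta density, which R evaluates stably
mean(dbeta(data, 0.1 / b + 1, (1 - 0.1) / b + 1))
```

Both calls should agree up to floating-point error. The `dbeta()` form is shorter, while the explicit log-space version makes it clearer where the underflow is avoided.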
