Kernel density estimation bandwidth rule of thumb: 2.575 factor

density-estimation, kernel-smoothing

The paper by Epperlein and Smillie, "Cracking VAR with kernels", uses a bandwidth of $2.575\sigma N^{-1/5}$ for kernel quantile estimation, which differs from the usual Deheuvels/Silverman factor of 1.059 used for density estimation. Silverman is cited as the source. Is the difference in setting the reason for the discrepancy (which I doubt, given the common exponent $-1/5$, as opposed to the $-1/3$ common for quantiles), is it a different kernel, or is it something else? Is there any other (online) source for this constant?

Best Answer

I think the part you're asking about is this part of the paper:

It can be shown (See Silverman 1986) that the optimal choice for $h$ (in the sense of minimising the mean square error) for the triangular kernel is

$h=2.575\sigma N^{-\frac15}$

Note that the paper you mention references Silverman 1986 ... whose book is on density estimation (that's even the title). The bandwidth calculation they refer to is for density estimation.

This particular one derives from Silverman's optimal bandwidth estimator. If we start with equation 3.2.1 in Silverman's book (the one you mentioned):

$$h_\text{opt}=k_2^{-2/5}\left\{ \int K(t)^2 \text{d}t \right\}^{1/5}\left\{ \int f''(x)^2\text{d}x \right\}^{-1/5} n^{-1/5}$$

where $k_2=\int t^2 K(t)\, \text{d}t$, and substitute a particular assumption for $f$, then a particular kernel $K$ yields an optimal bandwidth. Different kernels have different constants. Using a Gaussian for both $f$ and $K$ gives 1.059.
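As a quick check of the 1.059 figure, the constant can be evaluated directly from the formula. This is only a sketch: it assumes a normal reference density with standard deviation $\sigma$, for which $\int f''(x)^2\,\text{d}x = \frac{3}{8\sqrt{\pi}}\sigma^{-5}$, and the function name `rule_of_thumb_constant` is made up for illustration.

```python
import math

def rule_of_thumb_constant(k2, roughness):
    """Constant c in h_opt = c * sigma * n**(-1/5), assuming a Gaussian f.

    k2        -- second moment of the kernel, integral of t^2 K(t) dt
    roughness -- integral of K(t)^2 dt
    """
    # For a normal density with s.d. sigma, int f''(x)^2 dx = 3/(8 sqrt(pi)) * sigma^-5.
    # The sigma^-5 factor, raised to the power -1/5, supplies the sigma in c * sigma * n^(-1/5).
    f_curvature = 3.0 / (8.0 * math.sqrt(math.pi))
    return k2 ** (-2 / 5) * roughness ** (1 / 5) * f_curvature ** (-1 / 5)

# Gaussian kernel: k2 = 1 and int K(t)^2 dt = 1 / (2 sqrt(pi))
gaussian_c = rule_of_thumb_constant(1.0, 1.0 / (2.0 * math.sqrt(math.pi)))
print(f"{gaussian_c:.4f}")  # 1.0592
```

With a Gaussian kernel the expression collapses to $(4/3)^{1/5}$, which is where the familiar 1.059 (often quoted as 1.06) comes from.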

The figure 2.575 is for the triangular kernel. I currently don't know for certain what is used for $f$ to get that but my current belief is that it's based on Gaussian $f$. (Edit: yes, it looks like that belief was correct; I took some time to sit and work it through.)

Calculating:

$K(t) = (1-|t|)\, I_{[-1,1]}(t)$

$k_2=\int t^2 K(t)\, \text{d}t = \frac16$; $\int K(t)^2 \text{d}t = \frac23$; and, for Gaussian $f$ with standard deviation $\sigma$, $\int f''(x)^2\text{d}x = \frac{3}{8\sqrt{\pi}}\sigma^{-5}$

$$h_\text{opt}=k_2^{-2/5}\left\{ \int K(t)^2 \text{d}t \right\}^{1/5}\left\{ \int f''(x)^2\text{d}x \right\}^{-1/5} n^{-1/5}$$

$$ = \left(36\cdot \frac23\cdot \frac{8\sqrt{\pi}}{3}\right)^{\frac15}\sigma n^{-\frac15}$$

$$ = (64\sqrt{\pi})^{\frac15}\sigma n^{-\frac15}$$

$$ \approx 2.576\, \sigma n^{-\frac15}$$
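The two kernel integrals and the resulting constant can also be checked numerically. The sketch below uses a plain midpoint rule; the helper names `midpoint_integral` and `triangular_kernel` are hypothetical, and the Gaussian value of $\int f''(x)^2\,\text{d}x$ (with $\sigma$ factored out) is plugged in as a closed form rather than integrated.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n equal subintervals."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))

def triangular_kernel(t):
    """K(t) = 1 - |t| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(t))

# Kernel quantities appearing in the bandwidth formula
k2 = midpoint_integral(lambda t: t * t * triangular_kernel(t), -1.0, 1.0)
roughness = midpoint_integral(lambda t: triangular_kernel(t) ** 2, -1.0, 1.0)

# Gaussian f with sigma factored out: int f''(x)^2 dx = 3 / (8 sqrt(pi))
f_curvature = 3.0 / (8.0 * math.sqrt(math.pi))

constant = k2 ** (-2 / 5) * roughness ** (1 / 5) * f_curvature ** (-1 / 5)
closed_form = (64.0 * math.sqrt(math.pi)) ** (1 / 5)

print(f"k2 = {k2:.6f}, roughness = {roughness:.6f}")
print(f"constant = {constant:.4f}, closed form = {closed_form:.4f}")
```

Both routes should agree on roughly 2.576, with `k2` close to 1/6 and `roughness` close to 2/3.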

The discrepancy in the last digit (2.576 here against the paper's 2.575) is almost certainly due to premature rounding; specifically, it may come from taking a rounded-off value related to $\int f''(x)^2\text{d}x$ in the Gaussian case directly from Silverman.
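To illustrate how premature rounding could produce 2.575, the sketch below recomputes the triangular-kernel constant twice: once with the exact Gaussian value $\frac{3}{8\sqrt{\pi}}$ and once with that value rounded to three decimals. Rounding at three decimals is an assumption made for the illustration; it does not establish which rounded figure the paper actually used.

```python
import math

k2 = 1.0 / 6.0          # triangular kernel: integral of t^2 K(t) dt
roughness = 2.0 / 3.0   # triangular kernel: integral of K(t)^2 dt

exact_curvature = 3.0 / (8.0 * math.sqrt(math.pi))  # about 0.21157
rounded_curvature = round(exact_curvature, 3)        # 0.212 (assumed rounding)

def constant(curvature):
    """Constant c in h_opt = c * sigma * n**(-1/5) for the triangular kernel."""
    return k2 ** (-2 / 5) * roughness ** (1 / 5) * curvature ** (-1 / 5)

print(f"{constant(exact_curvature):.3f}")    # 2.576
print(f"{constant(rounded_curvature):.3f}")  # 2.575
```

Under that assumption the rounded input is enough to move the third decimal from 6 to 5.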