Why is the KDE output negative?

kernel-smoothing, scikit-learn

As far as I know, a PDF never takes negative values, but here is an example that outputs negative numbers:

http://scikit-learn.org/stable/modules/density.html

from sklearn.neighbors import KernelDensity
import numpy as np
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
kde = KernelDensity(kernel='gaussian', bandwidth=0.2).fit(X)
kde.score_samples(X)
array([-0.41075698, -0.41075698, -0.41076071, -0.41075698, -0.41075698,
       -0.41076071])

Any idea what's going on? And what's the solution?

Here is my attempt to do the same with my own data:
[Image: the asker's attempt on their own data]

Best Answer

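The results are negative because score_samples() returns the log of the density, not the density itself, and the log of any density value below 1 is negative. To recover the density, exponentiate the result with np.exp().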

From the help message:

Returns
-------
density : ndarray, shape (n_samples,)
    The array of log(density) evaluations
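
As a minimal sketch of the fix, assuming a recent scikit-learn where KernelDensity is imported from sklearn.neighbors, the example from the question can be rerun and the log densities exponentiated. The variable names log_density and density are only illustrative:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Same data and settings as in the question.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
kde = KernelDensity(kernel="gaussian", bandwidth=0.2).fit(X)

# score_samples() returns the natural log of the estimated density.
log_density = kde.score_samples(X)

# Exponentiating recovers the density, which is always positive.
density = np.exp(log_density)

print(log_density)  # negative values, about -0.41 at each point
print(density)      # positive values, about 0.663 at each point
```

Note that a density is not bounded above by 1, so with a smaller bandwidth the log density can also come out positive; only the exponentiated values are guaranteed to be non-negative.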