Multivariate Analysis – Computationally Efficient Estimation of Multivariate Mode

Tags: continuous data, estimation, mode, multivariate analysis

Short version: What's the most computationally efficient method of estimating the mode of a multidimensional data set, sampled from a continuous distribution?

Long version: I've got a data set whose mode I need to estimate. The mode does not coincide with the mean or median. A sample is shown below; this is a 2D example, but an N-D solution would be better:
[Figure: 2D sample from the data set]

Currently, my method is:

1. Calculate a kernel density estimate (KDE) on a grid whose spacing equals the desired resolution of the mode.
2. Look for the grid point with the greatest estimated density.

Obviously, this evaluates the KDE at a lot of implausible points, which is especially costly when there are many data points in high dimensions or when I want fine resolution on the mode.
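For reference, the grid approach described above can be sketched as follows. This is a minimal illustration and not the asker's actual code: the Gaussian kernel, the bandwidth of 0.4, the helper names `kde_value` and `grid_mode`, and the two-cluster test data are all assumptions made for the example. It uses only the Python standard library.

```python
import itertools
import math
import random


def kde_value(point, data, bandwidth):
    """Gaussian KDE at `point`, up to a constant factor (enough for an argmax)."""
    total = 0.0
    for x in data:
        sq_dist = sum((p - xi) ** 2 for p, xi in zip(point, x))
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
    return total / len(data)


def grid_mode(data, bandwidth, points_per_axis):
    """Return the grid point with the largest KDE value.

    The grid covers the bounding box of the data, so the KDE is evaluated
    points_per_axis ** dim times, and each evaluation visits every data point.
    """
    dim = len(data[0])
    axes = []
    for d in range(dim):
        lo = min(x[d] for x in data)
        hi = max(x[d] for x in data)
        step = (hi - lo) / (points_per_axis - 1)
        axes.append([lo + i * step for i in range(points_per_axis)])
    return max(itertools.product(*axes),
               key=lambda g: kde_value(g, data, bandwidth))


# Hypothetical test data: a dense cluster near (1, -2) plus a smaller one
# near (4, 1), so that the mode differs from the mean.
rng = random.Random(0)
data = [(rng.gauss(1.0, 0.5), rng.gauss(-2.0, 0.5)) for _ in range(210)]
data += [(rng.gauss(4.0, 0.5), rng.gauss(1.0, 0.5)) for _ in range(90)]

grid_estimate = grid_mode(data, bandwidth=0.4, points_per_axis=25)
print(grid_estimate)  # should land near the dense cluster at (1, -2)
```

The cost is the point of the sketch: with `points_per_axis` grid points per dimension the loop performs `points_per_axis ** dim` KDE evaluations, which is what makes the approach impractical in higher dimensions.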

An alternative would be to use simulated annealing, a genetic algorithm, etc. to find the global peak of the KDE.

The question is whether there is a smarter method of performing this calculation.

Best Answer

The method that fits the bill for what you want to do is the mean-shift algorithm. Essentially, mean-shift iteratively moves a point along the direction of the density gradient, which is estimated non-parametrically with the "shadow" $K'$ of a given kernel $K$. To wit, if the density $f(x)$ is estimated with $K$, then $\nabla f(x)$ is estimated with $K'$. The details of estimating the gradient of a kernel density are described in Fukunaga and Hostetler (1975), which also introduced the mean-shift algorithm.
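As a rough illustration of the idea, here is a minimal mean-shift sketch in pure Python. It assumes a Gaussian kernel, for which each update replaces the current point by the kernel-weighted average of the data, and that step points along the estimated density gradient. The bandwidth of 0.4, the function names, the test data, and the multi-start strategy for picking the global peak are assumptions made for this example, not something taken from the references.

```python
import math
import random


def kernel_weights(point, data, bandwidth):
    """Gaussian kernel weight of every data point relative to `point`."""
    return [math.exp(-0.5 * sum((p - xi) ** 2 for p, xi in zip(point, x))
                     / bandwidth ** 2) for x in data]


def kde_value(point, data, bandwidth):
    """Gaussian KDE at `point`, up to a constant factor."""
    return sum(kernel_weights(point, data, bandwidth)) / len(data)


def mean_shift_step(point, data, bandwidth):
    """One update: move to the kernel-weighted mean of the data."""
    weights = kernel_weights(point, data, bandwidth)
    total = sum(weights)
    return [sum(w * x[d] for w, x in zip(weights, data)) / total
            for d in range(len(point))]


def mean_shift_mode(start, data, bandwidth, tol=1e-6, max_iter=500):
    """Iterate mean-shift steps until the point moves by less than `tol`."""
    point = list(start)
    for _ in range(max_iter):
        new_point = mean_shift_step(point, data, bandwidth)
        shift = math.dist(point, new_point)
        point = new_point
        if shift < tol:
            break
    return point


def global_mode(data, bandwidth, starts):
    """Run mean-shift from each start and keep the highest-density endpoint."""
    candidates = [mean_shift_mode(s, data, bandwidth) for s in starts]
    return max(candidates, key=lambda c: kde_value(c, data, bandwidth))


# Hypothetical test data: a dense cluster near (1, -2) plus a smaller one
# near (4, 1), so that the mode differs from the mean.
rng = random.Random(0)
data = [(rng.gauss(1.0, 0.5), rng.gauss(-2.0, 0.5)) for _ in range(210)]
data += [(rng.gauss(4.0, 0.5), rng.gauss(1.0, 0.5)) for _ in range(90)]

# Starting from data points keeps the kernel weights away from zero.
mode = global_mode(data, bandwidth=0.4, starts=data[::30])
print(mode)  # should land near the dense cluster at (1, -2)
```

Each start only ever evaluates the kernel along its own path to a local peak, so the cost scales with the number of starts and iterations rather than with a grid over the whole space. Because mean-shift finds local maxima, several starts are used here and the endpoint with the highest estimated density is kept.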

A very detailed exposition on the algorithm is also given in this blog entry.

REFERENCES:

K. Fukunaga and L. Hostetler, "The estimation of the gradient of a density function, with applications in pattern recognition," IEEE Transactions on Information Theory 21(1), January 1975.

Related Solutions

Solved – Computationally efficient Gaussian MAP estimation algorithm in MATLAB

While this method doesn't do anything clever in terms of structure or algorithm, it is quicker in MATLAB to do the following:

Notice that \begin{equation} \hat{x}=\Sigma(\Sigma+\sigma^2 I)^{-1}y \end{equation}

or even better (after seeing Alexey's answer) \begin{equation} \hat{x}=y - \left(I+\frac{1}{\sigma^2}\Sigma\right)^{-1}y \end{equation}
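The two expressions agree (with the $\sigma^2$ inside the first inverse understood as $\sigma^2 I$), as a short check shows. Adding and subtracting $\sigma^2 I$ in the leading factor gives

\begin{equation}
\begin{aligned}
\Sigma(\Sigma+\sigma^2 I)^{-1}
&= \left[(\Sigma+\sigma^2 I)-\sigma^2 I\right](\Sigma+\sigma^2 I)^{-1} \\
&= I-\sigma^2(\Sigma+\sigma^2 I)^{-1} \\
&= I-\left(I+\frac{1}{\sigma^2}\Sigma\right)^{-1}.
\end{aligned}
\end{equation}

Applying both sides to $y$ recovers the second form. If $\Sigma$ is invertible, the same matrix can also be written as $(I+\sigma^2\Sigma^{-1})^{-1}$, which for $\sigma^2=1$ is the expression with two explicit inverses.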

We can compare these in MATLAB (taking $\sigma^2 = 1$) to the initial naive implementation using the code

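n = 2000;

A = rand(n);    % stand-in for Sigma (timing only; not a valid covariance matrix)
y = rand(n,1);

% Naive: two explicit inverses, (I + inv(A))^(-1) * y
tic
inv(eye(n) + inv(A))*y;
toc

% Mine: A * (A + I)^(-1) * y, using one linear solve
tic
B = A+eye(n);
A*(B\y);
toc

% Alexey: y - (A + I)^(-1) * y, using one linear solve
tic
B = A+eye(n);
y - B\y;
toc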

I get timings of

Elapsed time is 1.126651 seconds.
Elapsed time is 0.246517 seconds.
Elapsed time is 0.202166 seconds.

To get any further significant speedup, I'm guessing you would have to start taking advantage of the structure of your matrices, if possible.
