[Math] A metric for comparing two heatmaps

st.statistics

Say I have two heatmaps:
Each pixel of the heatmap represents a certain probability.

One heatmap is derived from empirical data, and the other heatmap is generated by an algorithm that is designed to simulate the natural process that underlies the empirical data.

I wish to tune the algorithm to make the generated heatmap match up as closely to the empirical heatmap as possible, but this is difficult without a proper metric to actually make a comparison. Thus, I wish to implement a metric that can return a value from 0 to 1 to make this comparison.

I am currently considering vector distance, mutual information, and KL divergence. I am curious whether anyone has experience or advice regarding this. Thanks!

Best Answer

If this is a high-stakes computation, on which you're willing to spend some computational effort, you could use an $h_{-1}$-Sobolev norm - in effect, compute the Fourier coefficients of the difference of the heatmaps, discount them by the wavenumber, and sum them up. I'm writing this from memory, so please check the literature.

$$ d( A, B )^2 = \sum_{k} \frac{ \vert \mathcal{F}(A - B)_k \vert^2 }{ 1 + (2\pi \vert k \vert)^2 }$$

where $A, B$ are the heatmaps, $\mathcal{F}(A - B)_k$ is the Fourier coefficient associated with the wavevector $k = (k_x, k_y)$, and $\vert k \vert^2 = k_x^2 + k_y^2$.

This has a "low-pass-filter" effect on the difference: unlike a plain $l_2$ norm, which weights variations at all scales equally, the $h_{-1}$-Sobolev norm discounts variations in small-scale details and emphasizes alignment of the heatmaps at the large-scale level first. Since one of your heatmaps is computational and the other empirical, you are bound to have small-scale variations that you don't care about.
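For concreteness, here is a minimal NumPy sketch of the above. The function name `sobolev_distance` is mine, and I assume the heatmaps live on the unit square so that wavenumbers are integers; adapt the normalization and wavenumber convention to your setting.

```python
import numpy as np

def sobolev_distance(A, B):
    """h_{-1}-Sobolev distance between two equally sized 2-D heatmaps.

    Sums the squared Fourier coefficients of A - B, each discounted
    by 1 + (2*pi*|k|)^2, then takes the square root.
    """
    diff = A - B
    # Orthonormal FFT so the sum over k matches the formula above
    # up to an overall normalization constant.
    F = np.fft.fft2(diff, norm="ortho")
    n_rows, n_cols = diff.shape
    # Integer wavenumbers per axis, assuming a unit-square domain.
    ky = np.fft.fftfreq(n_rows, d=1.0 / n_rows)
    kx = np.fft.fftfreq(n_cols, d=1.0 / n_cols)
    k2 = ky[:, None] ** 2 + kx[None, :] ** 2   # |k|^2 = k_x^2 + k_y^2
    weights = 1.0 + (2.0 * np.pi) ** 2 * k2    # 1 + (2*pi*|k|)^2
    return float(np.sqrt(np.sum(np.abs(F) ** 2 / weights)))

# Example usage on two random heatmaps normalized to probabilities:
rng = np.random.default_rng(0)
A = rng.random((64, 64)); A /= A.sum()
B = rng.random((64, 64)); B /= B.sum()
print(sobolev_distance(A, B))
```

If you need a value in $[0, 1]$ as the question asks, one simple option is to divide by the distance between each heatmap and a reference (e.g. the uniform map), but that rescaling is a design choice, not part of the norm itself.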
