[Math] Why is the maximum value of the cross-correlation achieved at the matching section?

correlation, signal-processing

I'm a bit confused and probably need some sleep.

When trying to find a short signal inside a long one (or to estimate the delay between them), it is treated as an almost trivial fact that we should look for the largest coefficient of the cross-correlation.

How is this proven?
What about cases (like the one in the picture) where both signals are positive-valued and the long signal simply has higher values in some section other than the one that matches the short signal?
What am I missing here?

Best Answer

As for a proof that the maximum of the cross-correlation is the right value to look for, the following excerpt from the Wikipedia page on cross-correlation might be enough: "As an example, consider two real valued functions $f$ and $g$ differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much $g$ must be shifted along the x-axis to make it identical to $f$. The formula essentially slides the $g$ function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of $(f \star g)$ is maximized. This is because when peaks (positive areas) are aligned, they make a large contribution to the integral. Similarly, when troughs (negative areas) align, they also make a positive contribution to the integral because the product of two negative numbers is positive."
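The sliding-product description in the quote can be tried numerically. The following is only a minimal sketch, assuming NumPy is available; the Gaussian bump, the 30-sample delay, and the variable names are made up for illustration and are not taken from the question.

```python
import numpy as np

# Hypothetical test signals: the same Gaussian bump, centred at 50 and at 80.
n = np.arange(200)
f = np.exp(-((n - 50) ** 2) / 50.0)
g = np.exp(-((n - 80) ** 2) / 50.0)   # f delayed by 30 samples

# c[k] = sum_n g[n + k] * f[n], evaluated for every possible lag k
xcorr = np.correlate(g, f, mode="full")
lags = np.arange(-(len(f) - 1), len(g))   # lag belonging to each entry of xcorr

# The lag with the largest coefficient is the estimated delay
estimated_shift = int(lags[np.argmax(xcorr)])
print(estimated_shift)
```

For these two signals the largest coefficient sits at a lag of 30 samples, which is the delay that was built in. Note that the two signals here differ only by a shift, which is exactly the situation the quoted passage assumes.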

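As for the curves in your example, notice that the cross-correlation coefficient has the product of the two curves in its numerator, but the product of the norms (magnitudes) of the two curves in its denominator. The denominator therefore rescales away the overall amplitude of the curves being compared, so a section of the long signal that merely has higher values does not score higher. By the Cauchy-Schwarz inequality the normalized coefficient is at most 1, and it equals 1 only when the section is a positively scaled copy of the short signal.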
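The situation from the question (two positive signals, with a high-valued section that does not match) can be reproduced in a few lines. This is a hedged sketch rather than a definitive implementation: the data are invented, the names `raw`, `ncc`, and `window_norm` are hypothetical, and the normalization assumed here divides by the norm of each window of the long signal without subtracting the mean.

```python
import numpy as np

# Hypothetical data: a positive template hidden in a positive long signal
template = np.array([1.0, 3.0, 2.0, 5.0, 1.0])
long_signal = np.ones(60)
long_signal[10:15] = template     # the true match starts at index 10
long_signal[40:50] = 10.0         # unrelated section with much higher values

m = len(template)

# Raw sliding product: raw[k] = sum_n long_signal[k + n] * template[n]
raw = np.correlate(long_signal, template, mode="valid")

# Norm of every length-m window of the long signal, and of the template
window_norm = np.sqrt(np.correlate(long_signal ** 2, np.ones(m), mode="valid"))
template_norm = np.linalg.norm(template)

# Normalized coefficient, bounded above by 1 (Cauchy-Schwarz)
ncc = raw / (window_norm * template_norm)

raw_peak = int(np.argmax(raw))    # lands inside the high-valued section
ncc_peak = int(np.argmax(ncc))    # lands on the true match
print(raw_peak, ncc_peak)
```

With these numbers the raw product is 40 at the true position but 120 inside the high plateau, so the unnormalized maximum points to the wrong place. After dividing by the norms, the true position scores 1 while the plateau scores about 0.85, so the normalized maximum is at index 10.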
