[Math] Understanding convolution in signals and systems

signal processing

Hi: I've been reading introductions to signals and systems, but my background is probability and statistics. In probability, the concept of convolution makes perfect sense to me. If $T = X + Y$, where $X$ and $Y$ are independent random variables with densities $f$ and $h$, then the density of $T$ is $g(t) = \int f(\tau) h(t-\tau) \, d\tau$. This is because the integral accounts for all the various ways that $x$ and $y$ can add to become the value $t$. A similar argument holds for the discrete case.
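For instance, in the discrete case the distribution of the sum of two independent random variables is the discrete convolution of their distributions. A minimal numerical sketch (two fair dice are my own illustrative example, not from any text):

```python
# The distribution of X + Y for two independent fair dice is the
# convolution of the two individual distributions.
import numpy as np

die = np.full(6, 1 / 6)          # P(X = i) for i = 1..6
p_sum = np.convolve(die, die)    # P(X + Y = s) for s = 2..12

for s, p in zip(range(2, 13), p_sum):
    print(f"P(X + Y = {s}) = {p:.4f}")   # e.g. P(X + Y = 7) = 0.1667
```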

But when I'm reading the various "intro to signals and systems" texts, they describe a step-by-step process where you flip the function $h$ over the y-axis so that it becomes $h(-\tau)$, shift it by $t$ to get $h(t-\tau)$, and then keep moving it to the right step by step, calculating the area (i.e., the value of the integral at each time $t$) covered by the overlapping values of $f$ and $h$. This doesn't make sense to me. In other words, is there some analog in signals and systems to the "various ways that $x$ and $y$ can add to become the value $t$" idea in probability? Clearly, in the signals and systems framework, $t$ is not a random variable, so the interpretation has to be totally different. But I'm wondering if there is some way of REALLY understanding why this is done. Maybe there is some physical reason due to what the function $h$ actually represents. They keep referring to it as the impulse response function, so maybe my lack of understanding comes from not understanding what the impulse response function really represents? I just don't get why you do the flip over the y-axis thing and then move it slowly to the right step by step, figuring out the overlapped areas at each shift.
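(To be concrete about the recipe I mean, here is the flip-shift-overlap procedure written out for discrete sequences; the values of $f$ and $h$ are arbitrary illustrative numbers. I can follow these mechanics, I just don't see why they are the right thing to do.)

```python
import numpy as np

def flip_and_slide(f, h):
    """Discrete convolution done the textbook way: for each output time t,
    evaluate the flipped-and-shifted copy of h, i.e. k -> h[t - k],
    and sum its overlap with f."""
    n = len(f) + len(h) - 1
    y = np.zeros(n)
    for t in range(n):                       # slide the flipped h to shift t
        for k in range(len(f)):
            if 0 <= t - k < len(h):          # the overlap region at shift t
                y[t] += f[k] * h[t - k]      # h[t - k] is the flipped, shifted h
    return y

f = np.array([1.0, 2.0, 3.0])   # arbitrary illustrative sequences
h = np.array([0.5, 1.0, 0.25])
print(flip_and_slide(f, h))     # [0.5  2.   3.75 3.5  0.75]
print(np.convolve(f, h))        # identical result
```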

My take is that it's really essential to understand convolution in signals and systems, or else you cannot go any further. So I stopped and decided to ask here, because every book seems to give the same step-by-step overlap explanation and I'm continually stumped by it. I realize it's a lot to ask for an explanation, but maybe someone knows of a text that explains the WHY part of the process, or possibly relates it to the convolution in probability. Thank you very much for any help, wisdom, references, links, etc.

                                                                   Mark

Best Answer

It might help to look at a discrete time system.

Suppose you have a linear time-invariant system with impulse response $t \mapsto h_t$; that is, $h$ is the output produced by the unit-impulse input $u = 1_{\{0\}}$ (one at $t = 0$ and zero everywhere else).

By linearity, if the input is $u = \sum_k u_k 1_{\{k\}}$ (that is, $u = (u_0, u_1, \ldots)$), then the output is the sum of the responses to each separate input $u_k 1_{\{k\}}$, and by time invariance each of those responses is a delayed, scaled copy of the impulse response.

At time $t$, the input $ u_0 1_{\{0\} }$ will contribute $u_0 h_{t-0}$.

At time $t$, the input $ u_1 1_{\{1\} }$ will contribute $u_1 h_{t-1}$.

At time $t$, the input $ u_k 1_{\{k\} }$ will contribute $u_k h_{t-k}$.

Etc, etc.

Combining these contributions gives the response $y_t = \sum_k h_{t-k} u_k$, which is exactly the convolution sum.
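In code, the superposition argument looks like this; a minimal sketch with made-up values for $h$ and $u$ (not from any particular system). Each input sample contributes a delayed, scaled copy of the impulse response, and the accumulated total matches the convolution:

```python
# Superposition: the output is the sum of delayed, scaled impulse responses.
import numpy as np

h = np.array([1.0, 0.5, 0.25])   # assumed impulse response h_0, h_1, h_2
u = np.array([2.0, 0.0, 1.0])    # assumed input u_0, u_1, u_2

y = np.zeros(len(u) + len(h) - 1)
for k, u_k in enumerate(u):
    # the input sample u_k 1_{k} contributes u_k * h, delayed by k
    y[k:k + len(h)] += u_k * h

print(y)                   # [2.   1.   1.5  0.5  0.25]
print(np.convolve(u, h))   # identical: y_t = sum_k h_{t-k} u_k
```

Notice there is no explicit "flipping" here at all; the flip in the textbook picture is just what the index pattern $h_{t-k}$ looks like when you fix $t$ and let $k$ run.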

For continuous systems, we can informally think of $u(t) = \int u(\tau) \delta(t-\tau) \, d\tau$, i.e., the input as a continuum of scaled, delayed impulses. For a fixed $\tau$, the 'input' $t \mapsto u(\tau) \delta(t-\tau)$ results in a contribution $t \mapsto u(\tau) h(t-\tau)$, hence the total combined response is $y(t) = \int u(\tau) h(t-\tau) \, d\tau$. This is the same integral as in probability, with a different reading: $h(t-\tau)$ now measures how much an input applied at time $\tau$ still matters at time $t$, rather than the density of the second summand taking the value $t - \tau$.
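Numerically, the continuous integral can be approximated by a Riemann sum. A rough sketch, assuming the illustrative impulse response $h(t) = e^{-t}$ for $t \ge 0$ (a first-order lag, e.g. an RC circuit) and a unit-step input, whose exact response $1 - e^{-t}$ we can compare against:

```python
# Approximate y(t) = integral of u(tau) h(t - tau) dtau by a Riemann sum.
import numpy as np

dt = 0.001
t = np.arange(0.0, 5.0, dt)
h = np.exp(-t)           # assumed impulse response h(t) = exp(-t), t >= 0
u = np.ones_like(t)      # unit-step input

y = np.convolve(u, h)[:len(t)] * dt   # discretized convolution integral
exact = 1.0 - np.exp(-t)              # known analytic step response

print(np.max(np.abs(y - exact)))      # small discretization error (~dt)
```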