[Math] Scale Space – Scales and Octaves

image processing, signal processing

So I'm desperately trying to understand scale space for signals, specifically for 2D images…

I'm having trouble with algorithms that describe building a pyramid. Specifically, I don't understand how we move between resolutions (octaves). Within a pyramid, I realize that an octave corresponds to a change in resolution (a subsampling step), but I don't understand how it relates to the scale parameter. (I understand that after enough smoothing, the image can be subsampled without losing much information.)

For example, if we have five signals/images in an octave which have been iteratively smoothed with a Gaussian, how do the images on either side of an octave boundary relate to each other (Image 5, the last of one octave, and Image 6, the first of the next)? Is Image 6 subsampled from Image 1 or from Image 5?

And what about the scale parameter, which comes up in many articles on scale space: how does it relate to subsampling?

I've been reading papers all day, my head is swimming and I'm still confused.


This passage from Wikipedia addresses what I'm trying to understand, but I'm having a lot of trouble following it:

An image pyramid is a discrete representation in which a scale space is sampled in both space and scale. For scale invariance, the scale factors should be sampled exponentially, for example as integer powers of 2 or root 2. When properly constructed, the ratio of the sample rates in space and scale are held constant so that the impulse response is identical in all levels of the pyramid. Fast, O(N), algorithms exist for computing a scale invariant image pyramid in which the image or signal is repeatedly smoothed then subsampled. Values for scale space between pyramid samples can easily be estimated using interpolation within and between scales.
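One way to make the "sampled exponentially" part concrete is to tabulate the scale parameter per octave. The sketch below assumes a SIFT-style convention that is not spelled out in the quoted passage: within an octave sigma grows by a constant factor k = 2^(1/intervals), so it doubles across the octave, and the next octave starts from the last image subsampled by 2. The function name `sigma_schedule` and the defaults `sigma0=1.6` and `intervals=4` (five images per octave) are illustrative assumptions, not taken from any particular paper.

```python
import math

def sigma_schedule(sigma0=1.6, intervals=4, octaves=3):
    """Per octave, list (sigma in that octave's pixels, sigma in original pixels).

    Hypothetical helper: assumes sigma doubles across one octave and that
    each new octave halves the resolution.
    """
    k = 2.0 ** (1.0 / intervals)      # constant ratio between neighbouring scales
    schedule = []
    for octave in range(octaves):
        pixel_size = 2 ** octave      # one pixel here covers this many original pixels
        levels = []
        for i in range(intervals + 1):
            local_sigma = sigma0 * k ** i
            levels.append((local_sigma, local_sigma * pixel_size))
        schedule.append(levels)
    return schedule

schedule = sigma_schedule()
last_of_first_octave = schedule[0][-1]    # "Image 5" in the question
first_of_second_octave = schedule[1][0]   # "Image 6" in the question
```

Under these assumptions, Image 5 and Image 6 carry the same scale measured in original-image pixels (about twice `sigma0`); only the sampling grid differs. That is why, in this scheme, Image 6 would be obtained by subsampling Image 5 rather than Image 1.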

Best Answer

I found a great book discussing this topic thanks to a coworker:

Computer Vision: A Modern Approach by D. A. Forsyth and J. Ponce

There's a section labeled:

Linear Filters - Technique: Scale and Image Pyramids

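Combining the concepts from the book with the Wikipedia article's explanation of the scale parameter helped. The whole goal here is to prevent aliasing: smooth first, so that subsampling doesn't fold high frequencies back into the lower-resolution image.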
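To illustrate the aliasing point, here is a minimal sketch on a 1D signal using only the Python standard library. It is not the book's implementation: the helper names (`gaussian_kernel`, `smooth`, `pyramid`), the 3-sigma kernel radius, the border clamping, and `sigma=1.0` are all assumptions chosen for the example. The same smooth-then-subsample idea carries over to 2D images.

```python
import math

def gaussian_kernel(sigma):
    """Sampled, normalised Gaussian truncated at about 3 sigma (an assumption)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve with the Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def pyramid(signal, levels, sigma=1.0):
    """Repeatedly smooth, then keep every second sample."""
    result = [list(signal)]
    for _ in range(levels - 1):
        blurred = smooth(result[-1], sigma)
        result.append(blurred[::2])
    return result

# Highest representable frequency: +1, -1, +1, -1, ...
signal = [1.0 if i % 2 == 0 else -1.0 for i in range(64)]

naive = signal[::2]            # subsample without smoothing
safe = pyramid(signal, 2)[1]   # smooth first, then subsample
```

Here `naive` keeps only the +1 samples, so the oscillation shows up as a constant signal, which is aliasing. In `safe`, the smoothing has already removed most of that frequency, so the values away from the borders stay close to zero.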
