Solved – Statistical measure for if an image consists of spatially connected separate regions

Tags: pattern recognition, spatial, variance

Consider these two grayscale images:

[Image: river]
[Image: random]

The first image shows a meandering river pattern.
The second image shows random noise.

I am looking for a statistical measure that I can use to determine if it is likely that an image shows a river pattern.

The river image has two areas: river = high value and everywhere else = low value.

The result is that the histogram is bimodal:

[Figure: histogram of the river image]

Therefore an image with a river pattern should have a high variance.

However, so does the random image above:

River_var = 0.0269, Random_var = 0.0310

On the other hand, the random image has low spatial continuity, whereas the river image has high spatial continuity. The experimental variogram shows this clearly:
[Figure: experimental variograms of the river and random images]

In the same way that the variance "summarizes" the histogram in one number,
I am looking for a measure of spatial continuity that "summarizes" the experimental variogram.

I want this measure to "punish" high semivariance at small lags harder than at large lags, so I have come up with:

$\mathrm{svar} = \sum_{h=1}^{n} \frac{\gamma(h)}{h^2}$

Summing over lags 1 to 15 only, I get:

River_svar = 0.0228, Random_svar = 0.0488

I think that a river image should have high variance but low spatial variance, so I introduce a variance ratio:

$\mathrm{ratio} = \frac{\mathrm{var}}{\mathrm{svar}}$

The result is:

River_ratio = 1.1816, Random_ratio = 0.6337
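To make the variogram summary concrete, here is a minimal NumPy sketch of svar and the ratio. Several things in it are assumptions and not taken from the question: the semivariance is pooled over horizontal and vertical pixel pairs only, and `make_test_images` is a hypothetical helper that builds a synthetic sinuous band and a shuffled copy of it, so the printed numbers will not match the values above.

```python
import numpy as np

def make_test_images(size=64, seed=0):
    """Hypothetical test data: a sinuous bright band, and the same pixels shuffled."""
    rng = np.random.default_rng(seed)
    rows, cols = np.mgrid[0:size, 0:size]
    centre = size / 2 + 12 * np.sin(2 * np.pi * rows / 32)
    river = (np.abs(cols - centre) <= 3).astype(float)
    noise = rng.permutation(river.ravel()).reshape(river.shape)
    return river, noise

def semivariogram(img, max_lag=15):
    """Experimental semivariance gamma(h) for h = 1..max_lag.

    Assumption: pairs are taken along rows and columns only (no diagonals).
    """
    img = np.asarray(img, dtype=float)
    gamma = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        dx = img[:, h:] - img[:, :-h]   # pixel pairs h apart within a row
        dy = img[h:, :] - img[:-h, :]   # pixel pairs h apart within a column
        sq = np.concatenate([dx.ravel() ** 2, dy.ravel() ** 2])
        gamma[h - 1] = 0.5 * sq.mean()
    return gamma

def variance_ratio(img, max_lag=15):
    """ratio = var / svar, with svar = sum over h of gamma(h) / h^2."""
    lags = np.arange(1, max_lag + 1)
    svar = np.sum(semivariogram(img, max_lag) / lags ** 2)
    return np.var(img) / svar

river, noise = make_test_images()
river_ratio = variance_ratio(river)
noise_ratio = variance_ratio(noise)
print(f"river ratio = {river_ratio:.3f}, shuffled ratio = {noise_ratio:.3f}")
```

For a spatially random image, gamma(h) is roughly equal to the variance at every lag, so the ratio sits near 1 / sum(1/h^2), which is about 0.63 for lags 1 to 15. That is close to the Random_ratio reported above and gives a natural baseline for the threshold.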

My idea is to use this ratio as a decision criterion for whether an image is a river image or not: a high ratio (e.g. > 1) means river.

Any ideas on how I can improve things?

Thanks in advance for any answers!

EDIT: Following the advice of whuber and Gschneider, here are the Moran's I values of the two images, calculated with a 15×15 inverse distance weight matrix using Felix Hebeler's Matlab function:

[Image: River_M]
[Image: Random_M]

I need to summarize the results into one number for each image.
According to Wikipedia: "Values range from −1 (indicating perfect dispersion) to +1 (perfect correlation). A zero value indicates a random spatial pattern."
If I sum the squares of Moran's I over all pixels, I get:

River_sumSqM = 654.9283, Random_sumSqM = 50.0785 

There is a huge difference here, so Moran's I seems to be a very good measure of spatial continuity :-).

And here is a histogram of this value for 20 000 permutations of the river image:
[Figure: histogram of sumSqM over the permutations]

Clearly the River_sumSqM value (654.9283) is very unlikely under random permutation of the pixels, so the river image is not spatially random.
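The permutation test above can be sketched in NumPy as follows. This is not Felix Hebeler's Matlab function: the local Moran's I here uses inverse-distance weights in a (2·radius+1)² window, normalised by the full-window weight sum and zero-padded at the borders, all of which are assumptions, so the statistic is on a different scale from the sumSqM values quoted above. `make_test_images` is a hypothetical helper producing synthetic data, and 99 permutations are used instead of 20 000 to keep the run short.

```python
import numpy as np

def make_test_images(size=64, seed=0):
    """Hypothetical test data: a sinuous bright band, and the same pixels shuffled."""
    rng = np.random.default_rng(seed)
    rows, cols = np.mgrid[0:size, 0:size]
    centre = size / 2 + 12 * np.sin(2 * np.pi * rows / 32)
    river = (np.abs(cols - centre) <= 3).astype(float)
    noise = rng.permutation(river.ravel()).reshape(river.shape)
    return river, noise

def local_morans_i(img, radius=7):
    """Local Moran's I per pixel with inverse-distance weights (assumed scheme)."""
    img = np.asarray(img, dtype=float)
    z = img - img.mean()                 # deviations from the global mean
    m2 = np.mean(z ** 2)
    n_rows, n_cols = z.shape
    padded = np.pad(z, radius)           # zero deviation outside the image
    lag = np.zeros_like(z)
    w_sum = 0.0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue                 # a pixel is not its own neighbour
            w = 1.0 / np.hypot(dr, dc)
            w_sum += w
            lag += w * padded[radius + dr: radius + dr + n_rows,
                              radius + dc: radius + dc + n_cols]
    return z * (lag / w_sum) / m2

def sum_sq_moran(img, radius=7):
    return float(np.sum(local_morans_i(img, radius) ** 2))

def permutation_test(img, n_perm=99, radius=7, seed=0):
    """Return the observed statistic and a one-sided permutation p-value."""
    rng = np.random.default_rng(seed)
    observed = sum_sq_moran(img, radius)
    n_at_least = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(np.ravel(img)).reshape(np.shape(img))
        if sum_sq_moran(shuffled, radius) >= observed:
            n_at_least += 1
    return observed, (n_at_least + 1) / (n_perm + 1)

river, noise = make_test_images()
river_stat, river_p = permutation_test(river)
noise_stat = sum_sq_moran(noise)
print(f"river: {river_stat:.1f} (p = {river_p:.3f}), shuffled: {noise_stat:.1f}")
```

The p-value uses the (count + 1) / (n_perm + 1) convention, so it can never be exactly zero; with more permutations the smallest attainable value shrinks accordingly.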

Best Answer

I was thinking that a Gaussian blur acts as a low-pass filter: it leaves the large-scale structure behind and removes the high-wavenumber components. A river image should therefore survive blurring much better than a noise image.
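One way to turn the blur idea into a single number is the fraction of the image variance that survives the blur. The sketch below is only one possible reading of the suggestion. It applies the Gaussian in the Fourier domain, which assumes periodic boundaries, and the choice of sigma = 2 pixels and the `make_test_images` helper are both hypothetical.

```python
import numpy as np

def make_test_images(size=64, seed=0):
    """Hypothetical test data: a sinuous bright band, and the same pixels shuffled."""
    rng = np.random.default_rng(seed)
    rows, cols = np.mgrid[0:size, 0:size]
    centre = size / 2 + 12 * np.sin(2 * np.pi * rows / 32)
    river = (np.abs(cols - centre) <= 3).astype(float)
    noise = rng.permutation(river.ravel()).reshape(river.shape)
    return river, noise

def variance_retained_after_blur(img, sigma=2.0):
    """Fraction of variance left after a Gaussian blur of width sigma (pixels).

    The blur is applied as a multiplication in the Fourier domain, so the
    image is implicitly treated as periodic (an assumption of this sketch).
    """
    img = np.asarray(img, dtype=float)
    z = img - img.mean()
    fy = np.fft.fftfreq(z.shape[0])[:, None]   # cycles per pixel, vertical
    fx = np.fft.fftfreq(z.shape[1])[None, :]   # cycles per pixel, horizontal
    # Fourier transform of a unit-area Gaussian with standard deviation sigma
    transfer = np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * (fx ** 2 + fy ** 2))
    blurred = np.fft.ifft2(np.fft.fft2(z) * transfer).real
    return blurred.var() / z.var()

river, noise = make_test_images()
river_kept = variance_retained_after_blur(river)
noise_kept = variance_retained_after_blur(noise)
print(f"variance kept: river = {river_kept:.3f}, shuffled = {noise_kept:.3f}")
```

For white noise the retained fraction is roughly 1 / (4·pi·sigma²), so it drops quickly as sigma grows, while a structure wider than sigma keeps most of its variance.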

You could also look at the scale of the wavelets required to generate the image. If all the information lives in the small-scale wavelets, then the image is probably not a river.
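As a rough illustration of the wavelet idea, the sketch below computes how the detail energy of an orthonormal 2D Haar transform is distributed across scales. The Haar wavelet, the four decomposition levels, and the `make_test_images` helper are assumptions chosen for simplicity; the answer does not name a particular wavelet family.

```python
import numpy as np

def make_test_images(size=64, seed=0):
    """Hypothetical test data: a sinuous bright band, and the same pixels shuffled."""
    rng = np.random.default_rng(seed)
    rows, cols = np.mgrid[0:size, 0:size]
    centre = size / 2 + 12 * np.sin(2 * np.pi * rows / 32)
    river = (np.abs(cols - centre) <= 3).astype(float)
    noise = rng.permutation(river.ravel()).reshape(river.shape)
    return river, noise

def haar_energy_by_scale(img, levels=4):
    """Share of Haar detail energy at each level, finest scale first."""
    a = np.asarray(img, dtype=float)
    a = a - a.mean()
    energies = []
    for _ in range(levels):
        if a.shape[0] % 2 or a.shape[1] % 2:
            raise ValueError("image sides must be divisible by 2**levels")
        tl, tr = a[0::2, 0::2], a[0::2, 1::2]   # corners of each 2x2 block
        bl, br = a[1::2, 0::2], a[1::2, 1::2]
        approx = (tl + tr + bl + br) / 2.0       # orthonormal 2D Haar step
        d_cols = (tl - tr + bl - br) / 2.0       # left minus right
        d_rows = (tl + tr - bl - br) / 2.0       # top minus bottom
        d_diag = (tl - tr - bl + br) / 2.0       # diagonal difference
        energies.append(np.sum(d_cols ** 2) + np.sum(d_rows ** 2)
                        + np.sum(d_diag ** 2))
        a = approx                               # recurse on the coarser image
    energies = np.array(energies)
    return energies / energies.sum()

river, noise = make_test_images()
river_scales = haar_energy_by_scale(river)
noise_scales = haar_energy_by_scale(noise)
print("river   :", np.round(river_scales, 3))
print("shuffled:", np.round(noise_scales, 3))
```

For white noise about three quarters of the energy lands in the finest level, so a finest-scale share well below that suggests larger-scale structure.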

You might also consider correlating each line of the image with its neighbour. If you take a row of pixels from the river image, even with noise, and compute the cross-correlation function with the next row, you can find both the location and the value of the peak. That peak value will be much higher than what you get with random noise. A column of pixels is not going to produce much of a signal unless you pick one from the region where the river is.
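The row-by-row idea can be sketched with `np.correlate`. The normalisation (mean-removed rows divided by the product of their norms) and averaging the peak over all adjacent row pairs are assumptions of this sketch, and `make_test_images` is again a hypothetical helper for synthetic data.

```python
import numpy as np

def make_test_images(size=64, seed=0):
    """Hypothetical test data: a sinuous bright band, and the same pixels shuffled."""
    rng = np.random.default_rng(seed)
    rows, cols = np.mgrid[0:size, 0:size]
    centre = size / 2 + 12 * np.sin(2 * np.pi * rows / 32)
    river = (np.abs(cols - centre) <= 3).astype(float)
    noise = rng.permutation(river.ravel()).reshape(river.shape)
    return river, noise

def row_peak_correlation(img):
    """Mean peak of the normalised cross-correlation between adjacent rows.

    Also returns the lag (in pixels) at which each peak occurs.
    """
    img = np.asarray(img, dtype=float)
    peaks, lags = [], []
    for a, b in zip(img[:-1], img[1:]):
        a0, b0 = a - a.mean(), b - b.mean()
        denom = np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2))
        if denom == 0:
            continue                      # a constant row carries no signal
        xc = np.correlate(a0, b0, mode="full") / denom
        k = int(np.argmax(xc))
        peaks.append(xc[k])
        lags.append(k - (len(b0) - 1))    # index 0 of "full" is lag -(N - 1)
    return float(np.mean(peaks)), np.array(lags)

river, noise = make_test_images()
river_peak, river_lags = row_peak_correlation(river)
noise_peak, _ = row_peak_correlation(noise)
print(f"mean peak: river = {river_peak:.3f}, shuffled = {noise_peak:.3f}")
```

The peak lags trace how far the band moves from one row to the next, so for a river they should change smoothly down the image, whereas for noise they jump around.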

http://en.wikipedia.org/wiki/Gaussian_blur

http://en.wikipedia.org/wiki/Cross-correlation
