Google Earth Engine – Scaling Behavior of Discrete vs Continuous Image Data

Tags: floating point, google-earth-engine, histogram, integer, scale

I have a simple question related to how GEE determines continuous vs discrete data, specifically during scaling operations as described here.

I am unable to find a proper description of this, but my assumption is that float data is always treated as continuous and integer data always as discrete, so that an integer image would need to be recast to float to be scaled as continuous, and vice versa.

Can anyone confirm this?

In any case, there are a couple of things that confuse me in the example I give below.

First, the printAtScale function (taken and adapted from here) returns the same results (integers) for the integer and the float data; for the latter I would expect decimal numbers representing the mean over the area.

Second, a histogram of integer data shows discrete x-axis values when plotting a single image band, but value ranges when plotting two images together, which suggests continuous values (maybe just a glitch in the plotting?).

Third, the counts (n) are decimal numbers (data not shown), which makes no sense in a histogram.

I realize the histogram issues are not necessarily related to the scale question, but the latter came up while creating the histograms.

// Test behavior of scaling

var geometry = /* color: #0B4A8B */ee.Geometry.Point([-99.6759033203125, 47.77185170705089]);

var COL_FILTER = ee.Filter.and(
    ee.Filter.bounds(geometry),
    ee.Filter.date('2021-05-01', '2021-05-15'));
    
var dwCol = ee.ImageCollection('GOOGLE/DYNAMICWORLD/V1').filter(COL_FILTER);

print(dwCol.getInfo());

// Grab the first two images from the collection and keep only the 'label' band.
var dwImList = dwCol.toList(dwCol.size());
var dwLab1 = ee.Image(dwImList.get(0)).select('label');
var dwLab2 = ee.Image(dwImList.get(1)).select('label');
// Two-band image: the label band from both dates.
var dwLab = dwLab1.addBands(dwLab2);
// Cast the integer label band to float for the continuous comparison.
var dwLab1cont = dwLab1.toFloat();

print(dwLab1.getInfo());
print(dwLab1cont.getInfo());
// print(dwLab2.getInfo());
// print(dwLab.getInfo());

var printAtScale = function(image, scale) {
  print('Pixel value at '+scale+' meters scale',
    image.reduceRegion({
      reducer: ee.Reducer.first(),
      geometry: image.geometry().centroid(),
      // The scale determines the pyramid level from which to pull the input
      scale: scale
  }).get('label'));
};

printAtScale(dwLab1, 10); 
printAtScale(dwLab1, 50); 
printAtScale(dwLab1, 100); 
printAtScale(dwLab1, 500); 

printAtScale(dwLab1cont, 10); 
printAtScale(dwLab1cont, 50); 
printAtScale(dwLab1cont, 100); 
printAtScale(dwLab1cont, 500); 



var chart = 
    ui.Chart.image.histogram({image: dwLab, scale: 500})
        .setSeriesNames(['Land Cover 1', 'Land Cover 2'])
        .setOptions({
          title: 'Two Dates',
          hAxis: {
            title: 'Cover Class',
            titleTextStyle: {italic: false, bold: true},
          },
          vAxis:
              {title: 'Count', titleTextStyle: {italic: false, bold: true}},
          colors: ['cf513e', '1d6b99']
        });
print(chart);

var chart2 = 
    ui.Chart.image.histogram({image: dwLab1, scale: 500})
        .setSeriesNames(['Most Likely Land Cover'])
        .setOptions({
          title: 'One date. Integer',
          hAxis: {
            title: 'Cover Class',
            titleTextStyle: {italic: false, bold: true},
          },
          vAxis:
              {title: 'Count', titleTextStyle: {italic: false, bold: true}},
          colors: ['cf513e']
        });
print(chart2);

[Figure: histogram of the one-band integer image]

[Figure: histogram of the two-band integer image]

Best Answer

The pyramiding policy is specified when the images are ingested, on a per-band basis. It has nothing to do with the pixel type (although the discrete bands like label are ingested with a 'mode' policy, so pixels aren't averaged at lower levels of the pyramid).
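For what it's worth, you can emulate a per-band policy on the fly rather than relying on the ingested pyramid. A minimal sketch, assuming the dwLab1 image and the printAtScale helper from the script above (the 500 m target scale and maxPixels value are arbitrary choices):

// Emulate a 'mode' pyramiding policy explicitly: aggregate the 10 m label
// pixels to 500 m by taking the most frequent class in each output cell.
var labelModeAt500 = dwLab1
    .reduceResolution({reducer: ee.Reducer.mode(), maxPixels: 4096})
    .reproject({crs: dwLab1.projection(), scale: 500});

// Each 500 m pixel now holds the dominant class of its underlying 10 m pixels.
printAtScale(labelModeAt500, 500);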

Your printAtScale function is just looking for the first pixel at the given scale; that pixel will be nearest-neighbor sampled from the closest pyramid level, and that pyramid level was made by taking the mode of the 4 higher-resolution pixels, not the mean (the mean of the labels would be a meaningless value). The mean is, however, used when pyramiding the other bands.
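If you actually want the areal mean that the question expected for the float-cast band, you have to request it explicitly, since the label pyramid is built with mode. A hedged sketch, reusing dwLab1cont and printAtScale from the script above:

// Force mean aggregation on the float-cast band instead of relying on the pyramid.
var labelMeanAt500 = dwLab1cont
    .reduceResolution({reducer: ee.Reducer.mean(), maxPixels: 4096})
    .reproject({crs: dwLab1cont.projection(), scale: 500});

// This should print fractional values (the mean of the 10 m class labels),
// even though such a mean is rarely meaningful for categorical data.
printAtScale(labelMeanAt500, 500);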

The histogram function doesn't know what you're histogramming; it just counts the values that fall into each bucket. In the first case, the charting code displays the bucket limits because it doesn't want to rely on all the bands being the same type; in the second case it doesn't. In other words, it's just a display difference triggered by having more than one band; there is no difference in the computation.
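If you just want the two-band chart to bin on whole class values, the chart helper also takes maxBuckets and minBucketWidth arguments. A sketch based on the question's chart (how the bucket labels are rendered is still up to the charting code, so treat this as a workaround, not a guarantee):

// Force integer-wide buckets for the two-band histogram.
// Dynamic World's label band has 9 classes (0-8).
var chartFixedBuckets =
    ui.Chart.image.histogram({
      image: dwLab,
      scale: 500,
      maxBuckets: 10,
      minBucketWidth: 1
    })
    .setSeriesNames(['Land Cover 1', 'Land Cover 2']);
print(chartFixedBuckets);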

You're getting decimal counts because you're specifying a scale larger than the native resolution; at that resolution some pixels aren't fully populated, so they carry a partial mask reflecting that fact, and the histogram reducer uses the mask as a weight. It's hard to "undo" this with the chart helpers, but you can run your own reduceRegion with a histogram reducer and set it to be unweighted.
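A minimal sketch of that unweighted approach, assuming the dwLab1 image and geometry from the question's script (the 20 km buffer is an arbitrary region of interest):

// Unweighted histogram via reduceRegion, so partially masked pixels count as
// whole observations instead of contributing fractional weights.
var unweightedHist = dwLab1.reduceRegion({
  reducer: ee.Reducer.histogram({maxBuckets: 10}).unweighted(),
  geometry: geometry.buffer(20000),
  scale: 500,
  maxPixels: 1e9
});

print('Unweighted histogram of label', unweightedHist.get('label'));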
