Extracting Mean Values Using Google Earth Engine

Tags: big data, extract, google-earth-engine, javascript, typecast

I am trying to extract the dates and mean NDVI values of a processed Google Earth Engine image collection through the Python API. Everything works until I call getInfo(), which always fails with this error:

ee.ee_exception.EEException: String.slice, argument 'string': Invalid type. Expected: String. Actual: Float.

When I run the same code in the GEE Code Editor, it does not raise that error. Here is the sample code:

// Initialize //
var roi = ee.Geometry.Polygon(
        [[[122.10367346625048, 7.389991097840974],
          [122.10670974995855, 7.389948538640855],
          [122.10667756321027, 7.392938312477742],
          [122.1036841951634, 7.392938312477729]]]);
var collection = ee.ImageCollection("LANDSAT/LE07/C01/T1_SR")
  .filterDate('1999-01-01', '2018-12-31')
  .filterBounds(roi)
  .select(['B3','B4','pixel_qa']);
print('Collection size (Initial): ', collection.size());
var cfilt = 0; // Threshold for cloud cover %


// Cloud Filtering //
var ctROI = ee.Number(collection.first().select('B3').reduceRegion({ // # of ROI pixels
  reducer:ee.Reducer.count(),
  geometry: roi
  }).get('B3')); 

var cmaskL457sr = function(image){
    var qa = image.select('pixel_qa');
    var cloud = qa.bitwiseAnd(1 << 5).and(qa.bitwiseAnd(1 << 7)).or(qa.bitwiseAnd(1 << 3)).or(qa.bitwiseAnd(1 << 4));
    var mask2 = image.mask().reduce(ee.Reducer.min());
    return image.updateMask(cloud.not()).updateMask(mask2);
};

var cloudScore = function(image){
    var ctCov = image.select('pixel_qa').reduceRegion({
      reducer: ee.Reducer.count(),
      geometry: roi
    });
    var ctpct = ee.Number(100).subtract(ee.Number(ctCov.get('pixel_qa')).divide(ctROI).multiply(100));
    return image.set('cloud_cover_percentage', ctpct);
};

var cMasked = collection.map(cmaskL457sr); // Apply cloud mask
var dcFiltered = cMasked.map(cloudScore).filter(ee.Filter.lte('cloud_cover_percentage', cfilt)); // Apply cloud scoring and filter to threshold
print('Collection size (After cloud filtering): ',dcFiltered.size());


// NDVI Calc //
var addNDVI = function(image){
    var ndvi = image.normalizedDifference(['B4', 'B3']);
    var meanNDVI = ndvi.reduceRegion(ee.Reducer.mean(), roi);
    return image.addBands(ndvi).set(meanNDVI); // Attach mean NDVI values
};

var final = dcFiltered.map(addNDVI).select('nd'); // Final image collection 


// Fetch Data //
var dates_list = ee.List(final.aggregate_array('system:time_start')).getInfo();
var ndmeans_list = ee.List(final.aggregate_array('nd')).getInfo();

Any ideas on why getInfo() fails? Why does the error say that it expects a string, when getInfo() is used to fetch many different kinds of data? It also seems odd to me that the exact same code works in the GEE Code Editor but not in the Python version (note that the Python code works fine except for that last getInfo() part). Is there some special casting required when working with the Python API? I have tried various casts, but none have worked so far.

Lastly, if I later want to generate a seriesByRegion-style time series for a massive (50K+) FeatureCollection, what would be a better approach than looping my current code over every region in it?

Best Answer

I have solved it. The issue lies in a snippet for filtering ROIs that I left out of the original question because I thought it was unrelated to the error. Here is the EE (JavaScript) version, which runs without error:

var fidList = ee.List.sequence(0,roiFull.size().subtract(1)); // List of unique ID values inside the FeatureCollection
var fidStr = ee.String(fidList.get(idx)).slice(0,-2); // Fetch an element, idx, convert it to string then remove the decimal values (1.0 -> 1) and use it for filtering

And here is the Python version that raises the error:

import numpy as np

fidList = np.arange(roiFull.size().getInfo())  # List of unique ID values inside the FeatureCollection
# fidList[idx] is a NumPy integer here, not a Python str
fidStr = ee.String(fidList[idx])  # Fetch an element, idx, convert it to string and use it for filtering

To fix this, I apparently have to cast the value to a string on the client side before passing it to ee.String:

fidStr = ee.String(str(fidList[idx]))
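
A minimal sketch of that client-side cast, written as a small helper so it can be reused. The name to_fid_string is hypothetical and not part of the Earth Engine API; the sketch assumes the IDs are whole numbers that may arrive as ints, floats (1.0) or NumPy scalars, and it runs without Earth Engine installed. The ee.String call appears only in a comment.

```python
import operator


def to_fid_string(value):
    """Return a client-side ID as a plain str, e.g. 1.0 -> "1".

    Hypothetical helper for illustration; not an Earth Engine function.
    """
    if isinstance(value, str):
        return value
    try:
        # Exact path for Python ints and NumPy integers (both support __index__).
        return str(int(operator.index(value)))
    except TypeError:
        pass
    # Floats: drop a trailing ".0" so 1.0 becomes "1"; keep real fractions.
    as_float = float(value)
    if as_float.is_integer():
        return str(int(as_float))
    return str(as_float)


# Intended use with Earth Engine (not executed here; assumes ee is initialized):
#   fidStr = ee.String(to_fid_string(fidList[idx]))
print(to_fid_string(1.0))  # 1
print(to_fid_string(7))    # 7
```

Doing the conversion in plain Python means the server only ever receives a string, so no server-side slice is needed to strip the decimal part.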

It seems that EE is very tolerant of these conversions (perhaps because the values are ComputedObjects?) as long as the final server-side container is properly cast.
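
Once the two getInfo() calls from the question succeed, the results are plain Python lists, and system:time_start values are milliseconds since the Unix epoch. The sketch below pairs them into (date, mean NDVI) tuples. The function name to_series is hypothetical, the input values are placeholders rather than real results, and the code assumes both lists are the same length and in the same image order.

```python
from datetime import datetime, timezone


def to_series(time_starts_ms, means):
    """Pair epoch-millisecond timestamps with mean values as (ISO date, mean)."""
    if len(time_starts_ms) != len(means):
        raise ValueError("timestamp and mean lists differ in length")
    series = []
    for ms, mean in zip(time_starts_ms, means):
        if mean is None:
            # Skip entries where no mean is available.
            continue
        date = datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date()
        series.append((date.isoformat(), mean))
    return series


# Placeholder inputs for illustration only, not real results.
dates_list = [946684800000, 948067200000]
ndmeans_list = [0.5, 0.25]
series = to_series(dates_list, ndmeans_list)
print(series)
```

Checking the lengths up front guards against silently misaligned pairs if one of the lists comes back shorter than the other.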