Google Earth Engine Error – Avoid User Memory Limit Exceeded When Performing Temporal Interpolation in GEE

Tags: google-earth-engine, google-earth-engine-javascript-api

I am trying to perform a vegetation analysis in the Comunitat Valenciana region using Sentinel-2 L2A data. This is a relatively small region (about 23,250 km²) in eastern Spain. The area straddles UTM zones 30 and 31 (H30 and H31), so there are Sentinel-2 tiles from both zones. In my case I chose to work with H30, so I filtered only the images in the tiles ['30TYL', '30TYK', '30SYJ', '30SYH', '30SXH', '30SXJ', '30TXK']. I then applied cloud and shadow masking using the SCL (Scene Classification) band available at processing level 2A.

// Search criteria to filter the Sentinel-2 L2A data
var START_DATE = '2021-01-01';
var END_DATE = '2021-12-31';
var MAX_CLOUDS = 75;
var TILES = ['30TYL', '30TYK', '30SYJ', '30SYH', '30SXH', '30SXJ', '30TXK'];
var AOI_FILE = 'users/sermomon/datasets/base_data/CV_region_main_ign_etrs89';

// Area of Interest (AOI): Comunitat Valenciana region (Spain) ~23250 Km^2
var AOI = ee.FeatureCollection(AOI_FILE);
Map.centerObject(AOI);
Map.addLayer(AOI, {'color':'#f02213'}, 'Communitat Valenciana (Spain)');
print('Area of Interest', AOI);

// Load Sentinel-2 L2A data from 'COPERNICUS/S2_SR_HARMONIZED'
var s2 = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
    .filterBounds(AOI)
    .filterDate(START_DATE, END_DATE)
    .filter(ee.Filter.lte('CLOUDY_PIXEL_PERCENTAGE', MAX_CLOUDS))
    .filter(ee.Filter.inList('MGRS_TILE', TILES)); // ETRS89 30N only
print('Raw Sentinel-2 L2A image collection', s2);

// Cloud and shadow masking using the SCL band available in the L2A
// product and subset the bands of interest (B4: 'Red', B8: 'NIR', B11: 'SWIR')
var cloudShadowMask = function(image){
  var scl = image.select('SCL');
  var mask = scl.eq(4).or(scl.eq(5)).or(scl.eq(6)).or(scl.eq(11));
  return image.updateMask(mask);
};
var s2 = s2.map(cloudShadowMask).select(['B4', 'B8', 'B11']);
print('Cloud masked Sentinel-2 L2A image collection', s2);

Now I am trying to apply a temporal interpolation to fill the gaps left by masked pixels, following Ujaval Gandhi's SpatialThoughts tutorial. Although the area is not extremely large, the temporal interpolation is computationally demanding: if I try to process the whole image collection at once I get the error "User memory limit exceeded".

I was able to avoid this error by performing the temporal interpolation tile by tile. So I added a statement to select the tile to process before performing the temporal interpolation, as shown below:

//////////////// SUBSET BY TILES TO AVOID USER MEMORY LIMIT EXCEEDED ////////////////
var TILE_SUBSET = ['30TXK'];
var s2 = s2.filter(ee.Filter.inList('MGRS_TILE', TILE_SUBSET));
print('Processing Sentinel-2 L2A subset: ', TILE_SUBSET[0], s2);
////////////////////////////////////////////////////////////////////////////////////

This way I am able to process each of the tiles separately.

// Gap filling using temporal interpolation (30 days: 15 backward + 15 forward) code
// adapted from SpatialThoughts: Temporal Gap-Filling with Linear Interpolation in GEE
// https://spatialthoughts.com/2021/11/08/temporal-interpolation-gee/

var DAYS = 15; // days

// Add a timestamp band using the 'system:time_start' property. Required to perform
// the linear temporal interpolation of each pixel
var addTimestamp = function(image){
  var timeImage = image.metadata('system:time_start').rename('timestamp');
  var timeImageMasked = timeImage.updateMask(image.mask().select(0));
  return image.addBands(timeImageMasked);
};
var s2 = s2.map(addTimestamp);

// Filter used to find images before and after (in milliseconds)
var millis = ee.Number(DAYS).multiply(1000*60*60*24);
var maxDiffFilter = ee.Filter.maxDifference({
  difference: millis,
  leftField: 'system:time_start',
  rightField: 'system:time_start'
});
// Filter to find images after each image
var lessEqFilter = ee.Filter.lessThanOrEquals({
  leftField: 'system:time_start',
  rightField: 'system:time_start'
});
// Filter to find images before each image
var greaterEqFilter = ee.Filter.greaterThanOrEquals({
  leftField: 'system:time_start',
  rightField: 'system:time_start'
});

// Join1: find all images after each image inside the temporal window
var filter1 = ee.Filter.and(maxDiffFilter, lessEqFilter);
var join1 = ee.Join.saveAll({
  matchesKey: 'after',
  ordering: 'system:time_start',
  ascending: false});
var join1Result = join1.apply({
  primary: s2,
  secondary: s2,
  condition: filter1
});

// Join2: find all images before each image inside the temporal window 
// in ascending order
var filter2 = ee.Filter.and(maxDiffFilter, greaterEqFilter);
var join2 = ee.Join.saveAll({
  matchesKey: 'before',
  ordering: 'system:time_start',
  ascending: true});
var join2Result = join2.apply({
  primary: join1Result,
  secondary: join1Result,
  condition: filter2
});

// Perform a linear interpolation using the formula:
// y = y1 + (y2 - y1) * ((t - t1) / (t2 - t1)), where
// y = interpolated image,
// y1 = before image,
// y2 = after image,
// t = interpolation timestamp,
// t1 = before image timestamp,
// t2 = after image timestamp,
var interpolateImages = function(image){
  var image = ee.Image(image);
  // Get the lists of before and after images from the image properties set by
  // the joins, and mosaic them so we get a before and an after image containing
  // the closest unmasked pixel at each location
  var beforeImages = ee.List(image.get('before'));
  var beforeMosaic = ee.ImageCollection.fromImages(beforeImages).mosaic();
  var afterImages = ee.List(image.get('after'));
  var afterMosaic = ee.ImageCollection.fromImages(afterImages).mosaic();
  // Get image with before and after times
  var t1 = beforeMosaic.select('timestamp').rename('t1');
  var t2 = afterMosaic.select('timestamp').rename('t2');
  var t = image.metadata('system:time_start').rename('t');
  var timeImage = ee.Image.cat([t1, t2, t]);
  var timeRatio = timeImage.expression('(t - t1) / (t2 - t1)', {
    't': timeImage.select('t'),
    't1': timeImage.select('t1'),
    't2': timeImage.select('t2'),
  });
  // Compute an image with the interpolated image y
  var interpolated = beforeMosaic
    .add((afterMosaic.subtract(beforeMosaic).multiply(timeRatio)));
  // Replace the masked pixels in the current image with the interpolated values
  var result = image.unmask(interpolated);
  return result.copyProperties(image, ['system:time_start']);
};

// Perform interpolation in all images of the collection
var s2Fill = ee.ImageCollection(join2Result.map(interpolateImages));
print('Gap-filled Sentinel-2 L2A subset', TILE_SUBSET[0], s2Fill);

What strategies are available to process the entire region automatically? I am thinking of wrapping the interpolation process in a function and mapping it over the list of tiles, but I don't know whether this is a good option or how to implement it.

Is there any other approach to do it?
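One option, sketched below under assumptions: since `ee.ImageCollection.map()` cannot return a collection per tile, a plain client-side loop over the `TILES` array can launch one export task per tile instead. Here `interpolateCollection()` is a hypothetical wrapper standing for the `addTimestamp` / join / `interpolateImages` chain from the question, and the `assetId` path is illustrative only.

```javascript
// Hypothetical wrapper around the interpolation chain shown in the question:
// addTimestamp -> join1 -> join2 -> interpolateImages.
var interpolateCollection = function(collection){
  // ... apply addTimestamp, the two saveAll joins and interpolateImages here ...
  return collection; // placeholder
};

// TILES is a plain JavaScript array, so a client-side forEach works and
// creates one export task per tile in the Tasks tab.
TILES.forEach(function(tile){
  var subset = s2.filter(ee.Filter.eq('MGRS_TILE', tile));
  var filled = interpolateCollection(subset);
  Export.image.toAsset({
    image: filled.toBands(),                       // stack the collection into one image
    description: 'S2_gapfilled_' + tile,
    assetId: 'users/sermomon/s2_gapfilled_' + tile, // hypothetical asset path
    region: AOI.geometry(),
    scale: 10,
    maxPixels: 1e13
  });
});
```

Each task then runs in the batch environment, which has far higher memory limits than the interactive `print()`/`Map.addLayer()` path, so the per-tile subsets that fail interactively may well succeed as exports.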

Edit:

My processing chain includes only one more step before exporting as a GEE asset using batch.Download.ImageCollection.toAsset. In the last step the images of the s2Fill collection are used to create monthly average composites. To do so, I map a process which uses a mean reducer over a list of months. However, this process also exceeds memory limits, as I show below:

// Compute monthly composites of spectral indices
var months = ee.List.sequence(1, 12);
var monthlyComposite = function(month){
  var composite = s2Fill
                  .filter(ee.Filter.calendarRange(month, month, 'month'))
                  .reduce(ee.Reducer.mean());
  var firstImg = s2Fill.filter(ee.Filter.calendarRange(month, month, 'month')).first();
  var t = ee.Date(firstImg.get('system:time_start'));
  var mmm_yyyy = ee.Date(t).format('MMM_YYYY');
  return composite.set('system:id', mmm_yyyy);
};
var s2Comp = months.map(monthlyComposite);
print('Monthly composites Sentinel-2 L2A subset', s2Comp);

So it could be a good idea to export s2Fill to a new asset as an intermediate step, before calculating the monthly composites.
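A sketch of that intermediate export, under assumptions: the asset paths are hypothetical, and the target `ImageCollection` asset is assumed to exist already. `aggregate_array()` plus `evaluate()` brings the image IDs to the client so a plain loop can create one export task per image of `s2Fill`.

```javascript
// Export each gap-filled image into an ImageCollection asset, so the monthly
// compositing can later read pre-computed pixels instead of re-running the
// interpolation. 'users/sermomon/s2fill' is a hypothetical, pre-created
// ImageCollection asset.
var ids = s2Fill.aggregate_array('system:index');
ids.evaluate(function(idList){
  idList.forEach(function(id){
    var img = ee.Image(s2Fill.filter(ee.Filter.eq('system:index', id)).first());
    Export.image.toAsset({
      image: img,
      description: 'S2Fill_' + id,
      assetId: 'users/sermomon/s2fill/' + id,
      region: AOI.geometry(),
      scale: 10,
      maxPixels: 1e13
    });
  });
});
```

Once the exports finish, `ee.ImageCollection('users/sermomon/s2fill')` can replace `s2Fill` in the compositing step, which then only averages stored pixels.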

Full script here: https://code.earthengine.google.com/1e1361071016376877a5d76b42153e73

Full script here (edit): https://code.earthengine.google.com/f63e7297247c27ad632b1fd9cd01ed7d

Best Answer

I am facing exactly the same challenge with the gap-filling code by Ujaval (great piece of work!), although for a slightly different purpose (multitemporal classification, where I need to consider the phenology), and using Cloud Score+ instead of other cloud masking options. My memory limit issue also remains unsolved, so it is great to see more people talking about it.

I have tried to run the process by splitting the (quite big) AOI into tiles within a loop while exporting. The tiles were created by computing a covering grid over the AOI, which lets me play with different tile sizes (via the scale parameter). But even with small tiles (lower scale parameter) I still cannot run everything in one go; I still run out of memory. I am not sure, but could the issue be the non-tiled area of interest I use while filtering the collection? Might that be unnecessarily increasing the memory usage? Any clues on different strategies are welcome.

@NoelGorelick: imagine you want to create a bimonthly time series within a year (6 medians or means, etc.). Sometimes the clouds are so persistent, especially during the rainy season, that some areas remain masked. Then you may want to expand the time frame a bit to get at least the two closest cloud-free images to fill the gaps, but only for the masked areas, while keeping the median values within your bimonthly time range.

E.g. (Target Year 2020)

  1. Filter the collection to a wide range, e.g. 2019-11-01 to 2021-02-01.
  2. Interpolate the images within the collection, allowing e.g. 90 days backward/forward.
  3. Filter the interpolated collection to 2020-01-01 to 2021-01-01.
  4. Calculate bimonthly medians from the collection interpolated in step 2 and filtered in step 3: [[1,2], [3,4], [5,6], [7,8], [9,10], [11,12]].
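The four steps above can be sketched as follows, where `s2` is the filtered collection from the question and `interpolate()` is a hypothetical helper standing for the join-based gap-filling chain:

```javascript
// 1. Filter wide so the interpolation window has images on both sides of the year.
var wide = s2.filterDate('2019-11-01', '2021-02-01');

// 2. Interpolate, allowing e.g. 90 days backward/forward (hypothetical helper).
var filled = interpolate(wide);

// 3. Narrow the interpolated collection back to the target year.
var year = filled.filterDate('2020-01-01', '2021-01-01');

// 4. Bimonthly medians over the month pairs.
var pairs = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]];
var bimonthly = ee.ImageCollection(pairs.map(function(p){
  return year
    .filter(ee.Filter.calendarRange(p[0], p[1], 'month'))
    .median()
    .set('months', p[0] + '-' + p[1]);
}));
```

The point of the wide filter in step 1 is that images near the year's edges still find before/after neighbours, so the medians in step 4 stay within their bimonthly ranges while the gap-filling can reach outside them.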

It may not be ideal, because you may drift a bit from your phenological target, but at least you get a proxy. Sometimes it works fine; sometimes it looks like an artifact. It also helps if you are combining these with Sentinel-1 imagery, since then you are not forced to mask those areas there as well.

Indeed, this does not make much sense if you only apply the median over a single time range. You need to filter twice: once for the interpolation (step 1) and once for the calculation of the medians (step 3).

Any ideas are welcome :) @Noel, thanks for all your great posts!
