google-earth-engine – Extracting Qualified Image Patches in Google Earth Engine

clip, extract, google-earth-engine, image

I want to extract qualified (e.g., cloud-free) image patches (e.g., 128×128 pixels) from a large image (e.g., 10000×10000 pixels) in Google Earth Engine. These patches may overlap with other qualified patches. I will then feed them to TensorFlow. Searching the Earth Engine API, so far I have only found ee.Image.clip, ee.Image.clipToBoundsAndScale, and ee.ImageCollection.getRegion that may be related to my goal. But they all require a geometry, feature, or feature collection in latitude/longitude coordinates as the boundary parameter.

Is there a way to extract image patches in Earth Engine the way you slice a sub-array out of a NumPy array using indices (e.g., image[0:128, 0:128])?


Nicholas Clinton's overall flow works. I made some minor changes to the mask and the sample steps.

For the mask step, I used ee.Image.fastDistanceTransform to calculate the Chebyshev distance to the nearest non-zero pixel. The input is a normal mask with 0 and 1 swapped (SwappedMask below), so the non-zero pixels are the bad ones. Thresholding that distance then gives the buffered mask:

var Chebyshev_dist_to_Mask = SwappedMask
  .fastDistanceTransform({
    neighborhood: 46,
    units: 'pixels',
    metric: 'chebyshev'
  });
var BufferedMask = Chebyshev_dist_to_Mask
  .gt(ee.Image(33))
  .select(0)
  .toByte()
  .rename('buffered_mask');
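
To see why thresholding the Chebyshev distance acts as a buffer, here is a small plain-Python illustration. It is not Earth Engine code: the toy grid and the helper names chebyshev_dist_to_bad and buffered_mask are made up for this example, and image edges are ignored. The idea is that a pixel whose Chebyshev distance to the nearest bad pixel is greater than r has no bad pixel inside the (2r+1)×(2r+1) square window centered on it.

```python
def chebyshev_dist_to_bad(bad):
    """Brute-force Chebyshev distance from each pixel to the nearest bad pixel.

    `bad` is a list of rows; a non-zero entry marks a bad pixel
    (this plays the role of the swapped mask).
    """
    rows, cols = len(bad), len(bad[0])
    bad_pixels = [(r, c) for r in range(rows) for c in range(cols) if bad[r][c]]
    return [
        [
            # Chebyshev distance = max of the row and column offsets.
            min(
                (max(abs(r - br), abs(c - bc)) for br, bc in bad_pixels),
                default=float("inf"),  # no bad pixels anywhere
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def buffered_mask(bad, radius):
    """1 where the nearest bad pixel is more than `radius` pixels away, else 0."""
    dist = chebyshev_dist_to_bad(bad)
    return [[1 if d > radius else 0 for d in row] for row in dist]


# Toy swapped mask: a single bad pixel on the right edge of the middle row.
swapped_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

for row in buffered_mask(swapped_mask, radius=1):
    print(row)
```

With radius=1, every pixel within one step (including diagonals) of the bad pixel is zeroed out, so any 3×3 window centered on a remaining 1 is free of bad pixels.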


For the sample step, I used ee.Image.stratifiedSample instead of ee.Image.sample so that samples are drawn only from the good-pixel class.
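
A minimal Python sketch of what such a call might look like. The exact arguments used above are not given, so everything here is an assumption: the helper name sample_good_pixels is hypothetical, arrays is assumed to be the per-pixel array image (e.g., from neighborhoodToArray), and buffered_mask is assumed to be a 0/1 image whose band is named 'buffered_mask'. The classValues/classPoints pair is used to request zero points from class 0 and all points from class 1.

```python
def sample_good_pixels(arrays, buffered_mask, region, num_points=1000, scale=30):
    """Sample array-valued pixels only where the buffered mask equals 1.

    All three image/geometry arguments are expected to be Earth Engine
    objects (ee.Image, ee.Image, ee.Geometry); this helper only chains
    methods on them, so it does not import ee itself.
    """
    return arrays.addBands(buffered_mask).stratifiedSample(
        numPoints=0,                   # default per-class count, overridden below
        classBand="buffered_mask",     # assumed band name of the 0/1 mask
        region=region,
        scale=scale,
        classValues=[0, 1],            # 0 = bad pixels, 1 = good pixels
        classPoints=[0, num_points],   # draw nothing from class 0
    )
```

The returned feature collection could then be passed to the table export in place of the plain sample() result.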

Best Answer

The overall flow is: mask the image, convert it to an array image, then sample the arrays at points. Here's a sketch of the solution in Python, where classes is an image in which each pixel stores an integer label (0, 1, 2, ...), composite is a multi-band image of predictor bands, and KERNEL is whatever shape you want:

# Keep only pixels whose whole kernel neighborhood is unmasked.
connectedMask = classes.mask().reduceNeighborhood(
  reducer=ee.Reducer.min(), 
  kernel=KERNEL,
)

# Convert to an array per pixel
arrays = composite.addBands(classes).neighborhoodToArray(KERNEL)

# Sample the arrays
sample = arrays.updateMask(connectedMask).sample(
  numPixels=1000, 
  region=SOME_GEOMETRY, 
  scale=30)

# Export the result to a table.
task = ee.batch.Export.table.toCloudStorage(
  collection=sample, 
  description='Sample Export',
  fileNamePrefix='foo', 
  bucket=outputBucket,
  fileFormat='TFRecord', 
  selectors=listOfBands)
task.start()