[GIS] Cutting a portion of a satellite image based on coordinates in GDAL

gdalgeopandasosgeopandaspython

I have a satellite image of 7-channels (Basically I have seven .tif files, one for each band). And I have a .csv file with coordinates of points-of-interest that are in the region shot by the satellite. I want to cut small portions of the image in the surroundings of each coordinate point. How could I do that?

As I don't have a full working code right now, it really doesn't matter the size of those small portions of image. For the explanation of this question let's say that I want them to be 15×15 pixels. So for the moment, my final objective is to obtain a lot of 15x15x7 vectors, one for every coordinate point that I have in the .csv file. And that is what I am stucked with. (the "7" in the "15x15x7" is because the image has 7 channels)

Just to give some background in case it's relevant: I will use those vectors later to train a CNN model in keras.

This is what I did so far: (I am using jupyter notebook, anaconda environment)

  • imported gdal, numpy, matplotlib, geopandas, among other libraries.

  • Opened the .gif files using gdal, converted them into arrays

  • Opened the .csv file using pandas.

  • Created a numpy array called "imagen" of shape (7931, 7901, 3) that will host the 7 bands of the satellite image (in form of numbers). At this point I just need to know which rows and colums of the array "imagen" correspond to each coordinate point. In other words I need to convert every coordinate point into a pair of numbers (row,colum). And that is what I am stucked with.

After that, I think that the "cutting part" will be easy.

#I import libraries

from osgeo import gdal_array
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas
from geopandas import GeoDataFrame
from shapely.geometry import Point

#I access the satellite images (I just show one here to make it short)

b1 = r"E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif"
band1 = gdal.Open(b1, gdal.GA_ReadOnly)

#I open the .csv file

file_svc = "C:\\Users\\Administrador\Desktop\DeepLearningInternship\Crop Yield Prediction\Crop Type Classification model - CNN\First\T28_Pringles4.csv"
df = pd.read_csv(file_svc)
print(df.head())

That prints something like this:

   Lat1        Long1       CropingState
   -37.75737   -61.14537   Barbecho
   -37.78152   -61.15872   Verdeo invierno
   -37.78248   -61.17755   Barbecho
   -37.78018   -61.17357   Campo natural
   -37.78850   -61.18501   Campo natural
#I create the array "imagen" (I only show one channel here to make it short)

imagen = (np.zeros(7931*7901*7, dtype = np.float32)).reshape(7931,7901,7)
imagen[:,:,0] = band1.ReadAsArray().astype(np.float32)

#And then I can plot it:

plt.imshow(imagen[:,:,0], cmap = 'hot')
plt.plot()

Which plots something like this:

(https://github.com/jamesluc007/DeepLearningInternship/blob/master/Crop%20Yield%20Prediction/Crop%20Type%20Classification%20model%20-%20CNN/First/red_band.png)

I want to transform those (-37,-61) into something like (2230,1750). But I haven't figured it how yet. Any clues?

EDIT:

The satellite images are landsat-8 images. According to wikipedia the map projection is UTM. If I run this:

print(gdal.Info(band1))

then I get details of the raster coordinate system:

Driver: GTiff/GeoTIFF
Files: E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif
       E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif.aux.xml
Size is 7901, 7931
Coordinate System is:
PROJCS["WGS 84 / UTM zone 20N",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-63],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","32620"]]
Origin = (548385.000000000000000,-4030485.000000000000000)
Pixel Size = (30.000000000000000,-30.000000000000000)
Metadata:
  AREA_OR_POINT=Area
  Band_1=band 1 surface reflectance
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  548385.000,-4030485.000) ( 62d27'37.04"W, 36d25' 6.01"S)
Lower Left  (  548385.000,-4268415.000) ( 62d26'40.67"W, 38d33'46.31"S)
Upper Right (  785415.000,-4030485.000) ( 59d49' 6.53"W, 36d22'37.78"S)
Lower Right (  785415.000,-4268415.000) ( 59d43'34.98"W, 38d31' 6.21"S)
Center      (  666900.000,-4149450.000) ( 61d 6'44.82"W, 37d28'36.79"S)
Band 1 Block=7901x1 Type=Int16, ColorInterp=Gray
  Description = band 1 surface reflectance
  Min=-1806.000 Max=6547.000 
  Minimum=-1806.000, Maximum=6547.000, Mean=429.912, StdDev=178.705
  NoData Value=-9999
  Metadata:
    STATISTICS_MAXIMUM=6547
    STATISTICS_MEAN=429.9118572176
    STATISTICS_MINIMUM=-1806
    STATISTICS_STDDEV=178.70509060734

But I don't understand yet how to go on from there.

Best Answer

You can convert x,y to col, row coordinates with gdal.InvGeoTransform, and gdal.ApplyGeoTransform. You can reproject coordinates with osr.CoordinateTransformation.TransformPoint.

E.g.

from osgeo import gdal, osr

ds = gdal.Open(someraster)
gt = ds.GetGeoTransform()  # Geotransforms allow conversion of pixel to map coordinates
crs = ds.GetProjection()     

lon, lat = -61, -37  # Assume EPSG:4326 (WGS84) https://epsg.io/4326

# Reproject lon/lat to CRS of raster (UTM in this case)
source = osr.SpatialReference()
source.ImportFromEPSG(4326)  # WGS84 4326

target = osr.SpatialReference()
target.ImportFromWkt(crs)
transform = osr.CoordinateTransformation(source, target)

mx, my, z = transform.TransformPoint(lon, lat)

# Inverse GT to convert from map to pixel
inv_gt = gdal.InvGeoTransform(gt)  

# Apply the inverse GT and truncate the decimal places.
px, py = (math.floor(f) for f in gdal.ApplyGeoTransform(inv_gt, mx, my))