[GIS] Cutting a portion of a satellite image based on coordinates in GDAL

I have a satellite image of 7-channels (Basically I have seven .tif files, one for each band). And I have a .csv file with coordinates of points-of-interest that are in the region shot by the satellite. I want to cut small portions of the image in the surroundings of each coordinate point. How could I do that?

As I don't have a full working code right now, it really doesn't matter the size of those small portions of image. For the explanation of this question let's say that I want them to be 15×15 pixels. So for the moment, my final objective is to obtain a lot of 15x15x7 vectors, one for every coordinate point that I have in the .csv file. And that is what I am stucked with. (the "7" in the "15x15x7" is because the image has 7 channels)

Just to give some background in case it's relevant: I will use those vectors later to train a CNN model in keras.

This is what I did so far: (I am using jupyter notebook, anaconda environment)

imported gdal, numpy, matplotlib, geopandas, among other libraries.
Opened the .gif files using gdal, converted them into arrays
Opened the .csv file using pandas.
Created a numpy array called "imagen" of shape (7931, 7901, 3) that will host the 7 bands of the satellite image (in form of numbers). At this point I just need to know which rows and colums of the array "imagen" correspond to each coordinate point. In other words I need to convert every coordinate point into a pair of numbers (row,colum). And that is what I am stucked with.

After that, I think that the "cutting part" will be easy.

#I import libraries

from osgeo import gdal_array
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas
from geopandas import GeoDataFrame
from shapely.geometry import Point

#I access the satellite images (I just show one here to make it short)

b1 = r"E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif"
band1 = gdal.Open(b1, gdal.GA_ReadOnly)

#I open the .csv file

file_svc = "C:\\Users\\Administrador\Desktop\DeepLearningInternship\Crop Yield Prediction\Crop Type Classification model - CNN\First\T28_Pringles4.csv"
df = pd.read_csv(file_svc)
print(df.head())

That prints something like this:

   Lat1        Long1       CropingState
   -37.75737   -61.14537   Barbecho
   -37.78152   -61.15872   Verdeo invierno
   -37.78248   -61.17755   Barbecho
   -37.78018   -61.17357   Campo natural
   -37.78850   -61.18501   Campo natural

#I create the array "imagen" (I only show one channel here to make it short)

imagen = (np.zeros(7931*7901*7, dtype = np.float32)).reshape(7931,7901,7)
imagen[:,:,0] = band1.ReadAsArray().astype(np.float32)

#And then I can plot it:

plt.imshow(imagen[:,:,0], cmap = 'hot')
plt.plot()

Which plots something like this:

(https://github.com/jamesluc007/DeepLearningInternship/blob/master/Crop%20Yield%20Prediction/Crop%20Type%20Classification%20model%20-%20CNN/First/red_band.png)

I want to transform those (-37,-61) into something like (2230,1750). But I haven't figured it how yet. Any clues?

EDIT:

The satellite images are landsat-8 images. According to wikipedia the map projection is UTM. If I run this:

print(gdal.Info(band1))

then I get details of the raster coordinate system:

Driver: GTiff/GeoTIFF
Files: E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif
       E:\Imágenes Satelitales\2017\226_86\1\LC08_L1TP_226086_20170116_20170311_01_T1_sr_band1.tif.aux.xml
Size is 7901, 7931
Coordinate System is:
PROJCS["WGS 84 / UTM zone 20N",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",-63],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","32620"]]
Origin = (548385.000000000000000,-4030485.000000000000000)
Pixel Size = (30.000000000000000,-30.000000000000000)
Metadata:
  AREA_OR_POINT=Area
  Band_1=band 1 surface reflectance
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  548385.000,-4030485.000) ( 62d27'37.04"W, 36d25' 6.01"S)
Lower Left  (  548385.000,-4268415.000) ( 62d26'40.67"W, 38d33'46.31"S)
Upper Right (  785415.000,-4030485.000) ( 59d49' 6.53"W, 36d22'37.78"S)
Lower Right (  785415.000,-4268415.000) ( 59d43'34.98"W, 38d31' 6.21"S)
Center      (  666900.000,-4149450.000) ( 61d 6'44.82"W, 37d28'36.79"S)
Band 1 Block=7901x1 Type=Int16, ColorInterp=Gray
  Description = band 1 surface reflectance
  Min=-1806.000 Max=6547.000 
  Minimum=-1806.000, Maximum=6547.000, Mean=429.912, StdDev=178.705
  NoData Value=-9999
  Metadata:
    STATISTICS_MAXIMUM=6547
    STATISTICS_MEAN=429.9118572176
    STATISTICS_MINIMUM=-1806
    STATISTICS_STDDEV=178.70509060734

But I don't understand yet how to go on from there.

[GIS] Cutting a portion of a satellite image based on coordinates in GDAL

Best Answer

Related Question

Best Answer

Related Solutions

[GIS] Problem with Converting XY Coordinates to LatLon

[GIS] Getting origin as long, lat in GDAL without knowing EPSG

Related Question