If I understand you correctly I think you can solve it like this (their are comments in the code that explain what is going on):
import numpy, arcpy, random
#Establish the extent which your random samples can be within
rangeX = (100, 2500000) # Enter the actual range in x values of your rasters * 100 in order to get coordinates with decimals
rangeY = (100, 2500000) # Enter the actual range in y values of your rasters * 100 in order to get coordinates with decimals
qty = 1000 # Enter in the number greater than random points you need
#Generate random x,y coordinates
randPoints = []
while len(randPoints) < qty:
x = random.randrange(*rangeX)/100.0 # divide by 100.0 to be able to get coordinates with decimal values
y = random.randrange(*rangeY)/100.0 # divide by 100.0 to be able to get coordinates with decimal values
randPoints.append((x,y))
#Create dictionary of key and lists, list will house tuples of (x,y,z)
#Enter in actual classified values for dictionary keys
valueDict = {'Class1' : [],
'Class2' : [],
'Class3' : [],
'Class4' : []}
######Get Rasters bands as well as cell height, width, origin info to be able to get
######index of x,y location in the numpy array
arcpy.env.workspace = inPath + '\\aster.img'
bands = arcpy.ListRasters()
Ras = arcpy.Raster(inPath + '\\aster.img')
originX = Ras.extent.upperLeft.X
originY = Ras.extent.upperLeft.Y
pixelWidth = Ras.meanCellWidth
pixelHeight = Ras.meanCellHeight
#Create a list that houses each raster array
bandsList = []
for i in bands:
bandsList.append(arcpy.RasterToNumPyArray(i).astype(numpy.float32))
#loop over all of the random point locations and collect raster values at their
#locations if the dictionary entry for that value is not full populate it
#with a tuple of (x,y,z), keep going until each class is full
for i in randPoints:
X = i[0]
Y = i[1]
xOffset = int((X-originX)/pixelWidth)
yOffset = int(abs(Y-originY)/pixelHeight)
for j in range(0,len(bands)):
sampleValue = bandsList[j][yOffset, xOffset]
for key in valueDict.keys():
if sampleValue == key:
if len(valueDict[key]) < 10:
valueDict[key].append((X, Y, sampleValue))
break
else:
continue
This is a variation of a script that I have used to extract raster values at random x,y locations, so it may need some tweaking but I think the major elements are their to get the job done for you.
I made an answer with your toy lines but not with gdal (easy enough to change if you like). I decided to make a blank raster, presumably of which you know the extent? and then populate the table using extract:
# make the blank raster, the extents and resolution of which are up to you.
# the cells are numbered 1 to ncell (starting in the top left in R)
r <- raster(xmn=0,xmx=4,ymn=0,ymx=4,res=c(2,2))
r[] <- 1:ncell(r)
# make your toy.lines as above, except i've thrown in a 3rd to be daring
# why do you have length?
df <- rbind(c(1,1), c(4,4),c(3,1), c(3,4))
coord_1 <- Line(rbind(c(1,1), c(4,4)))
coord_2 <- Line(rbind(c(3,1), c(3,4)))
coord_3 <- Line(rbind(c(0,1), c(4,1)))
l1 <- Lines(list(coord_1), ID = "road1")
l2 <- Lines(list(coord_2), ID = "road2")
l3 <- Lines(list(coord_3), ID = "road3")
Sl <- SpatialLines(list(l1, l2, l3))
df <- data.frame(len = sapply(1:length(Sl), function(i) gLength(Sl[i, ])))
rownames(df) <- sapply(1:length(Sl), function(i) Sl@lines[[i]]@ID)
toy.lines <- SpatialLinesDataFrame(Sl, data = df)
toy.lines@data$polygon_id <- 1:nrow(toy.lines@data)
# use the extract function which will return a list, one element for every spatial line,
# saying which cells it passes through. Named elements by id.
l <- extract(r,toy.lines)
names(l) <- toy.lines@data$polygon_id
# make a data table of cell ID and what line it touches, unlisting the list elements
# and turning them into a long vector (and inserting that as a column)
cells.df <- data.table(cell=unlist(l),lineid=rep(as.numeric(names(l)),lapply(l, function(x) length(x))))
setkey(cells.df,cell)
# make a simple data table of all cell numbers (will enable us to keep NA cells)
all.cells <- data.table(cell=1:ncell(r))
setkey(all.cells,cell)
# merge by key. Data table orders by key and autopopulates NAs
final.df <- cells.df[all.cells]
That produces table B anyway. Unsure as to why it believes line 1 to pass through cell 4, but there you go.
Best Answer
There is an option in GDAL to rasterize polygons based on their attribute. But as far as I know it can not be string. But you can just add an attribute to your features and then give each feature a unique id. Let's say we call this field
ID
.Open your shapefile
Create the destination raster data source
Here is the important part. Instead of setting a general
burn_value
, useoptions
and set it to the attribute that contains the relevant unique value["ATTRIBUTE=ID"]
Add a spatial reference
Now you have your shapefile as a raster and can read it with
gdal.Open('temp.tif').ReadAsArray()