I have turned some R scripts for data cleaning/standardization of shapefiles into Python scripts.
Here is the R code:
x<-c(farm,field,"trialyield",2017)
file<-paste(x, collapse="_")
yield <- readOGR(".", file)
#identify column of dry yield
yield.df<-data.table::as.data.table(yield)
yield<-yield[,4]
names(yield@data)[1] <- "yield"
And here is my attempt to convert that block of R code to Python:
y = farm + ' ' + field + ' ' + 'trialyield 2017'
y = y.replace(' ', '_')
trialyield = gpd.read_file(y)
title_col = trialyield.columns[0]
name_map = dict(zip(trialyield.columns[[4]], ['yield']))
trialyield.rename(columns=name_map, inplace=True)
trialyield[[title_col, 'yield']].to_file('trialyield_output.shp')
However, I get the following AttributeError:
Traceback (most recent call last):
  File "./sommer_uofiaddington1_cleanandagg_2017.py", line 48, in <module>
    trialyield[[title_col, 'yield']].to_file('trialyield_output.shp')
  File "/opt/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'to_file'
If I recall correctly, gpd.read_file returns a GeoDataFrame.
Why am I getting a DataFrame here?
Best Answer
If you select a subset of columns from a GeoDataFrame and leave out the geometry column, you get back a plain pandas DataFrame, which has no to_file method.
For example:
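A minimal sketch of the behaviour, assuming geopandas and shapely are installed. The toy frame and its column names (`obs_id`, `yield`) are made up for illustration and merely stand in for the real shapefile:

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical toy data standing in for the trial yield shapefile.
gdf = gpd.GeoDataFrame(
    {"obs_id": [1, 2], "yield": [180.5, 172.3]},
    geometry=[Point(0, 0), Point(1, 1)],
)

# Selecting columns WITHOUT the geometry column drops down to pandas.
without_geom = gdf[["obs_id", "yield"]]

# Selecting columns WITH the geometry column keeps the GeoDataFrame type.
with_geom = gdf[["obs_id", "yield", "geometry"]]

print(type(without_geom).__name__)
print(type(with_geom).__name__)
print(hasattr(without_geom, "to_file"))
```

The first selection is the same pattern as `trialyield[[title_col, 'yield']]` in the question, which is why `to_file` is missing there.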
Change the last line in the following way, so that the geometry column is part of the selection:
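A hedged sketch of the fix: include the geometry column in the selection before calling `to_file`. The toy frame below is hypothetical and replaces the `gpd.read_file(y)` call from the question. The output goes to a temporary directory here, whereas the original script writes to `'trialyield_output.shp'`. Looking the column up through `trialyield.geometry.name` is just a precaution; after `read_file` it is normally called `'geometry'`.

```python
import os
import tempfile

import geopandas as gpd
from shapely.geometry import Point

# Hypothetical stand-in for: trialyield = gpd.read_file(y)
trialyield = gpd.GeoDataFrame(
    {"obs_id": [1, 2], "yield": [180.5, 172.3]},
    geometry=[Point(0, 0), Point(1, 1)],
)

title_col = trialyield.columns[0]
geom_col = trialyield.geometry.name  # normally 'geometry'

# Keeping the geometry column means the result is still a GeoDataFrame.
subset = trialyield[[title_col, "yield", geom_col]]

# Write to a temporary directory for this sketch.
out_path = os.path.join(tempfile.mkdtemp(), "trialyield_output.shp")
subset.to_file(out_path)
```

In the original script this amounts to replacing the last line with `trialyield[[title_col, 'yield', 'geometry']].to_file('trialyield_output.shp')`.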