I have 9,888,562 records in a dataframe and I would like to convert my lat/long values to UTM x, y. My code uses the pyproj package, but because there is so much data it takes a very long time and ultimately does not finish.
Is there another approach or package that I could use for data of this size?
import pandas as pd
from pyproj import Proj

def rule(row):
    # A new Proj object is built for every row here
    p = Proj(proj='utm', zone=10, ellps='WGS84', preserve_units=False)
    x, y = p(row["LON"], row["LAT"])
    return pd.Series({"X": x, "Y": y})

My_data = My_data.merge(My_data.apply(rule, axis=1), left_index=True, right_index=True)
Best Answer
UPDATE:
After thinking about it, the most efficient way for you to transform the coordinates is probably not to use apply at all, but to pass the whole column arrays to a Transformer.
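A minimal sketch of the column-array approach, under a few assumptions: the small frame below is a hypothetical stand-in for My_data, and EPSG:32610 (WGS 84 / UTM zone 10N) is assumed to match the zone=10 projection in the question, which only applies if the points are in the northern hemisphere.

```python
import pandas as pd
from pyproj import Transformer

# Hypothetical sample standing in for the real My_data frame.
My_data = pd.DataFrame(
    {"LON": [-123.0, -122.5, -121.9], "LAT": [45.0, 47.6, 37.3]}
)

# EPSG:4326 is WGS 84 lon/lat; EPSG:32610 is assumed to be the target
# (WGS 84 / UTM zone 10N). always_xy=True keeps the (lon, lat) argument order.
transformer = Transformer.from_crs("EPSG:4326", "EPSG:32610", always_xy=True)

# One call over the whole columns instead of one call per row.
My_data["X"], My_data["Y"] = transformer.transform(
    My_data["LON"].to_numpy(), My_data["LAT"].to_numpy()
)
```

Because the transformation runs once over NumPy arrays, the per-row Python overhead of apply is avoided entirely.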
ORIGINAL ANSWER:
This answer here is great: https://gis.stackexchange.com/a/334276/144357
The solution below is for the purposes of understanding the root of the problem a bit better.
Your code in its current form re-constructs the Proj object on every iteration. This is a costly operation, and it is why the pyproj.Transformer object was created: it helps with repeated transformations because you don't have to re-create it each time (see: https://pyproj4.github.io/pyproj/stable/advanced_examples.html#repeated-transformations).
So, to avoid re-creating the Proj object, you can construct it once outside the function and reuse it for every row. This should improve your performance.
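As a rough sketch of that change, the code from the question could be rearranged as follows. The two-row frame is a hypothetical stand-in for My_data; everything else is taken from the question, with only the Proj construction moved out of rule.

```python
import pandas as pd
from pyproj import Proj

# Hypothetical sample standing in for the real My_data frame.
My_data = pd.DataFrame({"LON": [-123.0, -122.5], "LAT": [45.0, 47.6]})

# Build the projection once, outside the per-row function.
p = Proj(proj="utm", zone=10, ellps="WGS84", preserve_units=False)

def rule(row):
    # Reuse the existing Proj object instead of constructing a new one.
    x, y = p(row["LON"], row["LAT"])
    return pd.Series({"X": x, "Y": y})

My_data = My_data.merge(
    My_data.apply(rule, axis=1), left_index=True, right_index=True
)
```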
The equivalent can be written with pyproj.Transformer, again created once and reused for every row.
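One way this might look, keeping the row-wise apply from the question. The sample frame is hypothetical, and EPSG:32610 (WGS 84 / UTM zone 10N) is an assumption chosen to match zone=10 in the northern hemisphere.

```python
import pandas as pd
from pyproj import Transformer

# Hypothetical sample standing in for the real My_data frame.
My_data = pd.DataFrame({"LON": [-123.0, -122.5], "LAT": [45.0, 47.6]})

# Created once; EPSG:32610 is an assumed target CRS (UTM zone 10N).
transformer = Transformer.from_crs("EPSG:4326", "EPSG:32610", always_xy=True)

def rule(row):
    # always_xy=True means the arguments are (lon, lat).
    x, y = transformer.transform(row["LON"], row["LAT"])
    return pd.Series({"X": x, "Y": y})

My_data = My_data.merge(
    My_data.apply(rule, axis=1), left_index=True, right_index=True
)
```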
Hopefully this is helpful. Good luck!
Also, I would recommend reading this about Proj: https://pyproj4.github.io/pyproj/stable/gotchas.html#proj-not-a-generic-latitude-longitude-to-projection-converter