I have a dataframe as below:
import pandas as pd
import pyproj

df = pd.DataFrame({
    'epsg': [4326, 4326, 4326, 4203, 7844],
    'latitude': [-34.58, -22.78, -33.45, -33.60, -30.48],
    'longitude': [122.31, 120.2, 118.55, 140.77, 115.88]})
Here is the function that transforms the lat/long when it is not already based on EPSG 4326:
def transfform_lat_long(inproj: int, outproj: int, x1, y1):
    proj = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
    x2, y2 = proj.transform(x1, y1)
    return outproj, x2, y2
I am trying to apply the function to the dataframe so that its lat/long/epsg are updated wherever the epsg is not 4326:
df[['epsg','latitude', 'longitude']] = df.apply(lambda row:
transfform_lat_long(row.epsg, 4326, row.latitude, row.longitude) if row.epsg != 4326)
It produces a syntax error. Any help?
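The syntax error comes from the lambda: a Python conditional expression needs both branches (`A if cond else B`), so `... if row.epsg != 4326` without an `else` cannot be parsed. On top of that, `df.apply` works column by column unless `axis=1` is given, and `result_type='expand'` is needed to spread the returned tuple over three columns. Here is a minimal row-wise sketch under those assumptions. The helper name `to_wgs84` is hypothetical, and it passes the longitude as `x` because the transformer is built with `always_xy=True`:

```python
import pandas as pd
import pyproj

df = pd.DataFrame({
    'epsg': [4326, 4326, 4326, 4203, 7844],
    'latitude': [-34.58, -22.78, -33.45, -33.60, -30.48],
    'longitude': [122.31, 120.2, 118.55, 140.77, 115.88]})

def to_wgs84(row, outproj=4326):
    """Hypothetical helper: return (epsg, latitude, longitude) for one row."""
    # With axis=1 each row is upcast to float, so cast the code back to int.
    inproj = int(row.epsg)
    if inproj == outproj:
        # The "else" branch the lambda was missing: leave the row unchanged.
        return outproj, row.latitude, row.longitude
    proj = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
    # always_xy=True means x is the longitude and y is the latitude.
    lon, lat = proj.transform(row.longitude, row.latitude)
    return outproj, lat, lon

df[['epsg', 'latitude', 'longitude']] = df.apply(
    to_wgs84, axis=1, result_type='expand')
df['epsg'] = df['epsg'].astype(int)  # the expanded result comes back as float
```

This builds one transformer per non-4326 row, which is fine for a handful of rows but slow on a large dataframe.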
May I suggest an optimization if your dataframe is large? Instead of applying the transformation to each row, apply it once per epsg group, and use a boolean mask to skip the rows that are already in the output projection:
def transfform_lat_long(inproj: int, outproj: int, x1: pd.Series, y1: pd.Series):
    # One transformer per source CRS, applied to the whole group at once
    proj = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
    x2, y2 = proj.transform(x1, y1)
    return pd.DataFrame({'epsg': outproj, 'latitude': x2, 'longitude': y2}, index=x1.index)

outproj = 4326
m = df['epsg'] != outproj  # rows that still need to be transformed
df[m] = (df[m].groupby('epsg', group_keys=False)
              .apply(lambda x: transfform_lat_long(x.name, outproj, x['latitude'], x['longitude'])))
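Output:
>>> df
   epsg  latitude  longitude
0  4326 -34.58000  122.31000
1  4326 -22.78000  120.20000
2  4326 -33.45000  118.55000
3  4326       inf        inf
4  4326 -30.48000  115.88000
The inf in row 3 is almost certainly an axis-order problem: the transformer is created with always_xy=True, so it expects x = longitude and y = latitude, while the call above passes the latitude as x. Passing x['longitude'] first, and labelling the returned x2 as the longitude, should give finite coordinates for that row.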
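For reference, here is a sketch of the per-group idea with the axis order made explicit, so that the 4203 row does not end up as inf. It is a variant, not the code above: it loops over `groupby('epsg').groups` instead of using `groupby(...).apply`, and the helper name `transform_group` is hypothetical. No output is shown because the exact shifted values depend on the transformation pyproj selects on your installation:

```python
import pandas as pd
import pyproj

df = pd.DataFrame({
    'epsg': [4326, 4326, 4326, 4203, 7844],
    'latitude': [-34.58, -22.78, -33.45, -33.60, -30.48],
    'longitude': [122.31, 120.2, 118.55, 140.77, 115.88]})

def transform_group(inproj: int, outproj: int, lat: pd.Series, lon: pd.Series):
    """Hypothetical helper: transform one EPSG group, returning (lat, lon) arrays."""
    proj = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
    # always_xy=True: inputs and outputs are ordered (x=longitude, y=latitude).
    lon2, lat2 = proj.transform(lon.to_numpy(), lat.to_numpy())
    return lat2, lon2

outproj = 4326
# Mapping of each EPSG code to the index labels of its rows, computed once.
groups = df.groupby('epsg').groups
for epsg, idx in groups.items():
    if epsg == outproj:
        continue  # already in the target CRS, nothing to do
    lat2, lon2 = transform_group(int(epsg), outproj,
                                 df.loc[idx, 'latitude'], df.loc[idx, 'longitude'])
    df.loc[idx, 'latitude'] = lat2
    df.loc[idx, 'longitude'] = lon2
    df.loc[idx, 'epsg'] = outproj
```

Only one transformer is created per distinct source EPSG code, which is where the speed-up over a row-wise apply comes from.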