python pandas latitude-longitude pyproj epsg

Transform latitude and longitude in Python pandas using pyproj


I have a dataframe as below:

import pandas as pd
import pyproj

df = pd.DataFrame(
    {
     'epsg': [4326, 4326, 4326, 4203, 7844],
     'latitude': [-34.58, -22.78, -33.45, -33.60, -30.48],
     'longitude': [122.31, 120.2, 118.55, 140.77, 115.88]})

Here is the function I use to transform the latitude/longitude when a row is not already in EPSG:4326:

def transfform_lat_long(inproj:int, outproj:int, x1, y1):
    proj = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
    x2, y2 = proj.transform(x1, y1)
    return outproj, x2, y2

I am trying to apply the function over the dataframe so that each row's epsg, latitude and longitude are updated when its epsg is not 4326:

df[['epsg','latitude', 'longitude']] = df.apply(lambda row:
    transfform_lat_long(row.epsg, 4326, row.latitude, row.longitude) if row.epsg != 4326)

It produces a syntax error. Any help?


Solution
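
The syntax error itself comes from the conditional expression: Python's inline `if` needs an `else` branch. In addition, `df.apply` needs `axis=1` to pass rows rather than columns. A minimal row-wise sketch follows. `reproject_row` and `TARGET` are illustrative names that are not part of the question, and the sketch passes longitude as x and latitude as y because `always_xy=True` expects that order.

```python
import pandas as pd
import pyproj

df = pd.DataFrame({
    'epsg': [4326, 4326, 4326, 4203, 7844],
    'latitude': [-34.58, -22.78, -33.45, -33.60, -30.48],
    'longitude': [122.31, 120.2, 118.55, 140.77, 115.88],
})

TARGET = 4326  # illustrative name for the output CRS


def reproject_row(row):
    """Reproject one row to TARGET; hypothetical helper for this sketch."""
    # Rows passed by apply(axis=1) are upcast to float, so cast the code back.
    tf = pyproj.Transformer.from_crs(int(row['epsg']), TARGET, always_xy=True)
    # With always_xy=True, x is longitude and y is latitude.
    lon, lat = tf.transform(row['longitude'], row['latitude'])
    return TARGET, lat, lon


result = df.apply(
    lambda row: reproject_row(row) if row['epsg'] != TARGET
    else (TARGET, row['latitude'], row['longitude']),  # the missing else branch
    axis=1,                # call the lambda once per row
    result_type='expand',  # spread the returned tuple over three columns
)
result.columns = ['epsg', 'latitude', 'longitude']
result['epsg'] = result['epsg'].astype(int)
df[['epsg', 'latitude', 'longitude']] = result
```

This builds a new transformer for every row that needs one, so it is only reasonable for small dataframes.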

  • May I suggest an optimization if your dataframe is large? Instead of applying the transformation to each row, apply it once per group of rows that share the same epsg, and use a boolean mask to skip the rows that are already in the target CRS:

    def transfform_lat_long(inproj: int, outproj: int, x1: pd.Series, y1: pd.Series):
        # Build one transformer per source CRS and transform the whole group at once
        proj = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
        x2, y2 = proj.transform(x1, y1)
        return pd.DataFrame({'epsg': outproj, 'latitude': x2, 'longitude': y2}, index=x1.index)
    
    outproj = 4326
    m = df['epsg'] != outproj  # rows that still need reprojecting
    
    df[m] = (df[m].groupby('epsg', group_keys=False)
                  .apply(lambda x: transfform_lat_long(x.name, outproj, x['latitude'], x['longitude'])))
    

    Output:

    >>> df
       epsg  latitude  longitude
    0  4326 -34.58000  122.31000
    1  4326 -22.78000  120.20000
    2  4326 -33.45000  118.55000
    3  4326       inf        inf
    4  4326 -30.48000  115.88000
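
The `inf` in row 3 is probably an axis-order effect: with `always_xy=True` the transformer takes x as longitude and y as latitude, while the function above passes latitude first, so 140.77 is read as a latitude. Below is a hedged sketch of the grouped approach with the arguments in longitude, latitude order. `reproject_group` and `target` are illustrative names, and the plain loop over `groupby` is a swapped-in alternative to `groupby(...).apply`, not the original answer's method.

```python
import pandas as pd
import pyproj

df = pd.DataFrame({
    'epsg': [4326, 4326, 4326, 4203, 7844],
    'latitude': [-34.58, -22.78, -33.45, -33.60, -30.48],
    'longitude': [122.31, 120.2, 118.55, 140.77, 115.88],
})


def reproject_group(inproj, outproj, lat, lon):
    """Reproject one group of rows sharing a CRS; hypothetical helper."""
    tf = pyproj.Transformer.from_crs(inproj, outproj, always_xy=True)
    # always_xy=True: pass longitude as x and latitude as y.
    lon2, lat2 = tf.transform(lon.to_numpy(), lat.to_numpy())
    return pd.DataFrame(
        {'epsg': outproj, 'latitude': lat2, 'longitude': lon2},
        index=lat.index,  # keep the original row labels for alignment
    )


target = 4326  # illustrative name for the output CRS
m = df['epsg'] != target  # rows that still need reprojecting

# One transformer per distinct source CRS instead of one per row.
parts = [
    reproject_group(int(epsg), target, g['latitude'], g['longitude'])
    for epsg, g in df[m].groupby('epsg')
]
if parts:
    # .loc aligns on both index and column labels.
    df.loc[m, ['epsg', 'latitude', 'longitude']] = pd.concat(parts)
```

With the arguments in this order, the reprojected rows should stay close to their input coordinates rather than becoming `inf`, though the exact values depend on the PROJ data available locally.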