python pandas data-manipulation

Applying Functions in Python


I am an R user who is trying to learn more about Python.

I found this Python library that I would like to use for address parsing: https://github.com/zehengl/ez-address-parser

I was able to run this example:

from ez_address_parser import AddressParser

ap = AddressParser()

result = ap.parse("290 Bremner Blvd, Toronto, ON M5V 3L9")
print(result)
[('290', 'StreetNumber'), ('Bremner', 'StreetName'), ('Blvd', 'StreetType'), ('Toronto', 'Municipality'), ('ON', 'Province'), ('M5V', 'PostalCode'), ('3L9', 'PostalCode')]

I have the following file that I imported:

import pandas as pd

df = pd.read_csv(r'C:/Users/me/OneDrive/Documents/my_file.csv', encoding='latin-1')

   name                               ADDRESS
1 name1 290 Bremner Blvd, Toronto, ON M5V 3L9
2 name2 291 Bremner Blvd, Toronto, ON M5V 3L9
3 name3 292 Bremner Blvd, Toronto, ON M5V 3L9

I tried to apply the above function and export the file:

df['Address_Parse'] = df['ADDRESS'].apply(ap.parse)

df = pd.DataFrame(df)
df.to_csv(r'C:/Users/me/OneDrive/Documents/python_file.csv', index=False, header=True)

This seems to have worked, but for each row the entire parsed result ends up in a single cell:

[('290', 'StreetNumber'), ('Bremner', 'StreetName'), ('Blvd', 'StreetType'), ('Toronto', 'Municipality'), ('ON', 'Province'), ('M5V', 'PostalCode'), ('3L9', 'PostalCode')]

Is there a way in Python to make each of these "elements" (e.g. StreetNumber, StreetName, etc.) into a separate column?

Thank you!


Solution

  • Define a custom function that returns a Series with one entry per label, then join the resulting columns onto the original DataFrame:

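    def parse(x):
        # ap.parse returns (token, label) pairs; make each label a column name.
        # A repeated label (e.g. two 'PostalCode' tokens) keeps only the last token.
        return pd.Series({k: v for v, k in ap.parse(x)})
    
    # Parse every address and attach the new columns to the original frame.
    out = df.join(df['ADDRESS'].apply(parse))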
    
    print(out)
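
One thing to watch for: the sample output in the question has two tokens labelled PostalCode ('M5V' and '3L9'), and a plain dict keeps only the last value for a repeated key. The sketch below is one possible way to keep both by joining tokens that share a label. The helper name `group_tokens` and its `sep` parameter are made up for illustration and are not part of ez-address-parser; the sample list is copied from the question so the block runs without the parser installed.

```python
from collections import defaultdict


def group_tokens(pairs, sep=" "):
    """Collapse (token, label) pairs into {label: "token token ..."}.

    Hypothetical helper, not part of ez-address-parser.
    """
    grouped = defaultdict(list)
    for token, label in pairs:
        grouped[label].append(token)
    # Join tokens that share a label, preserving their original order.
    return {label: sep.join(tokens) for label, tokens in grouped.items()}


# Parser output copied from the question.
sample = [
    ("290", "StreetNumber"), ("Bremner", "StreetName"), ("Blvd", "StreetType"),
    ("Toronto", "Municipality"), ("ON", "Province"),
    ("M5V", "PostalCode"), ("3L9", "PostalCode"),
]

row = group_tokens(sample)
print(row["PostalCode"])  # M5V 3L9
```

Assuming `ap` and `df` are defined as in the question, something like `df.join(df['ADDRESS'].apply(lambda x: pd.Series(group_tokens(ap.parse(x)))))` should then give one column per label, with the full postal code in a single cell.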