
How to extract a substring from a string in Python?


I have two strings like these:

1)500 Rahway Avenue, Westfield, NJ 07090
2)910 N Harbor Drive, San Diego CA 92101

I want to get two outputs from them. First, the city:

1)Westfield
2)San Diego

And second, the state:

1)NJ
2)CA

I tried the approach below to get NJ and CA:

s1.rsplit(" ")[-2]

But this is not the right approach. Any help would be appreciated.


Solution

  • Assuming the city is placed right before the state code (which is followed by the zip code) and that the state code always consists of two capital letters, you could do it with a regular expression like this:

    import re
    
    s = """500 Rahway Avenue, Westfield, NJ 07090
    910 N Harbor Drive, San Diego CA 92101"""
    
    results = re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s)
    
    print(results)
    

    Output:

    [('Westfield', 'NJ'), ('San Diego', 'CA')]
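
    To make the pattern above easier to read, here is a sketch of the same regex written with re.VERBOSE, so each piece carries its own comment. The name ADDRESS_RE is only illustrative and is not part of the original answer; the pattern itself is unchanged.

```python
import re

# ADDRESS_RE is a hypothetical name; the pattern is the one used above.
# In verbose mode unescaped whitespace is ignored, so a literal space is written as "[ ]".
ADDRESS_RE = re.compile(r"""
    ,[ ]            # the comma and space that end the street part
    ([^,]*)         # group 1: the city, any run of non-comma characters
    ,?[ ]           # an optional comma, then a space, before the state
    ([A-Z]{2}\b)    # group 2: the state, two capitals ending at a word boundary
""", re.VERBOSE)

s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""

results = ADDRESS_RE.findall(s)
print(results)
```

    The optional comma is what lets one pattern cover both "Westfield, NJ" and "San Diego CA".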
    

    With the zip function you can turn that into one tuple of cities and one tuple of state codes:

    cities, states = zip(*re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s))
    
    print(cities)
    print(states)
    

    Output:

    ('Westfield', 'San Diego')
    ('NJ', 'CA')
    

    If each address is a separate string, you can apply the same regex to each one:

    import re
    
    lst = [
        "500 Rahway Avenue, Westfield, NJ 07090",
        "910 N Harbor Drive, San Diego CA 92101"
    ]
    
    for s in lst:
        # Group 1 is the city, group 2 is the two-letter state code
        city, state = re.search(r", ([^,]*),? ([A-Z]{2}\b)", s).groups()
        print(city, state)
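
    One caveat with the loop above: re.search returns None when a string does not fit the pattern, so calling .groups() on the result raises AttributeError. A minimal sketch of a guarded version follows. The helper name city_and_state, the constant ADDRESS_RE, and the third test string are assumptions added for illustration, not part of the original answer.

```python
import re

# Compile once and reuse; ADDRESS_RE is a hypothetical name for the regex above.
ADDRESS_RE = re.compile(r", ([^,]*),? ([A-Z]{2}\b)")

def city_and_state(address):
    """Return (city, state), or None if the address does not fit the pattern."""
    match = ADDRESS_RE.search(address)
    return match.groups() if match else None

lst = [
    "500 Rahway Avenue, Westfield, NJ 07090",
    "910 N Harbor Drive, San Diego CA 92101",
    "no comma or state here",  # made-up input that should not match
]

for s in lst:
    print(s, "->", city_and_state(s))
```

    Returning None keeps the decision about bad input with the caller, which can skip the line, log it, or raise its own error.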