I have two strings like the ones below:
1)500 Rahway Avenue, Westfield, NJ 07090
2)910 N Harbor Drive, San Diego CA 92101
I want to get two expected outputs:
1)Westfield
2)San Diego
And
1)NJ
2)CA
I tried the approach below to get NJ and CA:
s1.rsplit(" ")[-2]
But this is not the right approach. Any help would be appreciated.
Assuming the city is placed right before the state code, that it is preceded by a comma and a space, and that the state code always consists of two capital letters, you could use a regular expression like this:
import re
s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""
results = re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s)
print(results)
Output:
[('Westfield', 'NJ'), ('San Diego', 'CA')]
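To make the pattern easier to read, here is a sketch of the same regular expression written with re.VERBOSE and a comment on each part. The name `pattern` is only illustrative; spaces are written as `[ ]` because verbose mode ignores bare whitespace.

```python
import re

s = """500 Rahway Avenue, Westfield, NJ 07090
910 N Harbor Drive, San Diego CA 92101"""

# Same pattern as above, split into commented parts (illustrative name).
pattern = re.compile(r"""
    ,[ ]            # the comma and space that precede the city
    ([^,]*)         # group 1: the city, a run of non-comma characters
    ,?[ ]           # an optional comma, then a space
    ([A-Z]{2}\b)    # group 2: two capital letters ending at a word boundary
""", re.VERBOSE)

results = pattern.findall(s)
print(results)
```

Because `[^,]*` is greedy, it first grabs as much as it can and then backtracks until the two capital letters can match, which is how "San Diego" is captured even without a comma after it.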
With the zip function you can turn that into a sequence of cities and a sequence of state codes:
cities, states = zip(*re.findall(r", ([^,]*),? ([A-Z]{2}\b)", s))
print(cities)
print(states)
Output:
('Westfield', 'San Diego')
('NJ', 'CA')
If you have a separate string for each address, you can use the same regex as follows:
import re
lst = [
"500 Rahway Avenue, Westfield, NJ 07090",
"910 N Harbor Drive, San Diego CA 92101"
]
for s in lst:
    city, state = re.search(r", ([^,]*),? ([A-Z]{2}\b)", s).groups()
    print(city, state)
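Note that re.search returns None when a string does not match, so calling .groups() directly would raise an AttributeError on unexpected input. A minimal sketch that guards against that is below; the helper name `parse_city_state` and the constant `ADDRESS_RE` are hypothetical, and the named groups are just a convenience on top of the same pattern.

```python
import re

# Same pattern as above, compiled once and given named groups (illustrative names).
ADDRESS_RE = re.compile(r", (?P<city>[^,]*),? (?P<state>[A-Z]{2})\b")

def parse_city_state(address):
    """Return (city, state), or None if the address does not match the pattern."""
    match = ADDRESS_RE.search(address)
    if match is None:
        return None
    return match.group("city"), match.group("state")

lst = [
    "500 Rahway Avenue, Westfield, NJ 07090",
    "910 N Harbor Drive, San Diego CA 92101",
    "no commas here",  # made-up input that does not match
]
for s in lst:
    print(s, "->", parse_city_state(s))
```

Whether to return None, skip the line, or raise an error for non-matching addresses depends on how clean your data is.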