python, regex, python-2.7, nonetype

Edit regex to recognise very short street name or number for Python 2.7 script and avoid fatal attribute error


I have a small Python 2.7 script that uses a regex to recognise a street or avenue name in a string. However, it doesn't work for street names of only two letters (rare, admittedly) or for numbered streets (e.g. 29th Street). I'd like to edit it so that the latter works (e.g. recognise 29th Street or 1st Street).

This works:

import re

def street(search):
    if bool(re.search('(?i)street', search)):
        found = re.search('([A-Z]\S[a-z]+\s(?i)street)', search)
        found = found.group()
        return found.title()
    if bool(re.search('(?i)avenue', search)):
        found = re.search('([A-Z]\S[a-z]+\s(?i)avenue)', search)
        found = found.group()
        return found.title()
    else:
        found = "na"
        return found

userlocation = street("I live on Stackoverflow Street")

print userlocation

It also works with "Stackoverflow Avenue", but these fail:

userlocation = street("I live on SO Street")
userlocation = street("I live on 29th Street")
userlocation = street("I live on 1st Avenue")
userlocation = street("I live on SO Avenue")

with this error (because nothing is found):

me@me:~/Documents/test$ python2.7 test_street.py
Traceback (most recent call last):
  File "test_street.py", line 12, in <module>
    userlocation = street("I live on 29th Street")
  File "test_street.py", line 6, in street
    found = found.group()
AttributeError: 'NoneType' object has no attribute 'group'

As well as correcting the query so that it recognises "1st", "2nd", "80th", "110th" etc., I'd also ideally like to avoid a fatal error if it doesn't find anything.


Solution

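  • You can merge your two conditions into a single search: match any run of non-space characters (\S+) followed by a whitespace character and the keyword "Street" or "Avenue" (\s(Street|Avenue)). Your original pattern [A-Z]\S[a-z]+ needs an uppercase letter followed by at least two more characters, so it can never match "SO" or "29th". Checking the match object before calling .group() also avoids the AttributeError when nothing is found.

    import re
    
    def street(search):
        # One case-insensitive search covers both "street" and "avenue".
        found = re.search(r'(?i)\S+\s(Street|Avenue)', search)
        if found is None:
            # re.search returns None when nothing matches, so .group() must not be called.
            return "na"
        return found.group().title()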
    
    print street("I live on Stackoverflow Street")
    print street("I live on SO street")
    print street("I live on 29th Street")
    print street("I live on 1st Avenue")
    print street("I live on SO Avenue")
    

    Output:

    Stackoverflow Street
    So Street
    29Th Street
    1St Avenue
    So Avenue
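
The "29Th" and "1St" in the output come from str.title(), which capitalises the first letter after any non-letter character, digits included. If that matters, one possible workaround is to capitalise word by word instead. The helper name title_words below is hypothetical, not part of the original answer; it is a minimal sketch that assumes words are separated by whitespace.

```python
def title_words(text):
    """Capitalise each whitespace-separated word without touching ordinals."""
    # str.capitalize() uppercases only the first character of the word and
    # lowercases the rest, so "29th" stays "29th" while "street" becomes "Street".
    return " ".join(word.capitalize() for word in text.split())


# Example calls (results shown as comments):
# title_words("29th street")  -> "29th Street"
# title_words("SO Avenue")    -> "So Avenue"
```

Calling title_words(found.group()) in place of found.group().title() would then return "29th Street" rather than "29Th Street".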
    

    The pattern \S+\s(Street|Avenue) matches only the last word of the street name. If you need multi-word street names and a specific keyword always occurs before the street, you may be able to anchor on that keyword and capture the whole name.

    Check the regex demo here.
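
For multi-word street names, the keyword idea could look roughly like the sketch below. It assumes the street name always follows the word "on", as in the example sentences; that keyword, the constant STREET_AFTER_ON and the function name street_after_keyword are illustrative assumptions, not part of the original answer.

```python
import re

# Assumption: the street name always comes right after the word "on".
# (?:\S+\s+)+? lazily consumes one or more words until "street" or "avenue" follows.
STREET_AFTER_ON = re.compile(
    r'\bon\s+((?:\S+\s+)+?(?:street|avenue))\b',
    re.IGNORECASE,
)


def street_after_keyword(search):
    """Return the full street name following "on", or "na" if none is found."""
    match = STREET_AFTER_ON.search(search)
    if match is None:
        # No keyword + street combination in the string.
        return "na"
    # group(1) is the captured street name without the leading "on".
    return match.group(1).title()


# Example calls (results shown as comments):
# street_after_keyword("I live on Martin Luther King Avenue")
#   -> "Martin Luther King Avenue"
# street_after_keyword("I live in Paris")  -> "na"
```

This is only as reliable as the keyword: if "on" appears earlier in the sentence for an unrelated reason, the capture starts there and picks up extra words.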